<p>Sorting Polymorphically in Many Languages, by Jimmy Hartzell. Posted 2024-02-05 on The Coded Message (https://www.thecodedmessage.com/posts/polymorphism-many-languages/).</p>
<p>Polymorphism is a powerful programming language feature. In polymorphism,
we have generic functions that don’t know exactly what type of data
they will be operating on. Often, the data types won’t even all have been
designed yet when the generic function is written. The generic function
provides the general outline of the work, but the details of some parts of
the work, some specific operations, must be tailored to the specific
types being used. The generic code needs some way of accessing these
specific operations, and the users of the generic code need some way
of specifying them.</p>
<p>There are many use cases for polymorphism. When sorting an array,
the algorithm will need to be adapted to the specific element type,
so it knows how to compare elements. When drawing virtual objects on
a screen, an algorithm might choose where to put each object and which
objects to draw, whereas each type of object might have its own specialized
implementation of how to draw it.</p>
<p>These are just two examples among many. Most complicated projects
have many polymorphic functions. Even in languages that don’t support
polymorphism directly, there are usually ways of building it out of
existing primitives.</p>
<p>The example I’ve chosen is sorting, specifically sorting an array
or vector. It’s just an example; a lot of what I say applies generally
to how polymorphism works in each programming language discussed.</p>
<p>This is a good example, as sorting is a function where it’s really
obvious where polymorphism is required to get a properly generalizable
algorithm. A lot of discussions of polymorphism invent contrived
situations where polymorphism seems overkill, and I think that’s
fundamentally confusing.</p>
<p>On the other hand, it’s a bad example in some ways, because it only makes
sense in the context of a homogeneous array or list, where every element
is the same type. Heterogeneous containers, where each element may have a
different type and the polymorphic function may have to look up as many
function implementations as there are elements, present a very different
set of problems to solve.</p>
<p>This is especially important as Rust and C++ both provide two types of
polymorphism, compile-time and run-time, also known as static and dynamic.
The question of which to use is complicated, but for sorting, compile-time
or static polymorphism is clearly the appropriate choice, with run-time
or dynamic polymorphism feeling very awkward and forced. Heterogeneous
containers generally must use some form of dynamic polymorphism (whether
through virtual functions in C++ or through type erasure).</p>
<p>So, while I think this example will be illustrative, it won’t allow us
to explore run-time, dynamic polymorphism on its home turf, if you will.
Hopefully, I can make up this deficit in future blog posts.</p>
<h1 id="sorting-a-polymorphic-function">Sorting: A Polymorphic Function</h1>
<p>Sorting algorithms are a true use case for polymorphism: rather than
distinguishing between a small set of options, many types support
the operations necessary for sorting. The algorithm is agnostic to the
implementation of those operations. Quick sort, insertion sort, and merge
sort apply equally well to sorting integers, floating point values,
or alphabetizing strings – any algorithm can be combined freely with
any type, or at least any type for which a concept of “ordering” exists.</p>
<p>Here are the operations or properties (or dare I say, <em>traits</em>) that a
type needs to be sortable, and that a generic sorting algorithm might
need to find out about. The first one is obvious to OOP programmers, but
the other two are more subtle, and are implicit in many OOP programming languages:</p>
<ul>
<li><strong>Ordering</strong> or <strong>comparison</strong>: Given two values <em>a</em> and <em>b</em>,
this operation answers which is greater, or determines that they
are equal. Some types have the additional possibility that they are
incomparable – arrays of those types cannot be sorted by most algorithms.</li>
<li><strong>Swapping</strong> or <strong>moving</strong>: The data has to be able to be moved around
to turn the unsorted array into a sorted one. This is automatic in many
OOP languages for object types due to ubiquitous use of indirection. It
is also automatic in Rust, where every type can be moved by just
copying all the bytes.</li>
<li><strong>Striding the array</strong> or <strong>size</strong>: Given a pointer to one element,
how do you get to the next one? By how many bytes must you increment the
pointer? Most sorting algorithms require this to be constant. If you
use indirection for the values, this is also trivial. If you do not,
it is key information.</li>
</ul>
<p>These operations – or more generally, <em>traits</em> of a type – can then be
combined with a sorting algorithm to create a concrete procedure to
sort an array for a given concrete type.</p>
<p>So let’s see how various programming languages handle this.</p>
<h1 id="programming-language-0-sorting-in-c">Programming Language #0: Sorting in C</h1>
<p>I will start our tour of programming languages with C. C – the non-OOP,
non-C++ programming language; the classic “portable assembly language”
from 1972 – doesn’t have many polymorphic algorithms, algorithms that
accept any type, because you have to implement polymorphism by hand. But
sorting is an important enough one that standard C does have a generic
sorting function: <code>qsort</code>, named for quicksort (on many systems, <code>heapsort</code>
and <code>mergesort</code> are also available). Because polymorphism is implemented
by hand, we can look at this function to see how one might specifically
tailor polymorphism to the problem of sorting.</p>
<p>Here is the function signature for <code>qsort</code>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c" data-lang="c"><span style="display:flex;"><span><span style="color:#66d9ef">void</span> <span style="color:#a6e22e">qsort</span>(<span style="color:#66d9ef">void</span> <span style="color:#f92672">*</span>base, <span style="color:#66d9ef">size_t</span> nmemb, <span style="color:#66d9ef">size_t</span> size,
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">int</span> (<span style="color:#f92672">*</span>compar)(<span style="color:#66d9ef">const</span> <span style="color:#66d9ef">void</span> <span style="color:#f92672">*</span>, <span style="color:#66d9ef">const</span> <span style="color:#66d9ef">void</span> <span style="color:#f92672">*</span>));
</span></span></code></pre></div><p>It can be used to sort blocks of memory containing a sequence of integers,
floating point values, or (pointers to) strings – any comparable and
(trivially) movable fixed-size type.</p>
<p>C function signatures can be hard to read, so I’ll break it down
argument by argument:</p>
<ul>
<li><code>void *base</code>: This is an untyped pointer (<code>void *</code>) to the beginning
of the block of memory to be sorted.</li>
<li><code>size_t nmemb</code>: This is a bound: the number of members (elements)
in the block of memory. C often represents aggregates by two values,
a base pointer and a count of the members.</li>
<li><code>size_t size</code>: How big is each member? On a typical 64-bit system, an
<code>int</code> is 4 bytes, a <code>double</code> is 8, and a <code>char *</code> for a string is 8 bytes.
Custom types might be any size. <code>qsort</code> should work for all of these
types, without indirection.</li>
<li><code>int (*compar)(const void *, const void *)</code>: This is the interesting
part. This is a function pointer for the comparison operation as
discussed above. You write a function that takes two pointers to
two elements, and returns a value that encodes their relationship.</li>
</ul>
<p>Swapping is assumed to be byte-by-byte, and so <code>size</code> covers the
last two attributes of the type listed above. The key one here
is <code>compar</code>, a bit of code that <code>qsort</code> has to call to do an
operation specific to your type, a small policy injection that
adapts a generic algorithm to your particular type.</p>
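<p>To make the roles of these parameters concrete, here is a minimal sketch of
how a function with the same signature as <code>qsort</code> might use them. It is
not how any real C library implements <code>qsort</code>: the names
<code>generic_insertion_sort</code>, <code>swap_bytes</code>, and <code>compare_longs</code> are
hypothetical, and insertion sort is used only because it is short.</p>

```c
#include <stddef.h>

/* Swapping: exchange two elements of `size` bytes each, byte by byte. */
static void swap_bytes(unsigned char *x, unsigned char *y, size_t size) {
    for (size_t i = 0; i < size; i++) {
        unsigned char tmp = x[i];
        x[i] = y[i];
        y[i] = tmp;
    }
}

/* Hypothetical stand-in for qsort: same parameters, but a simple
 * insertion sort rather than whatever a real libc uses. */
void generic_insertion_sort(void *base, size_t nmemb, size_t size,
                            int (*compar)(const void *, const void *)) {
    unsigned char *bytes = base;
    for (size_t i = 1; i < nmemb; i++) {
        for (size_t j = i; j > 0; j--) {
            /* Striding: element j starts at byte offset j * size. */
            unsigned char *prev = bytes + (j - 1) * size;
            unsigned char *cur = bytes + j * size;
            /* Comparison: the only type-specific code, reached by pointer. */
            if (compar(prev, cur) <= 0) {
                break;
            }
            swap_bytes(prev, cur, size);
        }
    }
}

/* Example comparison function for arrays of long. */
int compare_longs(const void *a_vp, const void *b_vp) {
    long a = *(const long *)a_vp;
    long b = *(const long *)b_vp;
    return (a > b) - (a < b);
}
```

<p>A call such as <code>generic_insertion_sort(values, 3, sizeof(long), compare_longs)</code>
would sort an array of three <code>long</code> values in place. Nothing in the sorting
function itself mentions <code>long</code>; everything it knows about the type arrives
through <code>size</code> and <code>compar</code>.</p>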
<p>The return value of <code>compar</code> is an <code>int</code>, but it is interpreted according
to a C convention, shared with (for example) the string comparison
function <code>strcmp</code>. When comparing <code>a</code> to <code>b</code>, a return value <code>r</code> is interpreted thus:</p>
<ul>
<li>if <code>r < 0</code>, <code>a < b</code></li>
<li>if <code>r > 0</code>, <code>a > b</code></li>
<li>if <code>r == 0</code>, <code>a == b</code></li>
</ul>
<p>So, here’s a complete C program that sorts its command line arguments –
including the program name:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c" data-lang="c"><span style="display:flex;"><span><span style="color:#75715e">#include</span> <span style="color:#75715e"><stdio.h></span><span style="color:#75715e">
</span></span></span><span style="display:flex;"><span><span style="color:#75715e">#include</span> <span style="color:#75715e"><stdlib.h></span><span style="color:#75715e">
</span></span></span><span style="display:flex;"><span><span style="color:#75715e">#include</span> <span style="color:#75715e"><string.h></span><span style="color:#75715e">
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">int</span> <span style="color:#a6e22e">compare_strings</span>(<span style="color:#66d9ef">const</span> <span style="color:#66d9ef">void</span> <span style="color:#f92672">*</span>a, <span style="color:#66d9ef">const</span> <span style="color:#66d9ef">void</span> <span style="color:#f92672">*</span>b) {
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// `a` and `b` are pointers to the element type, which in
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#75715e">// this case is `char *`. Thus they are `char **`.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#75715e">//
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#75715e">// Nothing is stopping you from getting this wrong and putting
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#75715e">// `char *` instead -- it will just silently not work. The
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#75715e">// compiler can and will make you write `const` in the right
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#75715e">// place, though.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">char</span> <span style="color:#f92672">*</span> <span style="color:#66d9ef">const</span><span style="color:#f92672">*</span> a_str_ptr <span style="color:#f92672">=</span> a;
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">char</span> <span style="color:#f92672">*</span> <span style="color:#66d9ef">const</span><span style="color:#f92672">*</span> b_str_ptr <span style="color:#f92672">=</span> b;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// `strcmp` uses the same convention as `qsort` for comparison.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#66d9ef">return</span> <span style="color:#a6e22e">strcmp</span>(<span style="color:#f92672">*</span>a_str_ptr, <span style="color:#f92672">*</span>b_str_ptr);
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">int</span> <span style="color:#a6e22e">main</span>(<span style="color:#66d9ef">int</span> argc, <span style="color:#66d9ef">char</span> <span style="color:#f92672">**</span>argv) {
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">qsort</span>(argv, argc, <span style="color:#66d9ef">sizeof</span>(<span style="color:#66d9ef">char</span> <span style="color:#f92672">*</span>), <span style="color:#f92672">&</span>compare_strings);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">for</span> (<span style="color:#66d9ef">int</span> i <span style="color:#f92672">=</span> <span style="color:#ae81ff">0</span>; i <span style="color:#f92672"><</span> argc; i<span style="color:#f92672">++</span>) {
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">printf</span>(<span style="color:#e6db74">"%s</span><span style="color:#ae81ff">\n</span><span style="color:#e6db74">"</span>, argv[i]);
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> <span style="color:#ae81ff">0</span>;
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>But the same <code>qsort</code> function can also be used to sort integers, if given
different parameters and a different comparison function:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c" data-lang="c"><span style="display:flex;"><span><span style="color:#75715e">#include</span> <span style="color:#75715e"><stdlib.h></span><span style="color:#75715e">
</span></span></span><span style="display:flex;"><span><span style="color:#75715e">#include</span> <span style="color:#75715e"><stdio.h></span><span style="color:#75715e">
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">int</span> <span style="color:#a6e22e">compare_ints</span>(<span style="color:#66d9ef">const</span> <span style="color:#66d9ef">void</span> <span style="color:#f92672">*</span>a_vp, <span style="color:#66d9ef">const</span> <span style="color:#66d9ef">void</span> <span style="color:#f92672">*</span>b_vp) {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">const</span> <span style="color:#66d9ef">int</span> <span style="color:#f92672">*</span>a_ip <span style="color:#f92672">=</span> a_vp;
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">const</span> <span style="color:#66d9ef">int</span> <span style="color:#f92672">*</span>b_ip <span style="color:#f92672">=</span> b_vp;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">int</span> a <span style="color:#f92672">=</span> <span style="color:#f92672">*</span>a_ip;
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">int</span> b <span style="color:#f92672">=</span> <span style="color:#f92672">*</span>b_ip;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> (a <span style="color:#f92672"><</span> b) {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> <span style="color:#f92672">-</span><span style="color:#ae81ff">1</span>;
</span></span><span style="display:flex;"><span> } <span style="color:#66d9ef">else</span> <span style="color:#66d9ef">if</span> (a <span style="color:#f92672">==</span> b) {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> <span style="color:#ae81ff">0</span>;
</span></span><span style="display:flex;"><span> } <span style="color:#66d9ef">else</span> { <span style="color:#75715e">// a > b
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#66d9ef">return</span> <span style="color:#ae81ff">1</span>;
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">int</span> <span style="color:#a6e22e">main</span>(<span style="color:#66d9ef">int</span> argc, <span style="color:#66d9ef">char</span> <span style="color:#f92672">**</span>argv) {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">int</span> intary[<span style="color:#ae81ff">10</span>] <span style="color:#f92672">=</span> { <span style="color:#ae81ff">10</span>, <span style="color:#ae81ff">9</span>, <span style="color:#ae81ff">8</span>, <span style="color:#ae81ff">7</span>, <span style="color:#ae81ff">6</span>, <span style="color:#ae81ff">5</span>, <span style="color:#ae81ff">4</span>, <span style="color:#ae81ff">3</span>, <span style="color:#ae81ff">2</span>, <span style="color:#ae81ff">1</span> };
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">qsort</span>(intary, <span style="color:#ae81ff">10</span>, <span style="color:#66d9ef">sizeof</span>(<span style="color:#66d9ef">int</span>), <span style="color:#f92672">&</span>compare_ints);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">for</span> (<span style="color:#66d9ef">int</span> i <span style="color:#f92672">=</span> <span style="color:#ae81ff">0</span>; i <span style="color:#f92672"><</span> <span style="color:#ae81ff">10</span>; i<span style="color:#f92672">++</span>) {
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">printf</span>(<span style="color:#e6db74">"%d</span><span style="color:#ae81ff">\n</span><span style="color:#e6db74">"</span>, intary[i]);
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> <span style="color:#ae81ff">0</span>;
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p><code>qsort</code> implements a form of manual run-time polymorphism, in a
programming language with no built-in support for polymorphism.
It behaves differently based on the element type, as passed to it via a
variety of arguments. One of the traits – the comparison operator –
differs between types in a way that requires custom code, and this is
passed in via pointer. <code>qsort</code> then invokes the operation via indirect
function call, the same mechanism that is used for polymorphism
in OOP. But unlike OOP-style runtime polymorphism, there is just one
function pointer for all the items, rather than each item coming with
its own “vtable.”</p>
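<p>For contrast, here is a rough sketch of what the per-item approach could look
like if built by hand in C. Everything in it is hypothetical and simplified
(<code>comparable_vtable</code>, <code>boxed_int</code>, <code>sort_comparables</code>); real OOP
languages generate this machinery for you. The point is only that the
comparison function is found through each element rather than passed in once.</p>

```c
#include <stddef.h>

/* Hypothetical per-type table of operations, in the style of an OOP vtable. */
struct comparable_vtable {
    int (*compare_to)(const void *self, const void *other);
};

/* Every object begins with a pointer to its type's vtable. */
struct comparable {
    const struct comparable_vtable *vtable;
};

/* One concrete type: an int stored behind indirection. */
struct boxed_int {
    struct comparable header;
    int value;
};

/* Assumes `other` is also a boxed_int; nothing checks this. */
static int boxed_int_compare_to(const void *self, const void *other) {
    const struct boxed_int *a = self;
    const struct boxed_int *b = other;
    return (a->value > b->value) - (a->value < b->value);
}

const struct comparable_vtable boxed_int_vtable = { boxed_int_compare_to };

/* Insertion sort over an array of pointers. The comparison function is
 * looked up through each element, not supplied once by the caller. */
void sort_comparables(struct comparable **items, size_t count) {
    for (size_t i = 1; i < count; i++) {
        for (size_t j = i; j > 0; j--) {
            struct comparable *prev = items[j - 1];
            struct comparable *cur = items[j];
            if (prev->vtable->compare_to(prev, cur) <= 0) {
                break;
            }
            /* Swapping elements only swaps pointers. */
            items[j - 1] = cur;
            items[j] = prev;
        }
    }
}
```

<p>Note that the sorting function no longer needs a <code>size</code> or <code>compar</code>
argument: the elements are all pointers, and each one says how to compare
itself. The cost is an extra pointer in every object and the possibility
that two elements in the same array disagree about how to compare.</p>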
<p>Note that the optimizer is usually not able to eliminate this indirect call,
especially in the <code>qsort</code> example, where the sorting function is in the
standard library, whereas the function calling it and the comparison
function are both in application code. This comes at a performance cost,
which means that if you’re programming C and the performance of this
particular sort is essential to your program, it might easily make sense
to write custom sorting code that is not polymorphic.</p>
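<p>As a sketch of what such non-polymorphic code might look like, here is a
sort written only for <code>int</code>. The name <code>sort_ints</code> is hypothetical, and
insertion sort is chosen for brevity rather than speed; a real replacement
for a performance-critical sort would likely use a faster algorithm. What
matters is that the element type, its size, and the comparison are all fixed
in the source, so the compiler sees a plain <code>&gt;</code> instead of an indirect call.</p>

```c
#include <stddef.h>

/* Insertion sort specialized to int: no void pointers, no size
 * parameter, and no function pointer for the comparison. */
void sort_ints(int *values, size_t count) {
    for (size_t i = 1; i < count; i++) {
        int current = values[i];
        size_t j = i;
        /* Shift larger elements right until `current` fits. */
        while (j > 0 && values[j - 1] > current) {
            values[j] = values[j - 1];
            j--;
        }
        values[j] = current;
    }
}
```

<p>The trade-off is that this function is useless for any other element type or
ordering; each new type would need its own copy of the algorithm.</p>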
<h1 id="programming-language-1-sorting-in-java">Programming Language #1: Sorting in Java</h1>
<p>Java is about as far from C as you can get in this matter. C provides no
abstraction or language features specifically for polymorphism, and in
<code>qsort</code> we use a low-level tool it does provide – function pointers –
to build it ourselves. In Java, however, the programming language is
explicitly object-oriented, and so the whole programming language is
designed to encourage you to leverage polymorphism, as that is one of
the pillars of object-oriented programming.</p>
<p>The version of polymorphism available in Java is dynamic, run-time,
“late binding” polymorphism, the type of polymorphism that OOP favors.
It is based off of the idea of overriding methods, either from base
classes, or interfaces that a custom type (a “class”) can implement.</p>
<p>As I mentioned before, this is not the best match for the problem of
sorting, at least not the type of sorting we’re talking about. Run-time
polymorphism means that every individual element could potentially have
a different comparison procedure, which is unlikely. The possibility
of such a thing happening increases the cognitive load.</p>
<p>Nevertheless, Java does support polymorphic sorting, and it’s useful
to discuss specifically because it does show how OOP-style polymorphism
works when applied to such a problem.</p>
<p>There are many methods that do sorting in Java. <a href="https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/Arrays.html#sort%28java.lang.Object%5B%5D,java.util.Comparator%29">Some of
them</a>
take an explicit argument to convey how to do comparisons, just like the
<code>qsort</code> example. But more commonly, we sort according to what Java refers
to as the “natural order” of the elements, as (for example) in
<a href="https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/Collections.html#sort%28java.util.List%29">this overload</a>
of <code>Collections.sort</code>, with the following signature:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-java" data-lang="java"><span style="display:flex;"><span><span style="color:#66d9ef">public</span> <span style="color:#66d9ef">static</span> <span style="color:#f92672"><</span>T <span style="color:#66d9ef">extends</span> Comparable<span style="color:#f92672"><?</span> <span style="color:#66d9ef">super</span> T<span style="color:#f92672">>></span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">void</span> <span style="color:#a6e22e">sort</span><span style="color:#f92672">(</span>List<span style="color:#f92672"><</span>T<span style="color:#f92672">></span> list<span style="color:#f92672">)</span>
</span></span></code></pre></div><p>This sorts a list of elements of type <code>T</code>, where “list” in Java can refer
to any of a number of collections that store data in order, such as in
a single allocated array (<code>ArrayList</code>) or a linked list (<code>LinkedList</code>).
Therefore, it is not only polymorphic in how to compare the elements,
but also in how to navigate through the list.</p>
<p>It needs to know about the same traits of type <code>T</code> that <code>qsort</code> does.
Some are not polymorphic: for this method to make sense, we know that
<code>T</code> must be a reference type, that it must be boxed (that is, it must
use indirection), and that therefore the size of an element is always
the natural pointer size of the platform, and swapping the element only
involves swapping the pointers.</p>
<p>But there’s no getting around the polymorphism of comparisons, and so
we see this strange annotation on the function signature: <code><T extends Comparable<? super T>></code>. This indicates that <code>T</code> must implement
the interface <code>Comparable</code> – in this context, “implements” is spelled
<code>extends</code>. Specifically, it must implement that interface in such a way
that it can be applied to other elements of type <code>T</code> (which means that
it uses <code>T</code> or some “supertype” of <code>T</code>).</p>
<p>The notation is complicated, because the semantics are complicated.
Technically, <code>T</code> could be comparable to a parent type of <code>T</code>, and that
would still work. In fact, <code>T</code> could refer to an entire class hierarchy
of types derived from some base class, all of them comparable in different
ways to objects elsewhere in the hierarchy and to objects derived from
a yet further base class. Objects of type <code>T</code> could even be comparable
to any arbitrary object – and all of this is covered in
<code><T extends Comparable<? super T>></code>, trying to express at compile-time
what will cause the type <code>T</code> to be a reasonable type to use for sorting.</p>
<p>But this is all just an extra check that the compiler can do at
compile-time to prevent run-time errors, because all of the information
on how to do the comparisons is available at run-time. In fact, <a href="https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/Arrays.html#sort%28java.lang.Object%5B%5D%29">other
methods</a>
don’t use such formal prerequisites at all, preferring
to query at run-time for appropriate interfaces, throwing
an exception if they are not present.</p>
<p>In all of these cases, the comparison is the “natural
ordering,” which is defined to mean that comparison is
done through a Java interface. Specifically, these methods use the
<a href="https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Comparable.html"><code>Comparable</code></a>
interface, which specifies a method, <code>compareTo</code>, which must take an
implicit <code>this</code> parameter and an explicit parameter of the type being
compared to, and, like the comparison functions in <code>qsort</code>, must then
return an integer whose sign indicates whether the first value was
greater or the second (with zero indicating equality).</p>
<p>This natural ordering is defined on a per-type basis. Each type can
only implement <code>Comparable</code> once. Fortunately, the regular built-in
types, all the ones we are likely to use, all come with good natural
orderings. For example, this code all works:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-java" data-lang="java"><span style="display:flex;"><span><span style="color:#f92672">import</span> java.util.*<span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">public</span> <span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Sort</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">static</span> <span style="color:#66d9ef">void</span> <span style="color:#a6e22e">main</span><span style="color:#f92672">(</span>String<span style="color:#f92672">[]</span> args<span style="color:#f92672">)</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> List<span style="color:#f92672"><</span>String<span style="color:#f92672">></span> argList <span style="color:#f92672">=</span> Arrays<span style="color:#f92672">.</span><span style="color:#a6e22e">asList</span><span style="color:#f92672">(</span>args<span style="color:#f92672">);</span>
</span></span><span style="display:flex;"><span> Collections<span style="color:#f92672">.</span><span style="color:#a6e22e">sort</span><span style="color:#f92672">(</span>argList<span style="color:#f92672">);</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">for</span> <span style="color:#f92672">(</span>String arg <span style="color:#f92672">:</span> argList<span style="color:#f92672">)</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> System<span style="color:#f92672">.</span><span style="color:#a6e22e">out</span><span style="color:#f92672">.</span><span style="color:#a6e22e">println</span><span style="color:#f92672">(</span>arg<span style="color:#f92672">);</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> List<span style="color:#f92672"><</span>Integer<span style="color:#f92672">></span> list <span style="color:#f92672">=</span> <span style="color:#66d9ef">new</span> ArrayList<span style="color:#f92672"><</span>Integer<span style="color:#f92672">>();</span>
</span></span><span style="display:flex;"><span> list<span style="color:#f92672">.</span><span style="color:#a6e22e">add</span><span style="color:#f92672">(</span>1<span style="color:#f92672">);</span>
</span></span><span style="display:flex;"><span> list<span style="color:#f92672">.</span><span style="color:#a6e22e">add</span><span style="color:#f92672">(</span>3<span style="color:#f92672">);</span>
</span></span><span style="display:flex;"><span> list<span style="color:#f92672">.</span><span style="color:#a6e22e">add</span><span style="color:#f92672">(</span>2<span style="color:#f92672">);</span>
</span></span><span style="display:flex;"><span> list<span style="color:#f92672">.</span><span style="color:#a6e22e">add</span><span style="color:#f92672">(</span>4<span style="color:#f92672">);</span>
</span></span><span style="display:flex;"><span> Collections<span style="color:#f92672">.</span><span style="color:#a6e22e">sort</span><span style="color:#f92672">(</span>list<span style="color:#f92672">);</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">for</span> <span style="color:#f92672">(</span><span style="color:#66d9ef">int</span> i <span style="color:#f92672">:</span> list<span style="color:#f92672">)</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> System<span style="color:#f92672">.</span><span style="color:#a6e22e">out</span><span style="color:#f92672">.</span><span style="color:#a6e22e">println</span><span style="color:#f92672">(</span>i<span style="color:#f92672">);</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">}</span>
</span></span></code></pre></div><p>See it in use:</p>
<pre tabindex="0"><code>$ java Sort b c a
a
b
c
1
2
3
4
$
</code></pre><p>It gets a little less coherent when we mix different types of object
in the same list, which Java lets us represent in the type system
by using <code>Object</code>, which is a type that can store a reference to any
non-primitive value (including boxed primitives):</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-java" data-lang="java"><span style="display:flex;"><span><span style="color:#f92672">import</span> java.util.*<span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">public</span> <span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Sort</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">static</span> <span style="color:#66d9ef">void</span> <span style="color:#a6e22e">main</span><span style="color:#f92672">(</span>String<span style="color:#f92672">[]</span> args<span style="color:#f92672">)</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> List<span style="color:#f92672"><</span>Object<span style="color:#f92672">></span> list <span style="color:#f92672">=</span> <span style="color:#66d9ef">new</span> ArrayList<span style="color:#f92672"><</span>Object<span style="color:#f92672">>();</span>
</span></span><span style="display:flex;"><span> list<span style="color:#f92672">.</span><span style="color:#a6e22e">add</span><span style="color:#f92672">(</span>1<span style="color:#f92672">);</span>
</span></span><span style="display:flex;"><span> list<span style="color:#f92672">.</span><span style="color:#a6e22e">add</span><span style="color:#f92672">(</span><span style="color:#e6db74">"Hi"</span><span style="color:#f92672">);</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> Collections<span style="color:#f92672">.</span><span style="color:#a6e22e">sort</span><span style="color:#f92672">(</span>list<span style="color:#f92672">);</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">for</span> <span style="color:#f92672">(</span>Object i <span style="color:#f92672">:</span> list<span style="color:#f92672">)</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> System<span style="color:#f92672">.</span><span style="color:#a6e22e">out</span><span style="color:#f92672">.</span><span style="color:#a6e22e">println</span><span style="color:#f92672">(</span>i<span style="color:#f92672">);</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">}</span>
</span></span></code></pre></div><p>While the Java runtime allows us to create such a collection, the
type system does not allow us to use <code>Collections.sort</code> to sort it,
as <code>Object</code> does not give the compiler enough information to be sure these
elements can properly be compared to each other (and in fact they
cannot, as comparing strings to integers is not defined in Java’s
“natural ordering”):</p>
<pre tabindex="0"><code>$ javac Sort.java
Sort.java:9: error: no suitable method found for sort(List<Object>)
Collections.sort(list);
^
method Collections.<T#1>sort(List<T#1>) is not applicable
(inference variable T#1 has incompatible bounds
equality constraints: Object
lower bounds: Comparable<? super T#1>)
method Collections.<T#2>sort(List<T#2>,Comparator<? super T#2>) is not applicable
(cannot infer type-variable(s) T#2
(actual and formal argument lists differ in length))
where T#1,T#2 are type-variables:
T#1 extends Comparable<? super T#1> declared in method <T#1>sort(List<T#1>)
T#2 extends Object declared in method <T#2>sort(List<T#2>,Comparator<? super T#2>)
1 error
$
</code></pre><p>So how does this work? What is a Java interface? What are its advantages
or disadvantages?</p>
<p>Well, Java has two types of values: primitives on the one hand, and
object references on the other. In order to use interfaces, or polymorphism
at all, we must be dealing with objects. For primitives, there are separate
methods for sorting various types of arrays in the <code>Arrays</code> class. As
primitives cannot be stored directly in collections, <code>Collections</code>
doesn’t have to deal with them.</p>
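<p>To make the split concrete, here is a minimal sketch of the two paths. The class name <code>PrimitiveVsBoxedSort</code> and its helper methods are my own, made up for illustration; only <code>Arrays.sort</code> and <code>Collections.sort</code> come from the standard library. The first helper sorts a raw <code>int[]</code> through the dedicated overload, while the second boxes each value into an <code>Integer</code> so that the <code>Comparable</code> machinery applies:</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; the name and helpers are illustrative only.
class PrimitiveVsBoxedSort {
    // Primitive path: Arrays.sort(int[]) is a dedicated overload,
    // so no interface or Comparable lookup is involved.
    static int[] sortPrimitives(int[] values) {
        int[] copy = Arrays.copyOf(values, values.length);
        Arrays.sort(copy);
        return copy;
    }

    // Object path: each int is boxed into an Integer, which
    // implements Comparable<Integer>, so Collections.sort accepts it.
    static List<Integer> sortBoxed(int[] values) {
        List<Integer> list = new ArrayList<>();
        for (int v : values) {
            list.add(v); // autoboxing: int -> Integer
        }
        Collections.sort(list);
        return list;
    }

    public static void main(String[] args) {
        int[] input = {3, 1, 2};
        System.out.println(Arrays.toString(sortPrimitives(input)));
        System.out.println(sortBoxed(input));
    }
}
```

<p>Both lines print <code>[1, 2, 3]</code>, but only the second went through an interface to get there.</p>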
<p>So, to use this polymorphism through interfaces, we must be dealing with
objects. Objects in Java are a rich, standardized data structure, which
is why it’s possible to query at run-time which interfaces an object
supports. Objects contain not just the fields that the Java programmer
specifies, but additional metadata that includes implementations of
any supported interfaces, including <code>Comparable</code>. That metadata can be
used to find the right version of the <code>compareTo</code> method to use to sort
objects of type <code>T</code>. Once we have a <code>T</code>, we can query it at run-time
to find the <code>compareTo</code> method. Theoretically, Java might query every
object separately as it sorts, with a separate query for each comparison,
although I trust that modern Java will in many cases realize that the
method will be the same for each object, and figure out a way to optimize
it out.</p>
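<p>Here is a rough sketch of what that run-time query looks like from the programmer’s side. The class <code>RuntimeQueryDemo</code>, the nested <code>Opaque</code> type, and both helper methods are hypothetical names of mine. The point is only that <code>instanceof</code> asks the object itself which interfaces it supports, and that a call through a <code>Comparable</code> reference finds the right <code>compareTo</code> from the object’s own metadata:</p>

```java
import java.util.List;

// Hypothetical demo class; all names here are illustrative only.
class RuntimeQueryDemo {
    // A type that deliberately does not implement Comparable.
    static class Opaque { }

    // Ask the object itself, at run-time, whether it supports Comparable.
    static boolean supportsComparison(Object obj) {
        return obj instanceof Comparable;
    }

    // The static type is only Comparable<T>; which compareTo actually
    // runs is looked up from the object at run-time.
    static <T> int compareDynamically(Comparable<T> left, T right) {
        return left.compareTo(right);
    }

    public static void main(String[] args) {
        List<Object> things = List.of("Hi", 1, new Opaque());
        for (Object thing : things) {
            System.out.println(thing.getClass().getSimpleName()
                    + " comparable? " + supportsComparison(thing));
        }
        // Dispatches to String.compareTo: negative, since "a" sorts before "b".
        System.out.println(compareDynamically("a", "b"));
        // Dispatches to Integer.compareTo: positive, since 2 is greater than 1.
        System.out.println(compareDynamically(Integer.valueOf(2), Integer.valueOf(1)));
    }
}
```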
<p>As a programmer of a type, we simply declare at the top of the class that
our type <code>Foo</code>, for example, <code>implements Comparable<Foo></code>, and then lower
down include our implementation of <code>compareTo</code> among our methods, marked with the
<code>@Override</code> annotation. Based on that, <code>Foo</code> objects will be created with
the correct metadata such that Java will know to use that method for
comparison when sorting, whether the type is known at compile-time
or at run-time. We can implement our own version of <code>compareTo</code>
that orders values differently from the typical “natural ordering” one
would expect from the state that is contained in a <code>Foo</code>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-java" data-lang="java"><span style="display:flex;"><span><span style="color:#f92672">import</span> java.util.*<span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">public</span> <span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Sort</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">private</span> <span style="color:#66d9ef">static</span> <span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Foo</span> <span style="color:#66d9ef">implements</span> Comparable<span style="color:#f92672"><</span>Foo<span style="color:#f92672">></span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">int</span> inner<span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">public</span> <span style="color:#a6e22e">Foo</span><span style="color:#f92672">(</span><span style="color:#66d9ef">int</span> inner<span style="color:#f92672">)</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">this</span><span style="color:#f92672">.</span><span style="color:#a6e22e">inner</span> <span style="color:#f92672">=</span> inner<span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#a6e22e">@Override</span> <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">int</span> <span style="color:#a6e22e">compareTo</span><span style="color:#f92672">(</span>Foo foo<span style="color:#f92672">)</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// Less and greater are swapped by this compared to int
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#75715e">// comparison
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#66d9ef">if</span> <span style="color:#f92672">(</span>foo<span style="color:#f92672">.</span><span style="color:#a6e22e">inner</span> <span style="color:#f92672">></span> <span style="color:#66d9ef">this</span><span style="color:#f92672">.</span><span style="color:#a6e22e">inner</span><span style="color:#f92672">)</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> 1<span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span> <span style="color:#66d9ef">else</span> <span style="color:#66d9ef">if</span> <span style="color:#f92672">(</span>foo<span style="color:#f92672">.</span><span style="color:#a6e22e">inner</span> <span style="color:#f92672"><</span> <span style="color:#66d9ef">this</span><span style="color:#f92672">.</span><span style="color:#a6e22e">inner</span><span style="color:#f92672">)</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> <span style="color:#f92672">-</span>1<span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span> <span style="color:#66d9ef">else</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> 0<span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">public</span> String <span style="color:#a6e22e">toString</span><span style="color:#f92672">()</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> <span style="color:#e6db74">""</span> <span style="color:#f92672">+</span> inner<span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">static</span> <span style="color:#66d9ef">void</span> <span style="color:#a6e22e">main</span><span style="color:#f92672">(</span>String<span style="color:#f92672">[]</span> args<span style="color:#f92672">)</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> List<span style="color:#f92672"><</span>Foo<span style="color:#f92672">></span> list <span style="color:#f92672">=</span> <span style="color:#66d9ef">new</span> ArrayList<span style="color:#f92672"><</span>Foo<span style="color:#f92672">>();</span>
</span></span><span style="display:flex;"><span> list<span style="color:#f92672">.</span><span style="color:#a6e22e">add</span><span style="color:#f92672">(</span><span style="color:#66d9ef">new</span> Foo<span style="color:#f92672">(</span>3<span style="color:#f92672">));</span>
</span></span><span style="display:flex;"><span> list<span style="color:#f92672">.</span><span style="color:#a6e22e">add</span><span style="color:#f92672">(</span><span style="color:#66d9ef">new</span> Foo<span style="color:#f92672">(</span>4<span style="color:#f92672">));</span>
</span></span><span style="display:flex;"><span> list<span style="color:#f92672">.</span><span style="color:#a6e22e">add</span><span style="color:#f92672">(</span><span style="color:#66d9ef">new</span> Foo<span style="color:#f92672">(</span>1<span style="color:#f92672">));</span>
</span></span><span style="display:flex;"><span> list<span style="color:#f92672">.</span><span style="color:#a6e22e">add</span><span style="color:#f92672">(</span><span style="color:#66d9ef">new</span> Foo<span style="color:#f92672">(</span>2<span style="color:#f92672">));</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> Collections<span style="color:#f92672">.</span><span style="color:#a6e22e">sort</span><span style="color:#f92672">(</span>list<span style="color:#f92672">);</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">for</span> <span style="color:#f92672">(</span>Object i <span style="color:#f92672">:</span> list<span style="color:#f92672">)</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> System<span style="color:#f92672">.</span><span style="color:#a6e22e">out</span><span style="color:#f92672">.</span><span style="color:#a6e22e">println</span><span style="color:#f92672">(</span>i<span style="color:#f92672">);</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">}</span>
</span></span></code></pre></div><p>Here is the output:</p>
<pre tabindex="0"><code>$ java Sort
4
3
2
1
$
</code></pre><p>Built-in types such as <code>String</code> and <code>Integer</code> already provide
their own <code>compareTo</code> override methods, corresponding to more
typical implementations of comparisons. Only the author of each
type can provide information on how the types are to be compared
in this way. To get around this, you can use a wrapper type for
each element (like <code>Foo</code>), or you have to fall back on passing in
the comparison function the old-fashioned way, like in <code>qsort</code> –
though in Java passing in a function is accomplished here through
yet another interface, <code>Comparator</code>, as in <a href="https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/Collections.html#sort%28java.util.List,java.util.Comparator%29">this alternative
function</a>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-java" data-lang="java"><span style="display:flex;"><span><span style="color:#66d9ef">public</span> <span style="color:#66d9ef">static</span> <span style="color:#f92672"><</span>T<span style="color:#f92672">></span> <span style="color:#66d9ef">void</span> <span style="color:#a6e22e">sort</span><span style="color:#f92672">(</span>List<span style="color:#f92672"><</span>T<span style="color:#f92672">></span> list<span style="color:#f92672">,</span>
</span></span><span style="display:flex;"><span> Comparator<span style="color:#f92672"><?</span> <span style="color:#66d9ef">super</span> T<span style="color:#f92672">></span> c<span style="color:#f92672">)</span>
</span></span></code></pre></div><p>Here, <code>Comparator</code> is effectively a function pointer with context, but
it’s expressed as an interface so that you can write a concrete class
that implements the desired function. Fundamentally, Rust and C++
do something similar.</p>
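<p>As a sketch of what that looks like in practice, here is a small concrete <code>Comparator</code>. The names <code>ComparatorDemo</code>, <code>ByLength</code>, and <code>sortedCopy</code> are mine and purely illustrative. The <code>longestFirst</code> field plays the role of the “context” that a bare C function pointer would not carry, and the lambda at the end shows the shorthand available since Java 8:</p>

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class; all names here are illustrative only.
class ComparatorDemo {
    // A concrete class standing in for the comparison "function".
    // Its field is the context that travels along with the function.
    static class ByLength implements Comparator<String> {
        private final boolean longestFirst;

        ByLength(boolean longestFirst) {
            this.longestFirst = longestFirst;
        }

        @Override
        public int compare(String a, String b) {
            // Integer.compare returns -1, 0, or 1, so negating it is safe.
            int result = Integer.compare(a.length(), b.length());
            return longestFirst ? -result : result;
        }
    }

    // Sorts a copy, passing the ordering in once, as with qsort.
    static List<String> sortedCopy(List<String> input,
                                   Comparator<? super String> order) {
        List<String> copy = new ArrayList<>(input);
        Collections.sort(copy, order);
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = List.of("ccc", "a", "bb");
        System.out.println(sortedCopy(words, new ByLength(false)));
        System.out.println(sortedCopy(words, new ByLength(true)));
        // A lambda can stand in for the class: reverse natural order.
        System.out.println(sortedCopy(words, (a, b) -> b.compareTo(a)));
    }
}
```

<p>Note that <code>String</code> itself never had to know about any of these orderings; the comparison is supplied once, from outside, rather than looked up on each element.</p>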
<p>So, how are we to evaluate this system? It’s not particularly designed
for situations like sorting. The run-time system is built for
heterogeneous containers, where each individual element of a collection
might have a different opinion on how to compare itself to the others.
That amount of run-time flexibility is overkill for the situation.</p>
<p>Rather than providing one sorting function pointer, as in the C example,
each object comes with its own infrastructure for finding out how to
not only sort, but do every other thing that Java might want to do
polymorphically with that object, such as convert it to a string, or
hash. While the infrastructure is well-optimized and performant for the
assumption of heavy use of OOP-style polymorphism, it clearly doesn’t hold
to the C++ or Rust performance ideals of not paying for what you don’t
use, instead opting to pay an up-front cost under the assumption that
any and all objects will regularly be used polymorphically, in OOP style.</p>
<p>The type system in Java is conceptualized as a way of preventing
errors, a layer of safety on top of a more Smalltalk-like natural
OOP state. In Smalltalk any method can be invoked on any object,
and it’s simply a run-time error if that method isn’t available. In
Java, the types form a more rigorous layer to check to make
sure our method calls have correct semantics, allowing errors
to be caught earlier, at compile-time (although Java type errors
are also sometimes caught at run-time). The power of the more ideologically
pure form of OOP is still available in Java, as evidenced by the
signature of the <code>Arrays.sort</code> method alluded to above (and <a href="https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/Arrays.html#sort%28java.lang.Object%5B%5D%29">documented
here</a>).
This style is discouraged, but still possible:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-java" data-lang="java"><span style="display:flex;"><span><span style="color:#66d9ef">public</span> <span style="color:#66d9ef">static</span> <span style="color:#66d9ef">void</span> <span style="color:#a6e22e">sort</span><span style="color:#f92672">(</span>Object<span style="color:#f92672">[]</span> a<span style="color:#f92672">)</span>
</span></span></code></pre></div><p>Here is a use case that succeeds:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-java" data-lang="java"><span style="display:flex;"><span><span style="color:#f92672">import</span> java.util.*<span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">public</span> <span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Sort</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">static</span> <span style="color:#66d9ef">void</span> <span style="color:#a6e22e">main</span><span style="color:#f92672">(</span>String<span style="color:#f92672">[]</span> args<span style="color:#f92672">)</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> Arrays<span style="color:#f92672">.</span><span style="color:#a6e22e">sort</span><span style="color:#f92672">(</span>args<span style="color:#f92672">);</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">for</span> <span style="color:#f92672">(</span>String arg <span style="color:#f92672">:</span> args<span style="color:#f92672">)</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> System<span style="color:#f92672">.</span><span style="color:#a6e22e">out</span><span style="color:#f92672">.</span><span style="color:#a6e22e">println</span><span style="color:#f92672">(</span>arg<span style="color:#f92672">);</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">}</span>
</span></span></code></pre></div><p>Here is the output:</p>
<pre tabindex="0"><code>$ java Sort a c b
a
b
c
$
</code></pre><p>Here is a use that fails:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-java" data-lang="java"><span style="display:flex;"><span><span style="color:#f92672">import</span> java.util.*<span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">public</span> <span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Sort</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">static</span> <span style="color:#66d9ef">void</span> <span style="color:#a6e22e">main</span><span style="color:#f92672">(</span>String<span style="color:#f92672">[]</span> args<span style="color:#f92672">)</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> Object <span style="color:#f92672">[]</span> array <span style="color:#f92672">=</span> <span style="color:#66d9ef">new</span> Object<span style="color:#f92672">[</span>2<span style="color:#f92672">];</span>
</span></span><span style="display:flex;"><span> array<span style="color:#f92672">[</span>0<span style="color:#f92672">]</span> <span style="color:#f92672">=</span> Integer<span style="color:#f92672">.</span><span style="color:#a6e22e">valueOf</span><span style="color:#f92672">(</span>0<span style="color:#f92672">);</span>
</span></span><span style="display:flex;"><span> array<span style="color:#f92672">[</span>1<span style="color:#f92672">]</span> <span style="color:#f92672">=</span> <span style="color:#e6db74">"Hi"</span><span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span> Arrays<span style="color:#f92672">.</span><span style="color:#a6e22e">sort</span><span style="color:#f92672">(</span>array<span style="color:#f92672">);</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">for</span> <span style="color:#f92672">(</span>Object obj <span style="color:#f92672">:</span> array<span style="color:#f92672">)</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> System<span style="color:#f92672">.</span><span style="color:#a6e22e">out</span><span style="color:#f92672">.</span><span style="color:#a6e22e">println</span><span style="color:#f92672">(</span>obj<span style="color:#f92672">);</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">}</span>
</span></span></code></pre></div><p>It outputs:</p>
<pre tabindex="0"><code>Exception in thread "main" java.lang.ClassCastException: class java.lang.Integer cannot be cast to class java.lang.String (java.lang.Integer and java.lang.String are in module java.base of loader 'bootstrap')
at java.base/java.lang.String.compareTo(String.java:125)
at java.base/java.util.ComparableTimSort.countRunAndMakeAscending(ComparableTimSort.java:320)
at java.base/java.util.ComparableTimSort.sort(ComparableTimSort.java:188)
at java.base/java.util.Arrays.sort(Arrays.java:1249)
at Sort.main(Sort.java:8)
</code></pre><p>The cost of this is acceptable in Java but not in Rust or C++, or C for
that matter. Every object must contain individual metadata if it is to
be sortable through a polymorphic function, and it must be boxed. In C++
or Rust, we must be able to sort arbitrary unboxed data, without extra
metadata included directly within it. But in Java, all types except for
primitives are boxed, only boxed types support polymorphism, and they do
so at the cost of additional data in each heap allocation. And
it works for Java’s goals of being a garbage-collected OOP language
with a layer of types to expose errors at compile-time.</p>
<p>As the C example shows, this cost isn’t intrinsic to run-time polymorphism
in general, but it is intrinsic to OOP-style polymorphism. OOP uses
run-time polymorphism at an individual object level as one of its
core features, even when the function does not need to be conveyed on
a per-element basis, but only once.</p>
<h1 id="programming-language-2-sorting-in-c">Programming Language #2: Sorting in C++</h1>
<p>C++, of course, supports this type of run-time polymorphism. We could,
if we wanted, build a system like Java’s, where we had an abstract class
<code>Comparable</code> that we could use to add run-time data telling every object
of a type how to be compared with every other object. We could require
that collections to be sorted contain objects of classes that inherit from – in C++,
inheritance and interface implementation are the same – <code>Comparable</code>.
C++’s run-time polymorphism could be used to implement sorting in the
exact same way as Java.</p>
<p>But that’s not how sorting is implemented in C++. Sorting, in C++,
uses a completely unrelated mechanism of templates. Templates are C++’s
mechanism for static, compile-time polymorphism, just as virtual functions
and inheritance are C++’s mechanism for dynamic, run-time polymorphism
(of a classical OOP variety that closely resembles Java). In spite of
them both being forms of polymorphism, and having many overlapping use
cases, templates and virtual functions are completely unrelated features.</p>
<p>I have seen people argue that templates and virtual functions are
justified in being completely unrelated, because every situation clearly
calls for one or the other. But if it’s possible to do sorting with
run-time polymorphism, as we see from Java, then clearly the distinction
is not as clear-cut as all that. What’s to stop a former Java programmer
from using C++’s run-time polymorphism to implement their own sorting
function a la Java, even though that’s not idiomatic C++? There’s clearly
some level of overlap in use cases, even if not in semantics!</p>
<p>So, how do templates actually work?</p>
<blockquote>
<p><strong>Caveat for modern C++ fans</strong>: I’m going to save concepts for the end. They
don’t actually substantially affect my point (as I will explain).
I think it’s simpler to talk about pre-concepts C++ at first, and then
discuss how concepts impact (or rather, don’t really impact) the equation.</p>
</blockquote>
<p>Templates are a form of macro system. A template (class template, function
template, type alias template, etc.) is given parameters at compile-time.
Once the template is given parameters, it is <em>instantiated</em> and stamps out
a concrete component of the program (a class, function, type alias, etc.).</p>
<p>So, that’s quite abstract. This is a situation where an example can
help a lot. In line with our theme, we’re going to write a template
that involves comparisons: given two values of any type that you can
compare (and we’ll have to decide what that means), which is bigger?</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-C++" data-lang="C++"><span style="display:flex;"><span><span style="color:#66d9ef">template</span> <span style="color:#f92672"><</span><span style="color:#66d9ef">typename</span> T<span style="color:#f92672">></span>
</span></span><span style="display:flex;"><span>T max_value(T a, T b) {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> (a <span style="color:#f92672"><</span> b) {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> b;
</span></span><span style="display:flex;"><span> } <span style="color:#66d9ef">else</span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> a;
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>When we actually invoke it, we provide a type for <code>T</code>, giving us
a specialized function where <code>T</code> is replaced by that type.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span>std<span style="color:#f92672">::</span>cout <span style="color:#f92672"><<</span> max_value<span style="color:#f92672"><</span><span style="color:#66d9ef">int</span><span style="color:#f92672">></span>(<span style="color:#ae81ff">3</span>, <span style="color:#ae81ff">4</span>) <span style="color:#f92672"><<</span> std<span style="color:#f92672">::</span>endl;
</span></span><span style="display:flex;"><span>std<span style="color:#f92672">::</span>cout <span style="color:#f92672"><<</span> max_value<span style="color:#f92672"><</span>std<span style="color:#f92672">::</span>string<span style="color:#f92672">></span>(<span style="color:#e6db74">"hi"</span>s, <span style="color:#e6db74">"hello"</span>s) <span style="color:#f92672"><<</span> std<span style="color:#f92672">::</span>endl;
</span></span></code></pre></div><p>The mere mention of <code>max_value<int></code> creates a function <code>max_value<int></code>,
and likewise for <code>max_value<std::string></code>. Each such function is the template’s body
with the type in angle brackets substituted for <code>T</code>.</p>
<p>Of course, for function templates, specifying the <code>T</code> is optional,
as C++ can infer it, so this code works equally well:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span>std<span style="color:#f92672">::</span>cout <span style="color:#f92672"><<</span> max_value(<span style="color:#ae81ff">3</span>, <span style="color:#ae81ff">4</span>) <span style="color:#f92672"><<</span> std<span style="color:#f92672">::</span>endl;
</span></span><span style="display:flex;"><span>std<span style="color:#f92672">::</span>cout <span style="color:#f92672"><<</span> max_value(<span style="color:#e6db74">"hi"</span>s, <span style="color:#e6db74">"hello"</span>s) <span style="color:#f92672"><<</span> std<span style="color:#f92672">::</span>endl;
</span></span></code></pre></div><p>So, what are the resulting functions? It’s much as if
we had written:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-C++" data-lang="C++"><span style="display:flex;"><span><span style="color:#66d9ef">int</span> <span style="color:#a6e22e">max_value</span>(<span style="color:#66d9ef">int</span> a, <span style="color:#66d9ef">int</span> b) {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> (a <span style="color:#f92672"><</span> b) {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> b;
</span></span><span style="display:flex;"><span> } <span style="color:#66d9ef">else</span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> a;
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>std<span style="color:#f92672">::</span>string max_value(std<span style="color:#f92672">::</span>string a, std<span style="color:#f92672">::</span>string b) {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> (a <span style="color:#f92672"><</span> b) {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> b;
</span></span><span style="display:flex;"><span> } <span style="color:#66d9ef">else</span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> a;
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>These are separate functions. The compiler will simply generate as many
separate versions of <code>max_value</code> as it needs to. It outputs separate
assembly language for each of them, and treats them as function overloads,
meaning that it uses the static (compile-time) type of the parameters
to figure out which function to call.</p>
<p>So, from the perspective of someone reading the code, we call
<code>max_value</code> twice, and it figures out how to do its thing on an <code>int</code>
or a <code>std::string</code>. It’s polymorphic, as it does the same algorithm
(finding max) with an operation that changes based on type (<code><</code>). But
from the perspective of someone reading the outputted assembly, it’s
not polymorphic – we’ve simply got two different functions that do
<code>max_value</code> in two different ways.</p>
<p>In other words, we’ve gone from polymorphic code (compile time) to
monomorphic code (run time). This is why Rust calls its equivalent to
template instantiation “monomorphization.” This is also why it’s called
“compile time polymorphism” – it is no longer polymorphic at run-time.</p>
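<p>As a rough illustration, here is a sketch of the same function in Rust. The name <code>max_value</code> mirrors the C++ template above; the <code>PartialOrd</code> bound and the function-pointer annotations are choices made for this sketch, not something the C++ version needs. Each instantiation can be coerced to an ordinary, non-generic function pointer, which suggests that after monomorphization they are simply separate functions:</p>

```rust
// A generic function; the compiler emits one copy per concrete `T` used.
fn max_value<T: PartialOrd>(a: T, b: T) -> T {
    if a < b { b } else { a }
}

fn main() {
    // Each instantiation coerces to a plain, monomorphic function pointer.
    let max_int: fn(i32, i32) -> i32 = max_value::<i32>;
    let max_string: fn(String, String) -> String = max_value::<String>;

    println!("{}", max_int(3, 4)); // prints 4
    println!("{}", max_string("hi".to_string(), "hello".to_string())); // prints hi
}
```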
<p>The advantage: This is a zero-overhead abstraction. We’re having the
compiler write, on our behalf, specialized code for each type. We
do not need each element to have virtual function metadata to
indicate how to do comparisons, nor do we even need a function
pointer like with <code>qsort</code>. It’s as optimal as specialized hand-written
code, but we didn’t have to do the specialization.</p>
<p>The disadvantage: We have to know the type at compile-time. This
rules out heterogeneous containers with this
style of polymorphism, which can only be based
on the compile-time type, not on changing run-time types.
It is the exact opposite of “late binding” – the binding is done at
compile-time. So, this could not be used for polymorphism over different
types of widgets in a list of widgets.</p>
<p>The other disadvantage: Compile times take longer and the resultant
binary is larger. (Eh, shrug.)</p>
<p>So what operations are needed to support this template? What definition
are we using for “comparable type” for <code>T</code>? We’re not explicitly using
any at all, but note that if the type <code>T</code> doesn’t support the <code><</code> operator,
this code will simply fail to compile:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Foo</span> {
</span></span><span style="display:flex;"><span>};
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>max_value(Foo{}, Foo{});
</span></span></code></pre></div><p>Giving the error:</p>
<pre tabindex="0"><code>test.cpp: In instantiation of ‘T max_value(T, T) [with T = main()::Foo]’:
test.cpp:22:14: required from here
test.cpp:8:11: error: no match for ‘operator<’ (operand types are ‘main()::Foo’ and ‘main()::Foo’)
8 | if (a < b) {
| ~~^~~
</code></pre><p>This goes away if we give it the <code><</code> operator.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-C++" data-lang="C++"><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Foo</span> {
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">public</span><span style="color:#f92672">:</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">bool</span> <span style="color:#66d9ef">operator</span> <span style="color:#f92672"><</span>(<span style="color:#66d9ef">const</span> Foo <span style="color:#f92672">&</span>other) <span style="color:#66d9ef">const</span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> false; <span style="color:#75715e">// All Foos are created equal!
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> }
</span></span><span style="display:flex;"><span>};
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>max_value(Foo{}, Foo{}); <span style="color:#75715e">// Now compiles
</span></span></span></code></pre></div><p>If we’d written <code>max_value</code> differently, using <code>></code> instead,
defining <code><</code> would not have made the error message go away. By convention, though, <code><</code>
is the operator to use for comparisons: it is the C++
equivalent to Java’s <code>Comparable</code>, the function that defines the “natural
order.”</p>
<p>Is that all that’s required to make <code>max_value</code> work? It turns out no,
as many an astute C++ programmer has probably already noticed. There is
another operation besides <code>operator<</code> required to make <code>max_value</code> work,
and this is because I intentionally made a mistake (so I could reveal
it later to show how subtle templates can be).</p>
<p>Let’s take a look at the instantiation for <code>std::string</code> again,
just the signature:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-C++" data-lang="C++"><span style="display:flex;"><span>std<span style="color:#f92672">::</span>string max_value(std<span style="color:#f92672">::</span>string a, std<span style="color:#f92672">::</span>string b);
</span></span></code></pre></div><p>Is that how we’d write <code>max_value</code> by hand for <code>std::string</code>? No, we
wouldn’t. We’d write <code>const std::string &a</code>, and take it by reference,
so that no new objects are initialized in the comparison and return.
If you’re not a C++ programmer, this might seem shocking, but <code>max_value</code>
as we wrote it requires the type to be passable by value, which is a
capability that a type might not have:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-C++" data-lang="C++"><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Foo</span> {
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">public</span><span style="color:#f92672">:</span>
</span></span><span style="display:flex;"><span> Foo() <span style="color:#f92672">=</span> <span style="color:#66d9ef">default</span>;
</span></span><span style="display:flex;"><span> Foo(<span style="color:#66d9ef">const</span> Foo<span style="color:#f92672">&</span>) <span style="color:#f92672">=</span> <span style="color:#66d9ef">delete</span>;
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">bool</span> <span style="color:#66d9ef">operator</span> <span style="color:#f92672"><</span>(<span style="color:#66d9ef">const</span> Foo<span style="color:#f92672">&</span> other) <span style="color:#66d9ef">const</span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> false; <span style="color:#75715e">// All Foos are created equal
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> }
</span></span><span style="display:flex;"><span>};
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>max_value(Foo{}, Foo{}); <span style="color:#75715e">// Error! Error!
</span></span></span></code></pre></div><p>So, we missed the mark, quite by accident! We had an
extra requirement besides comparison, and we can fix that
by taking the value by (<code>const</code>) reference (which is what
<a href="https://en.cppreference.com/w/cpp/algorithm/max"><code>std::max</code></a> does
anyway), which also implies returning by reference:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-C++" data-lang="C++"><span style="display:flex;"><span><span style="color:#66d9ef">template</span> <span style="color:#f92672"><</span><span style="color:#66d9ef">typename</span> T<span style="color:#f92672">></span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">const</span> T <span style="color:#f92672">&</span>max_value(<span style="color:#66d9ef">const</span> T <span style="color:#f92672">&</span>a, <span style="color:#66d9ef">const</span> T <span style="color:#f92672">&</span>b) {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> (a <span style="color:#f92672"><</span> b) {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> b;
</span></span><span style="display:flex;"><span> } <span style="color:#66d9ef">else</span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> a;
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>So what was required from <code>T</code> for us to call <code>max_value</code>?</p>
<p>In one sense, nothing besides that it should be a type! We could pass any
type in for <code>T</code>, and the compiler will plug in the type and chug away,
running into errors only once it has attempted to do so! This might
actually happen several template instantiations deep, and the resulting
error shows up in the template where the operation is attempted, not
where you used the template with an inappropriate type, which can
be confusing.</p>
<p>In another sense, what is required is that we pass types that make
<code>max_value</code> compile, so in this case, ones that support <code>operator <</code>.
However, there is no guarantee or check that the type keeps the
semantic promises that go with that operator. Sorting, for example,
requires that the operator define a strict
weak ordering. If it doesn’t, <code>std::sort</code>
will compile, but its behavior is undefined.</p>
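<p>The semantic requirement that sorting places on a comparison is usually called a strict weak ordering. To make it concrete, here is a small brute-force sketch (in Rust, like the later examples in this post) that checks the axioms for a candidate “less than” over a sample of values. The function name <code>is_strict_weak_order</code> is hypothetical, and passing on a sample proves nothing about values outside it:</p>

```rust
/// Checks the strict-weak-ordering axioms for `less` over every
/// combination of the sample values (cubic in the sample size).
fn is_strict_weak_order<T>(sample: &[T], less: impl Fn(&T, &T) -> bool) -> bool {
    // Two values are "equivalent" when neither is less than the other.
    let equiv = |a: &T, b: &T| !less(a, b) && !less(b, a);
    for a in sample {
        // Irreflexivity: nothing is less than itself.
        if less(a, a) {
            return false;
        }
        for b in sample {
            // Asymmetry: a < b and b < a cannot both hold.
            if less(a, b) && less(b, a) {
                return false;
            }
            for c in sample {
                // Transitivity of "less than".
                if less(a, b) && less(b, c) && !less(a, c) {
                    return false;
                }
                // Transitivity of equivalence.
                if equiv(a, b) && equiv(b, c) && !equiv(a, c) {
                    return false;
                }
            }
        }
    }
    true
}

fn main() {
    let sample: [i32; 4] = [1, 2, 2, 3];
    // `<` on integers qualifies.
    println!("{}", is_strict_weak_order(&sample, |a, b| a < b)); // prints true
    // `<=` does not: it is not irreflexive.
    println!("{}", is_strict_weak_order(&sample, |a, b| a <= b)); // prints false
}
```

<p>A comparison that fails a check like this would still compile when handed to a sort, which is exactly the problem: the type system sees only the signature.</p>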
<p>It seems reasonable in this case to expect people to use <code>operator <</code>
for less-than as it’s such a well-established and fundamental operator.
But templates can also invoke named methods. What if somebody writes
a template that calls <code>some_t.foo()</code> expecting it to do one thing,
and someone calls that template with an unrelated class that has a
type-compatible <code>foo</code> method, but with different semantics? There is no
indication to the compiler, when you write the class, that you intend
for <code>foo</code> to be appropriate for use in the template. We didn’t have to
say, when we wrote <code>Foo</code> here, that our <code>operator <</code> was valid for
<code>std::sort</code>.</p>
<p>Concepts do help with that. You can statically assert that a class
supports a concept’s requirements, and that documents your intention
to support it semantically as well. Concepts can also cover stricter
requirements than a template incidentally imposes, and help document
the semantics of templates.</p>
<p>But everything about concepts is opt-in; you can always write a template
that will sometimes fail on instantiation. And that makes them much less
useful in my book. Don’t get me wrong: I’m glad they exist. I think C++
with concepts is better than C++ without concepts. But it only goes so
far, especially when compared with Rust traits, which are mandatory for
Rust’s form of compile-time polymorphism.</p>
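<p>For a sense of what “mandatory” means in practice, here is a minimal Rust sketch. The names <code>Describe</code>, <code>Invoice</code>, <code>Rocket</code>, and <code>call_foo</code> are all made up for illustration. A generic function can only call what its trait bound promises, and a type only satisfies the bound if someone wrote an explicit <code>impl</code> for it; merely having a method with the right name does not count:</p>

```rust
// Hypothetical trait: the only thing `call_foo` is allowed to rely on.
trait Describe {
    fn foo(&self) -> String;
}

struct Invoice(u32);

// An explicit opt-in: we declare that `Invoice::foo` has `Describe`'s meaning.
impl Describe for Invoice {
    fn foo(&self) -> String {
        format!("invoice #{}", self.0)
    }
}

struct Rocket;

impl Rocket {
    // Same name and shape, unrelated semantics, and no `impl Describe`.
    fn foo(&self) -> String {
        "launching!".to_string()
    }
}

// Type-checked once, against the bound, not once per instantiation.
fn call_foo<T: Describe>(value: &T) -> String {
    value.foo()
}

fn main() {
    println!("{}", call_foo(&Invoice(42))); // prints invoice #42
    println!("{}", Rocket.foo()); // prints launching!
    // call_foo(&Rocket); // rejected: `Rocket` does not implement `Describe`
}
```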
<p>More relevant than all of this, to me, is that templates and OOP work
so differently from each other. Run-time polymorphism and compile-time
polymorphism are just completely different beasts. Students are taught
the OOP style run-time polymorphism, and that doesn’t really help them
understand templates, or even get started doing so. Again, I feel C++
is too big.</p>
<p>But, at least it has this zero-overhead abstraction, without requiring
a method look-up and an indirection for every item to be sorted.</p>
<p><code>std::sort</code>, by the way, takes iterators. These iterators must meet
the <em>ValueSwappable</em> and <em>LegacyRandomAccessIterator</em> requirements, and that’s just a
subset of the requirements, as seen in <code>std::sort</code>’s <a href="https://en.cppreference.com/w/cpp/algorithm/sort">CPPReference
page</a>. The way to
get from one element to another (and therefore implicitly the size),
the way to swap elements, and the way to compare them are all implicitly
derived from <code>RandomIt</code>, the type parameter specifying the type of the
iterator (at least in the overloads of <code>std::sort</code> that do not take an
explicit comparator).</p>
<h1 id="programming-language-3-sorting-in-haskell">Programming Language #3: Sorting in Haskell</h1>
<p>Now for Haskell!</p>
<p>We’re mostly talking about Haskell to move on to talking about Rust,
as this is a Rust-focused blog. There’s a lot going on with Haskell
typeclasses that I won’t have time to get into here.</p>
<p>Haskell is where Rust got traits from, although Haskell
calls them typeclasses. Incidentally, Haskell uses run-time polymorphism
where Rust uses compile-time polymorphism, but the semantics are more
similar than you might expect from that statement.</p>
<p>In Haskell, like Java, all types that <code>sort</code> accepts are boxed, which takes care of
size and swapping, two of the operations that might otherwise need to be customized.
Unlike Java, the operations we need to perform on values of this type
are passed to <code>sort</code> once, rather than looked up on a per-element basis.</p>
<p>Here is the type for <a href="https://hackage.haskell.org/package/base-4.19.0.0/docs/Data-List.html#v:sort"><code>sort</code></a>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-haskell" data-lang="haskell"><span style="display:flex;"><span><span style="color:#a6e22e">sort</span> <span style="color:#f92672">::</span> <span style="color:#66d9ef">Ord</span> a <span style="color:#f92672">=></span> [a] <span style="color:#f92672">-></span> [a]
</span></span></code></pre></div><p><code>a</code> here is like <code>T</code> in C++: a type variable that can be replaced
with any type. As in Java, this is subject to type erasure: <code>sort</code>
just operates on generic boxed values. Any comparison-specific operations
it needs come from the <code>Ord a =></code>, which constrains <code>a</code> to types that
have instances of the <code>Ord</code> typeclass.</p>
<p>Here is the definition of <code>Ord</code>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-haskell" data-lang="haskell"><span style="display:flex;"><span><span style="color:#66d9ef">class</span> (<span style="color:#66d9ef">Eq</span> a) <span style="color:#f92672">=></span> <span style="color:#66d9ef">Ord</span> a <span style="color:#66d9ef">where</span>
</span></span><span style="display:flex;"><span> compare <span style="color:#f92672">::</span> a <span style="color:#f92672">-></span> a <span style="color:#f92672">-></span> <span style="color:#66d9ef">Ordering</span>
</span></span><span style="display:flex;"><span> (<span style="color:#f92672"><</span>), (<span style="color:#f92672"><=</span>), (<span style="color:#f92672">></span>), (<span style="color:#f92672">>=</span>) <span style="color:#f92672">::</span> a <span style="color:#f92672">-></span> a <span style="color:#f92672">-></span> <span style="color:#66d9ef">Bool</span>
</span></span><span style="display:flex;"><span> max, min <span style="color:#f92672">::</span> a <span style="color:#f92672">-></span> a <span style="color:#f92672">-></span> a
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> compare x y <span style="color:#f92672">=</span> <span style="color:#66d9ef">if</span> x <span style="color:#f92672">==</span> y <span style="color:#66d9ef">then</span> <span style="color:#66d9ef">EQ</span>
</span></span><span style="display:flex;"><span> <span style="color:#75715e">-- NB: must be '<=' not '<' to validate the</span>
</span></span><span style="display:flex;"><span> <span style="color:#75715e">-- above claim about the minimal things that</span>
</span></span><span style="display:flex;"><span> <span style="color:#75715e">-- can be defined for an instance of Ord:</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">else</span> <span style="color:#66d9ef">if</span> x <span style="color:#f92672"><=</span> y <span style="color:#66d9ef">then</span> <span style="color:#66d9ef">LT</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">else</span> <span style="color:#66d9ef">GT</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> x <span style="color:#f92672"><</span> y <span style="color:#f92672">=</span> <span style="color:#66d9ef">case</span> compare x y <span style="color:#66d9ef">of</span> { <span style="color:#66d9ef">LT</span> <span style="color:#f92672">-></span> <span style="color:#66d9ef">True</span>; <span style="color:#66d9ef">_</span> <span style="color:#f92672">-></span> <span style="color:#66d9ef">False</span> }
</span></span><span style="display:flex;"><span> x <span style="color:#f92672"><=</span> y <span style="color:#f92672">=</span> <span style="color:#66d9ef">case</span> compare x y <span style="color:#66d9ef">of</span> { <span style="color:#66d9ef">GT</span> <span style="color:#f92672">-></span> <span style="color:#66d9ef">False</span>; <span style="color:#66d9ef">_</span> <span style="color:#f92672">-></span> <span style="color:#66d9ef">True</span> }
</span></span><span style="display:flex;"><span> x <span style="color:#f92672">></span> y <span style="color:#f92672">=</span> <span style="color:#66d9ef">case</span> compare x y <span style="color:#66d9ef">of</span> { <span style="color:#66d9ef">GT</span> <span style="color:#f92672">-></span> <span style="color:#66d9ef">True</span>; <span style="color:#66d9ef">_</span> <span style="color:#f92672">-></span> <span style="color:#66d9ef">False</span> }
</span></span><span style="display:flex;"><span> x <span style="color:#f92672">>=</span> y <span style="color:#f92672">=</span> <span style="color:#66d9ef">case</span> compare x y <span style="color:#66d9ef">of</span> { <span style="color:#66d9ef">LT</span> <span style="color:#f92672">-></span> <span style="color:#66d9ef">False</span>; <span style="color:#66d9ef">_</span> <span style="color:#f92672">-></span> <span style="color:#66d9ef">True</span> }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#75715e">-- These two default methods use '<=' rather than 'compare'</span>
</span></span><span style="display:flex;"><span> <span style="color:#75715e">-- because the latter is often more expensive</span>
</span></span><span style="display:flex;"><span> max x y <span style="color:#f92672">=</span> <span style="color:#66d9ef">if</span> x <span style="color:#f92672"><=</span> y <span style="color:#66d9ef">then</span> y <span style="color:#66d9ef">else</span> x
</span></span><span style="display:flex;"><span> min x y <span style="color:#f92672">=</span> <span style="color:#66d9ef">if</span> x <span style="color:#f92672"><=</span> y <span style="color:#66d9ef">then</span> x <span style="color:#66d9ef">else</span> y
</span></span></code></pre></div><p>It defines many methods that an instance of <code>Ord</code> can support.
These methods are functions defined in terms of each other; you
must specifically implement at least one of them for your type to prevent
infinite regress. Minimally, either <code>compare</code> or <code><=</code> is sufficient,
with <code>compare</code> <a href="https://hackage.haskell.org/package/base-4.1.0.0/docs/src/GHC-Classes.html#Ord">recommended</a> for more complex types.</p>
<p>Unlike in C++, when you define these methods, it is not enough to simply
define a function called <code><=</code> or <code>compare</code>. Haskell won’t even let you
define functions with the same fully qualified name as the methods,
which exist in the same namespace as any other functions. Unlike
C++, Haskell does not have function overloading, and any time the
same fully qualified name has different semantics for different types,
it is through this mechanism of typeclasses. Like in Java,
you have to explicitly declare your intention to implement the methods
as found in <code>Ord</code>, by writing an <code>instance</code> explicitly, like so:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-Haskell" data-lang="Haskell"><span style="display:flex;"><span><span style="color:#66d9ef">import</span> Data.Ord
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">import</span> Data.List
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">data</span> <span style="color:#66d9ef">Foo</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">Foo</span> <span style="color:#66d9ef">Integer</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">deriving</span> <span style="color:#66d9ef">Show</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">instance</span> <span style="color:#66d9ef">Eq</span> <span style="color:#66d9ef">Foo</span> <span style="color:#66d9ef">where</span>
</span></span><span style="display:flex;"><span> (<span style="color:#66d9ef">Foo</span> a) <span style="color:#f92672">==</span> (<span style="color:#66d9ef">Foo</span> b) <span style="color:#f92672">=</span> a <span style="color:#f92672">==</span> b
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">instance</span> <span style="color:#66d9ef">Ord</span> <span style="color:#66d9ef">Foo</span> <span style="color:#66d9ef">where</span>
</span></span><span style="display:flex;"><span> (<span style="color:#66d9ef">Foo</span> a) <span style="color:#f92672"><=</span> (<span style="color:#66d9ef">Foo</span> b) <span style="color:#f92672">=</span> b <span style="color:#f92672"><=</span> a
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">main</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">do</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> list <span style="color:#f92672">=</span> [<span style="color:#66d9ef">Foo</span> <span style="color:#ae81ff">3</span>, <span style="color:#66d9ef">Foo</span> <span style="color:#ae81ff">4</span>, <span style="color:#66d9ef">Foo</span> <span style="color:#ae81ff">2</span>]
</span></span><span style="display:flex;"><span> print <span style="color:#f92672">$</span> sort list <span style="color:#75715e">-- outputs [Foo 4,Foo 3,Foo 2]</span>
</span></span></code></pre></div><p>Note that the <code>instance</code> declarations are separate from the definition
of the type! The module where the type is declared can define them, but
so can the module where the typeclass is declared. Instances defined
anywhere else are called “orphan instances” and are discouraged (GHC can
warn about them), so that there is only one canonical
definition of an instance for a given type and typeclass.</p>
<p>How does this actually work then? Well, <code>Ord a</code> is a secret parameter
to <code>sort</code>. Haskell will create a bundle of function pointers for us
that represent the specific <code>Ord</code> instance for whatever type we pass to
<code>sort</code>, either from knowing the type statically at that point, or passing
along a bundle passed into whatever called <code>sort</code>. So this compiles to
something quite similar to the C <code>qsort</code> (at least as far as polymorphism
is concerned), taking in a comparison function. The big difference is,
Haskell will choose the comparison function for us – but it is one
comparison function, not one comparison function per item as in Java.</p>
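<p>To make the “secret parameter” idea concrete, here is a hand-written sketch of dictionary passing. It is written in Rust rather than Haskell, and it is only a model of what a Haskell compiler might generate, not GHC’s actual output. The names <code>OrdDict</code>, <code>FOO_ORD</code>, and <code>sort_with</code> are hypothetical, and the insertion sort is just the shortest algorithm that fits:</p>

```rust
// A hypothetical "dictionary": the bundle of operations for one instance.
struct OrdDict<T> {
    less_or_equal: fn(&T, &T) -> bool,
}

#[derive(Debug, PartialEq)]
struct Foo(i64);

// Same reversed ordering as the Haskell `instance Ord Foo` above.
fn foo_less_or_equal(a: &Foo, b: &Foo) -> bool {
    b.0 <= a.0
}

// Plays the role of the instance itself.
const FOO_ORD: OrdDict<Foo> = OrdDict {
    less_or_equal: foo_less_or_equal,
};

// The dictionary is passed once per call, not once per element.
fn sort_with<T>(dict: &OrdDict<T>, items: &mut [T]) {
    // Plain insertion sort; every comparison goes through the dictionary.
    for i in 1..items.len() {
        let mut j = i;
        while j > 0 && !(dict.less_or_equal)(&items[j - 1], &items[j]) {
            items.swap(j - 1, j);
            j -= 1;
        }
    }
}

fn main() {
    let mut list = vec![Foo(3), Foo(4), Foo(2)];
    sort_with(&FOO_ORD, &mut list);
    println!("{:?}", list); // prints [Foo(4), Foo(3), Foo(2)]
}
```

<p>The difference in Haskell is that nobody writes the <code>&FOO_ORD</code> argument by hand; the compiler picks the dictionary from the types.</p>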
<h1 id="programming-language-4-sorting-in-rust">Programming Language #4: Sorting in Rust</h1>
<p>So, how does Rust do all of this?</p>
<p>As I said, a Rust <code>trait</code> is very much
like a Haskell typeclass. Rust’s main <a href="https://doc.rust-lang.org/std/vec/struct.Vec.html#method.sort"><code>sort</code>
method</a>,
like Haskell’s, requires the <a href="https://doc.rust-lang.org/std/cmp/trait.Ord.html"><code>Ord</code> <del>typeclass</del>
trait</a>. Like Haskell’s <code>Ord</code>,
it even has provided (but overridable) methods as well as required
methods:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-Rust" data-lang="Rust"><span style="display:flex;"><span><span style="color:#66d9ef">pub</span> <span style="color:#66d9ef">trait</span> Ord: Eq <span style="color:#f92672">+</span> PartialOrd {
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// Required method
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">cmp</span>(<span style="color:#f92672">&</span>self, other: <span style="color:#66d9ef">&</span><span style="color:#a6e22e">Self</span>) -> <span style="color:#a6e22e">Ordering</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// Provided methods
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">max</span>(self, other: <span style="color:#a6e22e">Self</span>) -> <span style="color:#a6e22e">Self</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">where</span> Self: Sized { <span style="color:#f92672">..</span>. }
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">min</span>(self, other: <span style="color:#a6e22e">Self</span>) -> <span style="color:#a6e22e">Self</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">where</span> Self: Sized { <span style="color:#f92672">..</span>. }
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">clamp</span>(self, min: <span style="color:#a6e22e">Self</span>, max: <span style="color:#a6e22e">Self</span>) -> <span style="color:#a6e22e">Self</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">where</span> Self: Sized <span style="color:#f92672">+</span> PartialOrd { <span style="color:#f92672">..</span>. }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>As with typeclasses, indicating that a type has a trait requires
a specific block that says what trait we’re trying to implement,
and lists the implementation of the required methods. Like in Haskell,
that block may reside in the crate where the trait is defined, or the
crate where the type is defined. Like in Haskell, this allows us to
add polymorphism to previously unpolymorphic operations without having
to create wrapper types.</p>
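<p>As a small sketch of that last point, here is a made-up trait, <code>Loudness</code>, implemented for two standard-library types that were written long before the trait existed. This is allowed because the trait is defined in the same crate as the <code>impl</code> blocks; no wrapper types are needed:</p>

```rust
// A hypothetical trait defined in our own crate.
trait Loudness {
    fn shout(&self) -> String;
}

// `i32` and `str` come from the standard library, but because `Loudness`
// is ours, we may implement it for them directly.
impl Loudness for i32 {
    fn shout(&self) -> String {
        format!("{}!", self)
    }
}

impl Loudness for str {
    fn shout(&self) -> String {
        format!("{}!", self.to_uppercase())
    }
}

fn main() {
    println!("{}", 7_i32.shout()); // prints 7!
    println!("{}", "hello".shout()); // prints HELLO!
}
```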
<p>Here is an example of implementing this trait (unfortunately, we
have to implement both <code>Ord</code> and <code>PartialOrd</code>):</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-Rust" data-lang="Rust"><span style="display:flex;"><span><span style="color:#66d9ef">use</span> std::cmp::Ordering;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">#[derive(PartialEq, Eq, Clone, Copy, Debug)]</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">struct</span> <span style="color:#a6e22e">Foo</span>(<span style="color:#66d9ef">u32</span>);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">impl</span> PartialOrd <span style="color:#66d9ef">for</span> Foo {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">partial_cmp</span>(<span style="color:#f92672">&</span>self, other: <span style="color:#66d9ef">&</span><span style="color:#a6e22e">Self</span>) -> Option<span style="color:#f92672"><</span>Ordering<span style="color:#f92672">></span> {
</span></span><span style="display:flex;"><span> other.<span style="color:#ae81ff">0.</span>partial_cmp(<span style="color:#f92672">&</span>self.<span style="color:#ae81ff">0</span>)
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">impl</span> Ord <span style="color:#66d9ef">for</span> Foo {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">cmp</span>(<span style="color:#f92672">&</span>self, other: <span style="color:#66d9ef">&</span><span style="color:#a6e22e">Self</span>) -> <span style="color:#a6e22e">Ordering</span> {
</span></span><span style="display:flex;"><span> other.<span style="color:#ae81ff">0.</span>cmp(<span style="color:#f92672">&</span>self.<span style="color:#ae81ff">0</span>)
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">main</span>() {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> <span style="color:#66d9ef">mut</span> foos <span style="color:#f92672">=</span> vec![Foo(<span style="color:#ae81ff">3</span>), Foo(<span style="color:#ae81ff">4</span>), Foo(<span style="color:#ae81ff">1</span>), Foo(<span style="color:#ae81ff">2</span>)];
</span></span><span style="display:flex;"><span> foos.sort();
</span></span><span style="display:flex;"><span> println!(<span style="color:#e6db74">"</span><span style="color:#e6db74">{:?}</span><span style="color:#e6db74">"</span>, foos); <span style="color:#75715e">// Displays [Foo(4), Foo(3), Foo(2), Foo(1)]
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>}
</span></span></code></pre></div><p>It’s very similar to Haskell, but with “C-like” syntax and aesthetic.
The syntax for the functions using the trait looks like C++ templates:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-Rust" data-lang="Rust"><span style="display:flex;"><span><span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">max</span><span style="color:#f92672"><</span>T: Ord<span style="color:#f92672">></span>(a: <span style="color:#a6e22e">T</span>, b: <span style="color:#a6e22e">T</span>) -> <span style="color:#a6e22e">T</span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> b <span style="color:#f92672">></span> a { b } <span style="color:#66d9ef">else</span> { a }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>What’s different from Haskell is how it’s implemented. The semantics
are quite similar, and the Rust implementation can be thought of
as an optimization of the Haskell semantics. Instead of passing in
to <code>sort()</code> a secret run-time parameter with <code>Foo</code>’s implementation
of <code>Ord</code>, the function is <em>monomorphized</em>. We can think of it as
inlining just that one parameter at compile-time, and generating
a specialized function.</p>
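<p>As a rough sketch of what monomorphization means in practice, the example
below pairs a generic function with a hand-written copy specialized for <code>Foo</code>.
The names <code>largest</code> and <code>largest_foo</code> are made up for illustration, and the
hand-written copy is only a conceptual picture of what the compiler generates, not
its literal output.</p>

```rust
use std::cmp::Ordering;

// A stand-in for the `Foo` type from the example above: ordered in reverse.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
struct Foo(u32);

impl Ord for Foo {
    fn cmp(&self, other: &Self) -> Ordering {
        other.0.cmp(&self.0)
    }
}

impl PartialOrd for Foo {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

// The generic function, written once (hypothetical name).
fn largest<T: Ord + Copy>(items: &[T]) -> Option<T> {
    let mut iter = items.iter().copied();
    let first = iter.next()?;
    Some(iter.fold(first, |best, x| if x > best { x } else { best }))
}

// Conceptually, what the compiler produces for `T = Foo`: the same body,
// with the `Ord` implementation resolved at compile time instead of being
// passed in at run time.
fn largest_foo(items: &[Foo]) -> Option<Foo> {
    let mut iter = items.iter().copied();
    let first = iter.next()?;
    Some(iter.fold(first, |best, x| {
        if Foo::cmp(&x, &best) == Ordering::Greater { x } else { best }
    }))
}

fn main() {
    let foos = [Foo(3), Foo(4), Foo(1), Foo(2)];
    // Both calls agree; with the reversed ordering, Foo(1) is the "largest".
    println!("{:?}", largest(&foos)); // Some(Foo(1))
    println!("{:?}", largest_foo(&foos)); // Some(Foo(1))
    // A second instantiation, this time for `i32`.
    println!("{:?}", largest(&[3, 4, 1, 2])); // Some(4)
}
```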
<p>Yes, this implementation is fundamentally very similar to C++’s
implementation of templates. It’s basically the same in terms of machine
code and resulting optimizations. But the semantics are more
Haskell-like. Polymorphic functions are type-checked once.
They may only use functionality incorporated in the traits at hand.
We don’t postpone the type-checking for the template instantiation.</p>
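<p>A small sketch of what "type-checked once" looks like, using the hypothetical
functions <code>min_of</code> and <code>describe</code>: the generic body may only use operations
promised by its trait bounds, and the compiler rejects anything else at the
definition, whether or not the function is ever called.</p>

```rust
use std::fmt::Debug;

// Compiles: the body uses only comparison, which `Ord` promises.
fn min_of<T: Ord>(a: T, b: T) -> T {
    if a <= b { a } else { b }
}

// This version would be rejected at the definition itself, even if it were
// never called, because `Ord` says nothing about formatting:
//
// fn describe<T: Ord>(x: T) -> String {
//     format!("{:?}", x) // error: `T` doesn't implement `Debug`
// }

// Stating the extra requirement in the bound makes it compile.
fn describe<T: Ord + Debug>(x: T) -> String {
    format!("{:?}", x)
}

fn main() {
    println!("{}", min_of(3, 7)); // 3
    println!("{}", describe(42)); // 42
}
```

<p>In a C++ template without concepts, the equivalent of the commented-out function would
be accepted until someone instantiated it with a type that could not be printed.</p>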
<p>What’s more, the same mechanism is also used for Rust’s run-time
polymorphism, where we can have a type like <code>dyn MyTrait</code> for
some specific traits that are <em>object-safe</em>. These <em>trait object</em>
types are like OOP polymorphic types, in that each value travels with a
pointer to its table of polymorphic functions, but that table pointer is
stored outside the original object. It is a property of the pointer to the
object, not of the object itself, and is implemented with fat pointers.</p>
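<p>The fat-pointer arrangement can be observed from safe code. This is a minimal
sketch with a made-up <code>Shape</code> trait; the size checks reflect how current
compilers lay out trait-object references rather than a documented guarantee of the
language.</p>

```rust
use std::mem::size_of;

// Hypothetical trait and types, for illustration only.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Circle(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.0 * self.0
    }
}

// Run-time polymorphism: each element may be a different concrete type.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    // The object itself carries no hidden table pointer...
    assert_eq!(size_of::<Square>(), size_of::<f64>());
    // ...an ordinary reference is one pointer wide...
    assert_eq!(size_of::<&Square>(), size_of::<usize>());
    // ...and a trait-object reference is two: data pointer plus table pointer.
    assert_eq!(size_of::<&dyn Shape>(), 2 * size_of::<usize>());

    let shapes: Vec<Box<dyn Shape>> = vec![Box::new(Square(2.0)), Box::new(Circle(1.0))];
    println!("{:.2}", total_area(&shapes)); // 7.14
}
```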
<p>Like with any other trait, the trait implementation is separate from
the type definition or the trait definition (though it must live in
the same crate as one of them). Unlike C++, there is one system for
polymorphism that can be used in both run-time and compile-time ways,
with overlap where possible.</p>
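<p>To make both points concrete, here is a sketch with a hypothetical
<code>Describe</code> trait. It is implemented for standard-library types that this code
does not define (allowed because the trait itself is local), and the same trait is
then used once through a generic and once through a trait object.</p>

```rust
// A hypothetical trait defined in "our" crate...
trait Describe {
    fn describe(&self) -> String;
}

// ...implemented for types from the standard library, separately from
// both the trait definition and the type definitions.
impl Describe for i32 {
    fn describe(&self) -> String {
        format!("the integer {}", self)
    }
}

impl Describe for bool {
    fn describe(&self) -> String {
        format!("the boolean {}", self)
    }
}

// Compile-time polymorphism: monomorphized separately for each `T`.
fn describe_static<T: Describe>(x: &T) -> String {
    x.describe()
}

// Run-time polymorphism: one compiled function, dispatching through the
// table carried by the fat pointer.
fn describe_dynamic(x: &dyn Describe) -> String {
    x.describe()
}

fn main() {
    println!("{}", describe_static(&7_i32)); // the integer 7
    println!("{}", describe_dynamic(&true)); // the boolean true

    // A heterogeneous collection needs the run-time form.
    let mixed: [&dyn Describe; 2] = [&1_i32, &false];
    for item in mixed {
        println!("{}", describe_dynamic(item));
    }
}
```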
<h1 id="conclusion">Conclusion</h1>
<p>I hope this shows, if nothing else, that polymorphism itself can take
many forms in many programming languages beyond the OOP variety of it.
The OOP variety is in some senses self-propagating – if you optimize
your language for it as in Java, then it makes sense to use for everything,
even if it’s not what you would choose in a language that has other options.</p>
<p>For many forms of polymorphism, in C++ (for templates), Haskell, and
Rust, no inheritance is necessary. It is simply not built according to
the OOP frame of mind. I personally think Haskell and Rust are doing it
right here, as is perhaps obvious from how I’ve written about it.</p>
<p>I hope to write more about run-time polymorphism in Rust, and how it
differs from the C++ variety, and how you can manually implement other
types of run-time polymorphism if you want. This would be a future
post. But, this is a hobby blog, so no promises on timeline!</p>
A Review of Self-Help as a Genre, and Atomic Habits in Particular
https://www.thecodedmessage.com/posts/atomic-habits/
2024-01-28T00:00:00+00:00
<p>I enjoyed reading <em>Atomic Habits</em>, which was recommended to me by my
therapist. I found this blog post basically finished in my attic folder
while sorting through things, and I found it good enough to post, even though
my records show I read <em>Atomic Habits</em> way back in … October 2022.</p>
<h1 id="self-help-in-general">Self-Help in General</h1>
<p><em>Atomic Habits</em> is pretty fundamentally a “self-help book.” This is a
pretty controversial genre in my experience. Some people roll their eyes
at self-help books in general – I once even read an “anti-self help”
book that basically did so for the entire length of a book. Others
swear by them – literally, I had a friend once who said <em>The Subtle
Art of Not Giving a Fuck</em> was her Bible and who used it as such for an
(informal but serious) oath. I’m generally somewhere in the middle of
these two extremes. I read them with solidly middling expectations.</p>
<p>My attitude flavors how I read self-help books. So, before I talk about
<em>Atomic Habits</em> in particular, I want to talk some about self-help books
in general, and my take on them.</p>
<p>I’ll start with their problems, because, as a genre, they sure do have
their problems.</p>
<h2 id="problem-1-length">Problem #1: Length</h2>
<p>They are longer than they need to be, stretched thin by the lengthening.
<em>Atomic Habits</em> clocks in (in my copy) at 264 pages, with an estimated
word count of 80,000 words. However, it provided the insight of a
long blog post, maybe at around 20,000 words at most. And I only give
it credit for that long a blog post because this was a particularly
useful self-help book, which also mitigated its length using re-caps
and summaries arranged in helpful chart forms.</p>
<p>I do understand why publishers do this: they want to publish books, not
glorified pamphlets. I also understand that self-help is far from the
only genre with this problem: it plagues all forms of popular non-fiction.</p>
<p>But it’s still annoying.</p>
<h2 id="problem-2-wildly-varying-standards">Problem #2: Wildly varying standards</h2>
<p>The bigger problem with self-help books is that they vary widely in
quality, not just in terms of evidence for their claims (that if you
follow their advice you will get the results they say you will),
but more importantly in terms of moral quality. Some of them have
questionable values, or brazenly teach you how to be manipulative or
otherwise unkind to other people.</p>
<p>This is partially because it’s such a subjective topic. There are no
unified standards on wisdom, no certifying authority. Some books
are written by expert psychologists and psychiatrists, but those are
often framed around specific disorders, and their authors aren’t always
compelling writers. Others are factual and even science-based, but have goals
that are repugnant to many or even most people. Still others are just
snake-oil or feel-good.</p>
<h2 id="problem-3-they-can-turn-into-religions">Problem #3: They can turn into religions</h2>
<p>One specific pattern of problem is endemic to the genre: over-enamored
with their own importance, they try to provide a comprehensive life
framework to the reader, the “one cool trick” that will fix everything that
ails you. This can lead to the situation where people can buy into it so
hard they idolize it and treat it like a religious text. Simultaneously,
others reject it as overbearing and boundary-crossing while cringing at
such people.</p>
<p>This is kind of easy for self-help writers to do, even by accident:
The nature of the topic makes any life advice in scope, and authors
as living humans generally have some sort of opinions on how to handle
any sort of life situation, that they may already organize internally
into an all-encompassing framework. The nature of writing also requires
organizing those opinions into general principles (often over-general).
And in order to be effective, you have to persuade the reader.</p>
<p>All together, this can lead to over-stating your case for a simplistic
framework. With this One Simple Trick™, with this simple overriding
principle, you can transform your entire life. On such premises religions
are built.</p>
<h2 id="how-i-read-them">How I read them</h2>
<p>As a result of these problems, I tend to take the scope and claims of a
self-help book with a grain of salt. I don’t expect it to transform my
life, or revolutionize me. I don’t trust it, even temporarily, to tell
me how to think about or organize the ideas it presents. I don’t read it
for the overarching framework at all; instead, I just sift through it for
individual useful take-aways, discarding the vast majority of it (even
the majority of the 1/5 of it that isn’t fluff to make it book-length)
as either things I already know or else already know enough to disagree
with. I then can integrate these individual ideas into my own framework
and values.</p>
<p>So my experience goes something like this:</p>
<blockquote>
<p>Huh, that’s an interesting fluff story. Cool, I
see the point you made, but I knew that already. Decent story to back
it up though! Nice phrasing too, but I’ll forget that tomorrow…
Yeah, that page just reiterates that point, wow this could have
been a blog post…</p>
<p>Yeah, I can see the organizational structure you’re using to tie
it together with the framework for your book. It’s not that useful
to me as a life framework, but I’ll treat it as a framework that
holds the book together.</p>
<p>Wait! Aha! There we have a new idea! I’ll take it!</p>
</blockquote>
<p>You might think that if I’m so cynical about self-help books, and think
they have a low information density, then I’d be very disinclined to read
them. But I do actually read them from time to time, especially
when someone recommends them (<em>Atomic Habits</em>, <em>The Subtle Art of Not
Giving a Fuck</em>), or when they’re relevant to me (<em>Taking Charge of Adult
ADHD</em>, which is basically a self-help book but for a specific class
of people to which I belong).</p>
<p>And I’m usually happy I did it. Even though all I get out of it are
maybe a handful of ideas I can take with me, those ideas are sometimes
really good. Some ideas I lean on a lot I’ve gotten from some self-help
book or another – and those are just the ones I’m aware of.</p>
<p>And rehashing concepts I already agreed with, or defending my mind
against concepts I disagree with, is also a useful exercise. I
generally believe in reflection on values and approaches to life.
I wouldn’t go so far as to say an unexamined life is not worth living,
but I tend to think that detailed critical thinking is a net positive,
and is not usually driven by anxiety-based “overthinking.”</p>
<p>And I mean, I’ve reflected a lot on my value of reflection and done a lot
of examination on my value of self-examination, and it generally holds up.
Why wouldn’t I want a guided version of that, to get outside the limitations
of my own way of thinking and those of my close friends?</p>
<p>And all in all, I need the excess wisdom. I think we live in a society
with a bit of a wisdom crisis. We don’t have a lot of traditionalism
going on, and to the extent that we do, we live in a different world than
even our parents, let alone the worlds of our various scriptures. Humans
need guidance. There’s a reason self-help books sell. There’s a reason
why sometimes people turn them into religions in their heads.</p>
<h1 id="atomic-habits-in-particular">Atomic Habits in Particular</h1>
<p>Now that I’m done waxing philosophical (for now), the natural
question is, what did I get out of <em>Atomic Habits</em>?</p>
<p>Even though it was much longer than it needed to be, I’m not inclined
to summarize it. I don’t even remember everything that’s in there
– most of it I already knew from reading the older <a href="https://charlesduhigg.com/the-power-of-habit/">The Power
of Habit</a>, which
this book admits to spending a lot of time rehashing, while adding
relatively straightforward and obvious applications.
So I’ll leave summarizing to <a href="https://www.101planners.com/atomic-habits/">another article by someone
else</a> who has done a great
job and whose article is around the length the original book should have
ideally been.</p>
<p>Instead, I’ll say that I enjoyed the review of the material from that
other book, was inspired an appropriate amount, and even got a handful of
take-aways that will stay with me in my internal pile of “wise thoughts.”
Rather than a summary of what the book has to give, you’ll get a list of
what I have taken from it.</p>
<h2 id="my-take-aways">My Take-Aways</h2>
<ul>
<li>You don’t change habits by setting goals, you achieve goals by changing
habits.</li>
</ul>
<p>This one will really stick with me, because they had just told stories
about sports. Every sports team has the same goal: win the tournament.
It’s just that they have different habits to get there. So obviously
setting goals by itself isn’t good enough.</p>
<p>And while this has some overlap with things I already knew (like the
idea of setting SMART goals), it did rub the point in in a different enough
way that I felt it was worth adding to my list. If you practice
in effective ways, you will get good enough to do X. You don’t
have to think about that goal, and in fact, it’s probably better
if you focus on enjoying the process.</p>
<ul>
<li>Related to the previous: Your habits are set based on the type of
person you are. So, instead of thinking about goals or habits directly,
consider thinking about what type of person you’re trying to be,
and what they would do.</li>
</ul>
<p>This is a bit more complicated, but it makes sense. Like an evangelical with
a “What would Jesus do?” bracelet, think about the type of person you’re
trying to be, and do what they’d do. In addition to enabling you to be
more moral by emulating religious moral authorities, this also overlaps
with advice on how to be less impulsive from ADHD advice books I’ve read.</p>
<p>It makes sense that this would be able to generalize to narrower
questions like “what would a good writer do?” or “what would a habitual musician
do?”, but I really hadn’t thought about it before.</p>
<p>It also, speaking of Christianity, reminded me of a drawing I saw once
in a book about the Lutheran confessions. It had a tree, and the root of
the tree was <em>Glaub[e]</em>, or <em>faith</em>, and the branch of the tree was <em>Lieb[e]</em>,
or <em>love</em>, and the crown of the tree was <em>Werk</em> or [good] work[s]. The
message was that your values influenced your feelings and attachments,
which influenced your behavior. Focusing on doing good directly was not
the right approach, but rather to focus on what you believed in a core
way.</p>
<p>If anyone can find this drawing, by the way, please let me know!</p>
<h2 id="conclusions">Conclusions</h2>
<p>It was a decent book, for a self-help book. If you’re particularly
struggling with habits, or goal-setting, or trying to motivate yourself,
it might be useful to help you deconstruct where you’re going wrong.
It’s also useful background for understanding human nature a little
better, especially if you’ve never thought about these issues
in detail.</p>
Minor News: Some Repos on GitHub
https://www.thecodedmessage.com/posts/minor-news/
2024-01-21T00:00:00+00:00
<p>So, there are now two additional repos of my code on GitHub that recently
got published, both under the MIT license. Neither is any show-stopping
major project, but I figured I’d let everyone know nevertheless, and
write up a few notes about it. Both have been added to my
<a href="https://www.thecodedmessage.com/programming-portfolio/">programming portfolio</a> <a href="https://www.thecodedmessage.com/gardens/">garden</a>.</p>
<h1 id="repo-1-crate-version-of-prefix-ranges">Repo #1: Crate Version of Prefix Ranges</h1>
<p>Arvid Norlander (<a href="https://vorpal.se/">blog</a>,
<a href="https://github.com/VorpalBlade">GitHub</a>) reached out to me to ask if
I wanted to publish my little <a href="https://www.thecodedmessage.com/range.rs">Rust module</a> from my post on
<a href="https://www.thecodedmessage.com/posts/prefix-ranges/">prefix ranges</a> as a crate, or, failing that,
if I could license it as open source so he could publish it. I had
thought of most of my code on this blog up until this point as example
code not worth licensing, but his prompting changed my mind. If it’s
just trivial example code, it’s not worth <strong>not</strong> open sourcing, so
I might as well release the website’s example code under an <a href="https://www.thecodedmessage.com/license/">MIT
license</a>.</p>
<p>This particular piece of code seems like the wrong end solution to
the problem at hand – though it is the solution I ended up using
when faced with the problem in a larger project. Ideally, I would
like to write a follow-up piece to the <a href="https://www.thecodedmessage.com/posts/prefix-ranges/">prefix range</a>
article, discussing how to fix <code>BTreeMap</code> to generalize not just
to splitting on various keys based on their ordering properties, but
based on any appropriate function that acts as a range (i.e. that
monotonically transitions from <code>false</code> to <code>true</code> when looping over items in
sorted order by the <code>Ord</code> trait), as a generalization of
<a href="https://doc.rust-lang.org/nightly/std/ops/enum.Bound.html"><code>Bound</code></a>.
Then, prefixes could be represented in terms of such a function, and
we could leverage the full efficiency of a <code>BTreeMap</code> without having
to do any extra UTF-8-mongering.</p>
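<p>The monotone-predicate idea can be sketched on a sorted slice, where the standard
library already offers <code>partition_point</code>. This is only an illustration of the
concept under that assumption: <code>BTreeMap</code> exposes no such API today, and the
function name <code>prefix_range</code> here is made up. Note that <code>partition_point</code>
expects a predicate that goes from <code>true</code> to <code>false</code>, the mirror image of the
description above.</p>

```rust
/// Returns the contiguous run of `sorted` whose items start with `prefix`.
/// Assumes `sorted` is in ascending order by `Ord` (byte-wise for strings).
fn prefix_range<'a, 'b>(sorted: &'a [&'b str], prefix: &str) -> &'a [&'b str] {
    // Monotone predicate 1: true for every item strictly before the range.
    let start = sorted.partition_point(|s| *s < prefix);
    // Monotone predicate 2: true for every item before or inside the range.
    // No successor string has to be computed for the upper bound.
    let end = sorted.partition_point(|s| *s < prefix || s.starts_with(prefix));
    &sorted[start..end]
}

fn main() {
    let mut words = vec!["bank", "apple", "bandana", "cat", "band", "banana"];
    words.sort();
    println!("{:?}", prefix_range(&words, "ban"));
    // ["banana", "band", "bandana", "bank"]
}
```

<p>Both predicates are monotone over sorted input because every string with a given
prefix sorts at or after the prefix itself and before any larger string that lacks it,
so each binary search is valid.</p>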
<p>But fully implementing such a thing would mean patching the standard library,
and fully writing that blog post would mean a lot of benchmarking
work. I still plan on doing it someday, but as I point out many
times, this is a hobby blog (although I do now support <a href="https://www.buymeacoffee.com/thecodedmessage">buying me a
coffee</a>, that is meant in
the true spirit of buying me an extra beverage as a token of thanks. At
the time of this writing no one has clicked it, and I certainly expect
no more than occasional literal coffees to come of any money from it),
and so follow-up posts will happen when they happen (although nagging
me about it, nicely, over e-mail is allowed).</p>
<h1 id="repo-2-texas-hold-em-libraryquiz-app">Repo #2: Texas Hold-Em Library/Quiz App</h1>
<p>I’ve been writing some code to do with the most popular
modern poker variant, Texas Hold-Em. It lives in a <a href="https://github.com/jhartzell42/holdem_rs">repo on
GitHub</a>. Ideally, it’ll turn
into an app to help me and some buddies practice reading flops, counting
outs, seeing who’s ahead, and doing other hold-em mental calculations. I
might also extract a library or even a framework for writing AIs, or
playing against them. Maybe even a front-end app could be added,
either in Rust or in <a href="https://reflex-frp.org/">Reflex</a> in Haskell.</p>
<p>But no promises! See the hobby blog note above! If you really want a
feature, I’ll happily accept PRs!</p>
<p>Of course, this wouldn’t be the first such codebase, or even the first
in Rust. I’m just having fun. I enjoyed writing the code so far, and I
figured I’d put it on GitHub in the meantime, even if it never becomes
particularly useful.</p>
<p>Writing it with all its combinatoric randomness made me really learn to
appreciate <a href="https://crates.io/crates/itertools"><code>itertools</code></a>, a collection
of iterator methods that for various reasons haven’t been accepted or
stabilized in the standard library. It’s been good exercise writing in
functional programming, iterator and iterator-transformer style, which
is a little harder in Rust than in Haskell.</p>
<p>Also, while I understand why Rust doesn’t have generators (there is
an excellent <a href="https://without.boats/tags/generators/">blog series</a>
about the topic on <a href="https://without.boats/">“Without Boats”</a>), many of
the reasons are historical and, well, I just really wish it did.</p>
<p>Additional future exploration might include zany optimizations, perhaps
inspired by (but not directly following in the footsteps of) <a href="http://suffe.cool/poker/evaluator.html">this zany hand
evaluation algorithm</a>
implemented in Rust in many places, including
<a href="https://github.com/b-inary/holdem-hand-evaluator/blob/main/src/hand.rs#L112">here</a>
by <a href="https://github.com/b-inary/">Wataru Inariba</a> – although regular
optimizations probably come first.</p>
Review: One Billion Americans, by Matthew Yglesias
https://www.thecodedmessage.com/posts/billion-americans/
2024-01-08T00:00:00+00:00
<p>This was a great read about how the United States should reframe
many of its basic political assumptions.</p>
<p>It is tempting to think of life as a zero-sum game. Having more for me,
even enough for me, means less or even not enough for others. Usually,
we have the open-mindedness to feel like we can cooperate with some few
– our family, our community, or perhaps our nation or religion or even
(problematically) our ethnic group. But at a certain scale, there is
a sense that there’s not enough to go around to all the people who
might want it.</p>
<p>This shows up on the right and the left. For the right, our country is
“full,” any immigrants a threat to scarce resources and jobs. For the
left, it is the world that is seen as full: more people necessarily
is seen to mean more environmental damage.</p>
<p>In his book <em>One Billion Americans</em>, Matt Yglesias addresses both
arguments, and addresses them thoroughly. In summary: Our country
is not resource constrained, but constrained by willingness to use
well-established urban planning and transit technology that exists
throughout the world. The way out of environmental damage and climate
change is not asceticism or population restriction, but technology.</p>
<p>By focusing on the provocative premise of an America with three
times the population, both by increased birth rate (a scandal to
liberals) and increased immigration (a scandal to conservatives),
Matt Yglesias creates a framework he can jump off of to explore
a variety of issues. To accomplish this audacious goal, many problems
would have to be fixed in American society, politics, and economics,
for the most part problems that we will have anyway, and that we will
have to fix anyway, whether or not we have in mind the goal of tripling
our population.</p>
<p>As a result, the book covers a variety of seemingly disjoint topics, from
childcare and education to immigration to transit and urban planning.
It therefore avoided the problem a lot of non-fiction books have:
I genuinely feel this book is the correct length. Unlike many similar
books, it could not have just been a blog post, but rather it would have
been a blog series, that is to say, a full-length book.</p>
<p>All in all, a great read, and I am grateful to the friend who gave it
to me this year for my birthday. I generally agree with the positions in it,
and it provoked a lot of good thought.</p>
Is Section 3 of the 14th Amendment Undemocratic?
https://www.thecodedmessage.com/posts/14-amendment/
2023-12-26T00:00:00+00:00
<p>US politics continue to be interesting.</p>
<p>As many of you know, the Colorado Supreme Court has recently
<a href="https://www.courts.state.co.us/userfiles/file/Court_Probation/Supreme_Court/Opinions/2023/23SA300.pdf">ruled</a>
that Donald Trump should be struck from the ballot in Colorado. Under
<a href="#appendix-text-of-the-section">Section 3</a> of the 14th Amendment to the
US Constitution, if you’ve sworn to support the Constitution, and then
engaged in (or “given aid or comfort to”) an insurrection, you are no
longer eligible to serve in office. The Colorado Supreme Court applied
this law to Trump, citing the Capitol attack of <a href="https://en.wikipedia.org/wiki/January_6_United_States_Capitol_attack">January 6,
2021</a>.</p>
<p>This may be the first official ruling to agree that Trump is
disqualified, but the theory has been discussed since the events
of January 6, 2021. The theory gained more serious attention and
respectability when it was endorsed by conservative legal scholars
William Baude and Michael Stokes Paulsen in an <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4532751#">explosive law review
article</a>,
and now, it has finally manifested as an official decision in this
ruling.</p>
<h1 id="opinions-about-the-opinion">Opinions about the Opinion</h1>
<p>There are a lot of criticisms of the ruling. As is often the case with
complicated political issues, there aren’t just two “sides,” but a
grab-bag of more nuanced opinions and observations.</p>
<p>Some people have procedural nitpicks, claiming that
state courts don’t have the jurisdiction to evaluate
such issues, or that Trump would have to actually be
convicted of a crime for the section to apply, perhaps <a href="https://uscode.house.gov/view.xhtml?req=granuleid:USC-1999-title18-section2383&num=0&edition=1999">the crime called
insurrection</a>.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<p>Others don’t think the riots on January 6 qualify as an insurrection
at all, or they think that Trump didn’t “engage in” it, or they think
that his participation is protected under First Amendment free speech
rights<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>. Still others simply think that Trump should win even if an
insurrectionist, or that he was the legitimate President-Elect in 2021,
or otherwise hold brazenly anti-Democratic views, both in the sense of
hating the Democratic Party and in the sense of hating democracy itself.</p>
<p>But the criticism that I find most interesting is the criticism that
the court’s decision is undemocratic. We ought to let Trump run, this
criticism goes, because in a democracy, it is better for the voters to
decide that someone should not be President, rather than a court. Some
of the people who think this also fall into another category: they think
this decision is undemocratic and they think January 6 doesn’t reach the
standard of the 14th Amendment, or hasn’t been adequately proven to, and
that I find less interesting. But some people simply think that even if
January 6 was an insurrection, and Trump engaged in it, he should still
be on the ballot, and would still legally become President if elected.</p>
<p>That is to say, there are a large number of people who are uncomfortable
not with the specifics of this ruling but rather with the fundamental
premise of Section 3 of the 14th Amendment. These people might believe
that the law has evolved away from what it says, or that it requires
implementing legislation. Or, these people might believe that we simply
should ignore this constitutional provision, or that we should never
have added it to the constitution. In any case, these people believe that
even a violent insurrection against the government, by a person who had
specifically sworn not to do that, should not be a disqualification to
run for office, or at least, to the extent that it is one, it should be
a qualification decided on by the voters.</p>
<h1 id="narrowing-the-question">Narrowing the Question</h1>
<p>In this post, I will not try to evaluate whether the events of January
6 count as an “insurrection.” I will not try to figure out whether the
way the Colorado Court proceeded was legally correct, nor whether the
Supreme Court will overturn it, nor whether it’s a good strategy for
defeating Trump or will instead backfire.</p>
<p>Instead, I will think about the underlying theoretical question as if
it were not so relevant to today’s news:</p>
<blockquote>
<p>Is it undemocratic to disqualify from elections those who have participated
in an insurrection?</p>
</blockquote>
<p>To help separate this abstract question from the current news, let’s not
imagine that this is about Trump. Let’s make up a new scenario in our
minds, and let’s imagine instead that the insurrection in question was
the Civil War – or some other insurrection that you, as a reader, can
feel comfortable wholeheartedly opposing, in favor of explicit Communism
or Nazism or racism or whatever other ideology most gets your goat.</p>
<p>Let’s further imagine that the candidate openly admits that they, in
fact, did engage in insurrection. In fact, not only do they admit that
the insurrection was an insurrection, but they say it was a justified
one. They admit – or rather, they proudly announce – they’d do it
again. They certainly won’t rule out doing it again if they lose.</p>
<p>But of course, you are not of the opinion that the insurrection was a
justified one – you don’t want this person to win at all. And they’re
running for President!</p>
<p>So here’s the question: Should this person be allowed to run or not? If you
were designing a constitution for your dream country, would you allow
the courts or some other mechanism to stop this person from running,
or would you hope they simply lost at the polls?</p>
<h1 id="all-qualifications-are-undemocratic">All Qualifications Are Undemocratic</h1>
<p>Any disqualification from office, of course, is undemocratic in a sense.
A democratic election for President, in a pure sense, means that whoever
gets the most votes for President must win. And if someone who people want
to vote for isn’t an available option, well, that makes the election
undemocratic.</p>
<p>Everything that detracts from the idealized, pure form of an election
takes our country away from being a democracy. Everything from the electoral
college, to the two-term Presidential term limit, to even the restriction
that Presidential candidates must be over 35 and natural-born citizens,
can prevent the people from perfectly exercising their will through a
Presidential election.</p>
<p>However, very few of the people calling this recent ruling “undemocratic”
have any problem with preventing 30 year olds, or foreign-born Americans,
from running for President. Even though this sort of discrimination
based on age or national origin would be severely frowned upon in
hiring<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>, we have simply gotten used to it. Perhaps, perhaps,
one could argue that it is a better show of democracy’s power, a more
rational system, to simply allow teenage Presidential candidates to be
disqualified not by law but by the people’s collective decision, but no
one in practice is interested in changing it. We’re simply used to it.</p>
<p>Ironically, these other qualifications, in my mind, strike me as more
undemocratic. If everyone 30 and under had just started being oppressed,
what President could genuinely sympathize? Is the natural-born citizen
requirement today kept because of concern about national loyalties, or
out of racism?</p>
<p>At least excluding insurrectionists has a logical pro-democratic angle.
Committing an insurrection against our elected government is fundamentally
anti-democratic, and so excluding those who have done so is a move to
protect democracy. Unlike requiring people to be 35 or natural-born
citizens, this provision has a claim to protecting democracy at the
same time as it undermines it.</p>
<p>In other words, even though the 14th Amendment <strong>directly</strong> hurts
democracy, by limiting who people can vote for, it <strong>indirectly</strong>
protects it, by keeping people out of power who might do an insurrection
and overthrow democracy. It’s a trade-off: By making this election
slightly less democratic, it protects all future elections’ existence
against the possible insurrectionist’s dictatorship.</p>
<p>The question is then how to evaluate this trade-off.</p>
<h1 id="the-problem-of-cheaters">The Problem of Cheaters</h1>
<p>If you want to find out who the fastest person is, have a race.<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>
If you want to find out who the best person is at chess, have a chess
tournament. And if you want to find out who the most-supported<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup>
person is for President, have an election.</p>
<p>Even if there is no cheating, this can be irregular and unreliable.
Sometimes, a person is tired, or ate something that disagrees
with them, and that makes them slow of foot. Sometimes, a person has a
brain fart and blunders at chess. And sometimes in a modern media-fuelled
election, someone makes a random gaffe that loses them the election,
but not their long-term genuine support, or the election happens right
on the wrong news cycle, or their support is concentrated in the wrong
specific states or demographics.</p>
<p>But cheating is a deeper threat. If the goal of a
race is to find out who is the fastest runner without
taking steroids, or (for example) <a href="https://www.smithsonianmag.com/history/the-1904-olympic-marathon-may-have-been-the-strangest-ever-14910747/">hitching a ride for part of the
race</a>,
then if someone does that, they may win even if they aren’t the fastest
runner. If something isn’t done to prevent people from using steroids,
then everyone will have to use drugs just to have a chance, which totally
goes against the goal of finding out who the fastest runner is without
taking steroids.</p>
<p>So, cheating is a threat to the very concept of a race. How do we stop
cheating? Punishment is a viable option, making the behavior of cheating
have bad consequences as a deterrent. The punishment of disqualification
– from not just the race in which they cheated, but also future races –
goes beyond deterrence, however. Not only does it increase the negative
consequences of cheating (and therefore discourage it), it also decreases
the likelihood that someone wins by cheating.</p>
<p>It is true that a one-time cheater might legitimately also be the fastest
person, and win a future race legitimately. But it’s also true that if they
win a future race, they may well have done so by cheating. They’ve proven themselves willing
to cheat. So, if we want to find out who the fastest non-cheater is,
excluding past cheaters is a great way to prevent present cheating.</p>
<p>So, excluding past cheaters from a race can actually make the race more
fair. Even though the past cheater might be legitimately the fastest
person, they should be disqualified, because if they win, what confidence
do we have that they won fairly?</p>
<h1 id="democracy-as-a-peaceful-replacement-for-war">Democracy as a Peaceful Replacement for War</h1>
<p>The analogue to democracy is this: Someone willing to do an insurrection
is likely to be unwilling to give up power in a peaceful manner, likely
to use any power they gain to cheat on future elections, either by
influencing them or by simply refusing to acknowledge and act on the
results. This is true in general, and it is especially true if the
original insurrection directly involves not accepting election results.</p>
<p>After all, the whole point of having a democracy is that we decide who’s
in charge based on who has the most support, rather than by having a war
about it every time. Everyone agrees that fighting with votes is better
than fighting with guns, and as a result, we can have changes in government
without mass death and destruction.</p>
<p>This is especially important given the violence and destruction of modern
warfare. It is not a coincidence that World War I, far more deadly than
any other war that Europe had experienced, also spelled the end of large
absolute monarchies in Europe. Warfare has such an unacceptable cost that
we’ve all collectively decided we’d rather risk our political enemies
winning an election as in a democracy, than have a war every time we need
the government to change, which is how monarchy often works in practice.</p>
<p>So, engaging in insurrection is even more undemocratic than other
types of election cheating. The principle of democracy is vastly more
important than any individual person or party winning, because the
alternative is war and therefore mass death. If a candidate doesn’t
agree, then that candidate is intrinsically undemocratic, to the point
where excluding them is more democratic than allowing them to run.</p>
<h1 id="the-downsides-of-the-ban">The Downsides of the Ban</h1>
<p>I know that people might disagree with these arguments. Banning
insurrectionists from running has cons as well as pros. Who shall
determine who has committed an insurrection? Will an anti-insurrection
provision be abused for political purposes dishonestly, where something
that is not an insurrection is called one for political gain?</p>
<p>Hopefully, the law would specify the procedures for this disqualification,
and indicate who gets to decide. Hopefully, it would choose someone with
enough distance from the political process to actually implement it.</p>
<p>But perhaps that isn’t enough. Perhaps the only way to have a democracy
is for everyone to be eligible, and for everyone to be seen to
be eligible. Everything more complicated is up for misinterpretation,
and stokes distrust. Provisions written on paper do not necessarily
accomplish their obvious goals. No amount of clarity of rules can
counteract a dishonest referee, or convince a partisan that the referee
is actually honest.</p>
<h1 id="rule-of-law-and-the-united-states-in-particular">Rule of Law and the United States in Particular</h1>
<p>So, which decision do we make here? Do we have a system for disqualifying
insurrectionists, or not? More important than either decision is having
a rule for it ahead of time. As a democracy, the way to determine this
should be the same as any other determination we make about constitutional
decisions. The rule of law is an important principle, so everyone
knows what the rules are ahead of time (and knows what referees will be
evaluating them). And so, when an insurrection happens, we should ideally
follow the law to determine what to do, rather than having an <em>ad hoc</em>
discussion then to determine how to handle the situation.</p>
<p>And now, I return from the abstract question to the particulars of the
recent decision. The United States has already made this determination,
in the 14th Amendment to the Constitution. Unfortunately, it is unclear
how it is to be enforced; Congress has defaulted on its duty in Section 5
to “enforce, by appropriate legislation, the provisions of this article.”
So, what is clear, is that insurrectionists (who have previously sworn
oaths of office) are ineligible for office. What is unclear is the
details of how this is accomplished.</p>
<p>To me, this means that the question of whether to ban insurrectionists
from running for office is already decided. It is not undemocratic
to do so, as it is a policy that has pros and cons for a democracy,
but the rule of law breaks the tie and so we should follow the 14th
Amendment. Questions remain about the details, but those, in our system,
are what the courts are for.</p>
<p>So. I do understand (and disagree with) accusations that the Colorado
Supreme Court decision is politically motivated. I also sympathize with
the claim that Trump should be charged with the crime of insurrection in
order for this case to qualify – perhaps that would be a more fair way to
determine whether Trump’s behavior on January 6 qualifies as engaging in
insurrection. But I thoroughly disagree with those who claim the decision
is “undemocratic.” While there are ways in which the 14th Amendment
is undemocratic, there are also ways in which the opposite policy is
undemocratic. We have already, democratically, made the decision that
this disqualification is part of the rules to our democracy. It is too
late, for this case, to reconsider now whether that decision was wise.</p>
<h1 id="appendix-text-of-the-section">Appendix: Text of the Section</h1>
<blockquote>
<p>No person shall be a Senator or Representative in Congress, or elector
of President and Vice-President, or hold any office, civil or military,
under the United States, or under any State, who, having previously taken
an oath, as a member of Congress, or as an officer of the United States,
or as a member of any State legislature, or as an executive or judicial
officer of any State, to support the Constitution of the United States,
shall have engaged in insurrection or rebellion against the same, or
given aid or comfort to the enemies thereof. But Congress may by a vote
of two-thirds of each House, remove such disability.</p>
<ul>
<li>14th Amendment to the US Constitution, Section 3</li>
</ul>
</blockquote>
<h1 id="appendix-footnotes">Appendix: Footnotes</h1>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Of these, I think the most valid nitpick is
that Trump should have to be convicted of the actual federal crime of
<a href="https://uscode.house.gov/view.xhtml?req=granuleid:USC-1999-title18-section2383&num=0&edition=1999">insurrection</a>
for the amendment to be triggered, as that counts as the Congressional
implementation of the amendment under Section 5. The least valid nitpick,
in my view, is the zany notion that the Presidency is not an “office under
the United States” or that the President does not swear to “support the
Constitution” because he swears instead to “preserve, protect and defend”
the Constitution. Laws are read as documents in natural languages like
English, rather than read as computer programs.
None of this matters in the slightest to the larger arc of this
blog post, which asks, legal technicalities aside, whether the whole
idea is fundamentally undemocratic. <a href="#fnref:1" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>
<li id="fn:2">
<p>I feel obligated at this point to point out that since the 14th
Amendment comes after the 1st Amendment, technically, free speech might not
apply to its provisions, as the 14th Amendment is more recent and
therefore can override the 1st Amendment. <a href="#fnref:2" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>
<li id="fn:3">
<p>Not to mention, they would be illegal under various discrimination
laws, though in fairness these notions came after the Constitution was
written. <a href="#fnref:3" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>
<li id="fn:4">
<p>Though <a href="https://biblehub.com/ecclesiastes/9-11.htm">famously fallible</a>,
a race is still the best way to find out who is fastest. The swift won’t
always win, because of time and chance, but they will more often than not. <a href="#fnref:4" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>
<li id="fn:5">
<p>“Well-supported” in this sense means with some weighting
given to total popularity, and some weighting given to being able
to dominate in certain states. This is to say, we can mathematically
define “well-supported” so as to make the electoral college make sense.
Or we could not do that, and say the electoral college is undemocratic,
which is a fair position. <a href="#fnref:5" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>
</ol>
</div>
2023 in Retrospective and 2024 in Prospective
https://www.thecodedmessage.com/posts/2024-thoughts/
2023-12-23T00:00:00+00:00
<p>Another year has gone by<br>
And in response, I simply sigh<br>
Another year has taken place<br>
I guess I’ll handle it with grace?<br>
Another year, the same old grind…<br>
And yet I feel I’ve fallen behind</p>
<p>As you might know if you’ve read my equivalent post
from <a href="https://www.thecodedmessage.com/posts/review-2022/">last year</a>, I am now 35 years old
(and 3 days). If we consider “working years” to range from 20 to
65 – which seems a decent definition – then I am 1/3 of the way
through them, 1/3 of the way through my career. So, theoretically,
we should see <a href="https://www.thecodedmessage.com/resume/">my résumé</a> at least triple in impressiveness
by the time I retire!</p>
<p>Thinking about this year as 1/3 of the way to retirement is definitely
less depressing and existentially terrifying than thinking of 35 as
half-way to 70. I think it’s also more realistic. The type of processes
I was undergoing from 0-20, the type of growth, the type of tasks, the
level of (lack of) freedom, is so different, overall, from my adult life.
Of course, 17 might be a better cut-off year, because that’s when I left
home and went to college, but that kind of takes the spin out of 1/3, so
I’ll keep it in terms of 20. And besides, 1/3 of the way through my career
seems appropriate for this blog, as much as I talk about programming
on it!</p>
<p>Like <a href="https://www.thecodedmessage.com/posts/review-2022/">last year</a>, I’d like to reflect on the previous
year. I don’t have such a laundry list of achievements as I mentioned
in that previous post, which is fair: I didn’t rebuild a life (kind of)
from scratch with a different town to live in, housing situation, and
medication (and therefore brain structure).</p>
<p>And indeed, that wasn’t my goal. Unlike 2022 where my theme was
<strong>rebuilding</strong>, my theme for 2023 was <strong>growth</strong>. By “growth,” I meant an
active settling in, a deepening or intensification of the new life I’d
built. And I think I managed that. I spent time settling into the
new life, getting more used to it, getting closer to the people around
me, and solidifying it.</p>
<p>As for the blog, it’s not really growing, which is sad, but it is
approximately holding steady, which is good. 34 posts this year
(counting this one) compared to 37 last year isn’t too bad:</p>
<pre tabindex="0"><code>$ ls | grep ^2 | cut -f1 -d- | uniq -c # Count posts per year
3 2017
1 2018
17 2019
5 2020
3 2021
37 2022
33 2023
</code></pre><p>Alas, I have not transitioned my blog from mostly polemic to mostly
educational. My <a href="https://www.thecodedmessage.com/posts/oop-3-inheritance">most recent technical post</a>,
instead, very controversially criticized a well-established mechanism
for organizing software complexity. But! I’ve also not let it fade away,
in spite of having had a few curveballs thrown at me this year. And in the
meantime, I’ve also done substantially more writing outside of the blog,
which is not publicly available.</p>
<p>My goals for the blog remain approximately the same as last year.
I’d like to do more educational content. I’d like to write more
non-technical stuff. I have to say, the polemic technical content
gets views and reactions and spark. That’s hard to beat, and the
effort I put in explaining things for more educational content often
gets the reaction of “yep, checks out, makes sense.” Perhaps I can find
a decent balance somewhere – or find a way to keep the educational content
more interesting. If at first you don’t succeed, as they say, try, try
again.</p>
<p>In the past year I did get a new job, working for
<a href="https://www.amtrak.com/home.html">Amtrak</a>. Several of my friends also
got new jobs, two of them specifically becoming teachers. Job
transitions are a <em>lot</em>, I can say from inside of one. This came along with
(in my case) a transition from working from home to hybrid, and a commute
that includes a driving component (for the first time in my life!), so that
was a lot.</p>
<p>In my next year, I know what my theme will be, but I’m not entirely
sure what the best word for it is. It will have something to do with
being balanced about how I spend my time, and intentional about how
I spend my emotional resources. <strong>Prioritized</strong> or <strong>focused</strong> might
be it, but not “focused” on productivity or “prioritizing” my work and
chores correctly, but a bit more general than that. Definitely, it has
to do with being intentional about the most precious resources I have:
my evenings and my weekends, so as to make sure I can connect with
the people I care most about while also building in the types of
activities I need to do and maintaining the parts of my life that
need active maintenance.</p>
<p>I suppose the one-word theme will be this: <strong>balance</strong>. I will try to
keep balanced, intentional, well-considered and well-prioritized about
how I spend my time and emotional energy, rather than just dancing from
plan to plan and idea to idea as they arise, and agreeing to things based
on things like guilt or unexamined excitement or even just thoughtless
and distracted accumulation of plans. (No, instead I shall overwhelm
myself with curated and careful accumulation of plans!)</p>
<p>All in all, a very difficult theme perhaps for an <a href="https://www.thecodedmessage.com/tags/adhd">ADHD-er</a>,
but perhaps for that very reason, an important one.</p>
Rust Is Beyond Object-Oriented, Part 3: Inheritance
https://www.thecodedmessage.com/posts/oop-3-inheritance/
2023-12-07T00:00:00+00:00
<p>In this next<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> post of my series explaining how Rust is better
off <a href="https://www.thecodedmessage.com/tags/beyond-oop/">without Object-Oriented Programming</a>, I discuss
the last and (in my opinion) the weirdest of OOP’s 3 traditional pillars.</p>
<p>It’s not <a href="https://www.thecodedmessage.com/posts/oop-1-encapsulation/">encapsulation</a>, a great idea which
exists in some form in every modern programming language, though OOP does
it oddly. It’s not <a href="https://www.thecodedmessage.com/posts/oop-2-polymorphism/">polymorphism</a>, also a
great idea, though one that OOP puts too many restrictions on, and one for
which Rust borrows a better design from Haskell (with syntax from C++).</p>
<p>No, it’s that third pillar, inheritance, that I am discussing today, that
concept that only shows up in OOP circles, causing no end of problems
for your code. Unlike encapsulation and polymorphism, inheritance has no
direct analogue in Rust.</p>
<blockquote>
<p><strong>Side note:</strong> In this series in general, but especially in this post,
I am primarily discussing static OOP languages, like C++ and Java,
where interfaces have to be explicit and where classes correspond to
different static types. Much of what I write would have to be adapted
to apply to more dynamic “duck-typing” styles of OOP like
in Python or JavaScript (or Smalltalk), and won’t apply as directly.
This series is about why Rust isn’t OOP, and Rust is closer to C++ or
Java than to a dynamic language, so this bias makes sense in context.</p>
</blockquote>
<h1 id="why-do-people-like-inheritance">Why do people like inheritance?</h1>
<p>I can see why inheritance is so compelling. The entire system
of education encourages us to categorize things into neat little
hierarchies. Rectangles are a type of shape, and squares are a type of
rectangle. Humans are a type of animal, and men and women are types of
humans. Inheritance allows us to take this “X is a Y” and express it to
a computer.</p>
<p>This “is a” relationship is seen as intuitive. As the entire point of OOP
is to make programming more intuitive, more like reasoning about the
real world, inheritance is a perfect match for it. Just like we reason
about the real world with categories and subcategories, we can reason
about the world of our program in a similar way.</p>
<p>And this allows us to feel smart when we read introductions to inheritance
in various books on OOP programming. We see the <code>Tiger</code> class inherit
from the <code>Animal</code> class, or the <code>Rectangle</code> class inherit from the
<code>Shape</code> class.</p>
<p>We get so excited by the abstract principle of “is a” that we don’t even
notice that the examples have nothing to do with programming. We don’t
write code about shapes or animals. And even a drawing program or a zoo
inventory app wouldn’t use inheritance like this! If inheritance were so
useful as to be a pillar of OOP, why are there so few beginner examples
that involve things programs actually do?</p>
<h1 id="what-do-i-mean-by-inheritance">What do I mean by inheritance?</h1>
<p>First, let me clarify what I mean by inheritance, or rather what I don’t
mean.</p>
<p>I don’t mean every subtype-supertype relationship, where all values
of one type are also included in another, broader type. Subtyping
shows up in Rust all the time, particularly when it comes to
<a href="https://doc.rust-lang.org/nomicon/subtyping.html">lifetimes</a>.</p>
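<p>As a rough illustration of that lifetime subtyping (a minimal sketch; the
function names <code>as_shorter</code> and <code>longest</code> are made up for this example and
are not from any library), a <code>&amp;'static str</code> can be used anywhere a
shorter-lived <code>&amp;'a str</code> is expected, because <code>'static</code> outlives every other
lifetime:</p>

```rust
// Hypothetical helper: a `&'static str` is accepted where any shorter
// lifetime `'a` is wanted, because `'static` is a subtype of every `'a`.
fn as_shorter<'a>(s: &'static str) -> &'a str {
    s
}

// Both arguments must share one lifetime `'a`. A `'static` argument is
// silently "shrunk" to match the shorter-lived one.
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

fn main() {
    let forever: &'static str = "static";
    let local = String::from("local string");

    // `forever` lives for the whole program, `local` only for `main`;
    // the call still type-checks thanks to subtyping.
    let picked = longest(forever, local.as_str());
    assert_eq!(picked, "local string");
    assert_eq!(as_shorter(forever), "static");
}
```

<p>No inheritance is involved here: the subtype relationship is purely about
how long a reference is valid.</p>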
<p>I also don’t mean the version of inheritance that only involves
implementing an interface. In C++, you implement dynamic interfaces
through inheritance as a mechanism, even if the “superclass” is just
a list of methods. In Java, inheritance and interface implementation
are separate mechanisms. I am not talking about interface implementation
as inheritance, even though it is technically considered the same
feature in C++:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-C++" data-lang="C++"><span style="display:flex;"><span><span style="color:#75715e">// This class has no fields, only virtual methods.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e">//
</span></span></span><span style="display:flex;"><span><span style="color:#75715e">// In Java, we would call this an interface. In Rust, we would
</span></span></span><span style="display:flex;"><span><span style="color:#75715e">// call this a trait.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Shape</span> {
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">public</span><span style="color:#f92672">:</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">virtual</span> <span style="color:#66d9ef">void</span> draw(Surface <span style="color:#f92672">&</span>surface) <span style="color:#66d9ef">const</span> <span style="color:#f92672">=</span> <span style="color:#ae81ff">0</span>;
</span></span><span style="display:flex;"><span>};
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// This is considered inheritance in C++. The Java equivalent
</span></span></span><span style="display:flex;"><span><span style="color:#75715e">// would use `implements` instead of `extends`. And you could still
</span></span></span><span style="display:flex;"><span><span style="color:#75715e">// do this in Rust with a trait.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Square</span> <span style="color:#f92672">:</span> <span style="color:#66d9ef">public</span> Shape {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">int</span> size;
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">int</span> x;
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">int</span> y;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">public</span><span style="color:#f92672">:</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">void</span> draw(Surface <span style="color:#f92672">&</span>surface) <span style="color:#66d9ef">const</span> <span style="color:#66d9ef">override</span>;
</span></span><span style="display:flex;"><span>};
</span></span></code></pre></div><p>I am only opposed to the type of inheritance that is still called
inheritance in Java. Having a type implement an interface (a <em>trait</em>
in Rust) is perfectly legitimate and still allowed in Rust, as is
casting a reference to a value to a generic, “dynamic” value based on
that trait or interface:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">trait</span> Shape {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">draw</span>(<span style="color:#f92672">&</span>self, surface: <span style="color:#66d9ef">&</span><span style="color:#a6e22e">mut</span> Surface);
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">struct</span> <span style="color:#a6e22e">Square</span> {
</span></span><span style="display:flex;"><span> size: <span style="color:#66d9ef">u32</span>,
</span></span><span style="display:flex;"><span> x: <span style="color:#66d9ef">u32</span>,
</span></span><span style="display:flex;"><span> y: <span style="color:#66d9ef">u32</span>,
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">impl</span> Shape <span style="color:#66d9ef">for</span> Square {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">draw</span>(<span style="color:#f92672">&</span>self, surface: <span style="color:#66d9ef">&</span><span style="color:#a6e22e">mut</span> Surface) {
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// Assume square is Square, surface is Surface
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">let</span> shape: <span style="color:#66d9ef">&</span><span style="color:#a6e22e">dyn</span> Shape <span style="color:#f92672">=</span> <span style="color:#f92672">&</span>square;
</span></span><span style="display:flex;"><span>shape.draw(<span style="color:#f92672">&</span><span style="color:#66d9ef">mut</span> surface);
</span></span></code></pre></div><p><code>Shape</code>, in this context, is a pure interface. It is only a structured
form of polymorphism, not inheritance per se. Very importantly, <code>Shape</code> has
no fields. It is defined based solely on what you can do with it. And
accordingly, the “is a” language makes sense for interface implementation:
<code>Square</code> is a <code>Shape</code>. A <code>Shape</code> has no state, though, just methods, just
behaviors.</p>
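<p>To make the snippet above concrete, here is a self-contained sketch. The
<code>Surface</code> type below is a hypothetical stand-in that merely records draw
commands as strings, and <code>Circle</code> and <code>draw_all</code> are likewise invented for
illustration. It shows two unrelated structs sharing nothing but the trait:</p>

```rust
/// Hypothetical stand-in for a real drawing surface: it only records
/// what it was asked to draw.
struct Surface {
    commands: Vec<String>,
}

/// A pure interface: no fields, only behavior.
trait Shape {
    fn draw(&self, surface: &mut Surface);
}

struct Square {
    size: u32,
    x: u32,
    y: u32,
}

struct Circle {
    radius: u32,
    x: u32,
    y: u32,
}

impl Shape for Square {
    fn draw(&self, surface: &mut Surface) {
        surface
            .commands
            .push(format!("square {} at ({}, {})", self.size, self.x, self.y));
    }
}

impl Shape for Circle {
    fn draw(&self, surface: &mut Surface) {
        surface
            .commands
            .push(format!("circle {} at ({}, {})", self.radius, self.x, self.y));
    }
}

/// Generic code that knows only about the trait, not the concrete types.
fn draw_all(shapes: &[Box<dyn Shape>], surface: &mut Surface) {
    for shape in shapes {
        shape.draw(surface);
    }
}

fn main() {
    let shapes: Vec<Box<dyn Shape>> = vec![
        Box::new(Square { size: 2, x: 0, y: 0 }),
        Box::new(Circle { radius: 1, x: 5, y: 5 }),
    ];
    let mut surface = Surface { commands: Vec::new() };
    draw_all(&shapes, &mut surface);
    assert_eq!(surface.commands.len(), 2);
    assert_eq!(surface.commands[0], "square 2 at (0, 0)");
    assert_eq!(surface.commands[1], "circle 1 at (5, 5)");
}
```

<p><code>Square</code> and <code>Circle</code> each own all of their fields; the trait contributes
none.</p>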
<p>But some parent classes have fields, and that is when inheritance really
starts to have problems. It is at this point that inheritance starts to
seem really weird.</p>
<h1 id="what-does-inheritance-actually-do">What does inheritance actually do?</h1>
<p>In my article on <a href="https://www.thecodedmessage.com/posts/oop-1-encapsulation">encapsulation</a>, I discussed
how a class is secretly two things with the same name, entangled and
conflated:</p>
<ul>
<li>A <strong>record type</strong> (or what Rust would call a <code>struct</code>), that is, a type
whose values consist of a number of fields with fixed names and types</li>
<li>A <strong>module</strong> (a collection of code with enforced encapsulation boundaries),
containing that record type and a collection of functions (called “methods”)
for interacting with it</li>
</ul>
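<p>In Rust these two pieces are written separately, which may make the
distinction easier to see. The following is only a sketch with invented names
(<code>counter</code>, <code>Counter</code>, <code>reset</code>): the <code>struct</code> is the record type, and the
surrounding <code>mod</code> is the encapsulation boundary.</p>

```rust
mod counter {
    /// The record type: a fixed set of named, typed fields.
    pub struct Counter {
        // No `pub`, so this field is visible only inside `mod counter`.
        count: u32,
    }

    impl Counter {
        pub fn new() -> Self {
            Counter { count: 0 }
        }

        pub fn increment(&mut self) {
            self.count += 1;
        }

        pub fn get(&self) -> u32 {
            self.count
        }
    }

    /// Not a method, yet it can still touch the private field, because
    /// privacy is decided by the module rather than by the type.
    pub fn reset(counter: &mut Counter) {
        counter.count = 0;
    }
}

fn main() {
    let mut c = counter::Counter::new();
    c.increment();
    c.increment();
    assert_eq!(c.get(), 2);

    counter::reset(&mut c);
    assert_eq!(c.get(), 0);

    // c.count = 5; // would not compile: `count` is private to `counter`
}
```

<p>A class in C++ or Java bundles both roles under one name; here they are two
separate declarations.</p>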
<p>Inheritance does something different with each of these concepts.
To start out, let’s discuss what it does to the record type.
We’ll continue using shapes, a classic example for discussing
object-oriented features. A circle is a shape, so we can use inheritance
here:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Shape</span> {
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">public</span><span style="color:#f92672">:</span>
</span></span><span style="display:flex;"><span> Color color;
</span></span><span style="display:flex;"><span>};
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Point</span> {
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">public</span><span style="color:#f92672">:</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">int</span> x;
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">int</span> y;
</span></span><span style="display:flex;"><span>};
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Circle</span> <span style="color:#f92672">:</span> <span style="color:#66d9ef">public</span> Shape {
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">public</span><span style="color:#f92672">:</span>
</span></span><span style="display:flex;"><span> Point center;
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">int</span> radius;
</span></span><span style="display:flex;"><span>};
</span></span></code></pre></div><p>So, what does this mean for <code>Circle</code>? Well, it means that all the fields
of <code>Shape</code> (namely, <code>color</code>) are also fields of <code>Circle</code>. Therefore,
references to <code>Circle</code> can be made into references to <code>Shape</code>, as everything
you can do with a shape, you can do with a circle, like set the color,
or get the color:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span>Circle circle;
</span></span><span style="display:flex;"><span>Shape <span style="color:#f92672">&</span>shape <span style="color:#f92672">=</span> circle;
</span></span><span style="display:flex;"><span>shape.color <span style="color:#f92672">=</span> Color<span style="color:#f92672">::</span>Blue;
</span></span><span style="display:flex;"><span>assert(circle.color <span style="color:#f92672">==</span> Color<span style="color:#f92672">::</span>Blue);
</span></span></code></pre></div><p>The thing is, we already have a mechanism of taking all the fields of
struct A and putting it in struct B: by putting a field of type A into
struct B! Instead of inheritance’s “is a,” we can accomplish the same
thing with having a field, or “has a.” In our example, we can do the
exact same thing with <code>Point</code> that we did with <code>Shape</code> – it just
involves being a little more explicit about what’s going on:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span>Circle circle;
</span></span><span style="display:flex;"><span>Point <span style="color:#f92672">&</span>point <span style="color:#f92672">=</span> circle.center;
</span></span><span style="display:flex;"><span>point.x <span style="color:#f92672">=</span> <span style="color:#ae81ff">3</span>;
</span></span><span style="display:flex;"><span>assert(circle.center.x <span style="color:#f92672">==</span> <span style="color:#ae81ff">3</span>);
</span></span></code></pre></div><p>So, what does inheritance do to the classes from the <strong>record type</strong>
perspective? It makes the parent class a field of the child class,
just a field with no name. By writing:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Circle</span> <span style="color:#f92672">:</span> <span style="color:#66d9ef">public</span> Shape {
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// ...
</span></span></span></code></pre></div><p>… from a record type perspective, we were writing syntactic sugar for:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Circle</span> {
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">public</span><span style="color:#f92672">:</span>
</span></span><span style="display:flex;"><span> Shape shape;
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// ...
</span></span></span></code></pre></div><p>And when we wrote:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span>Shape <span style="color:#f92672">&</span>shape <span style="color:#f92672">=</span> circle;
</span></span></code></pre></div><p>That was translated into something like:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span>Shape <span style="color:#f92672">&</span>shape <span style="color:#f92672">=</span> circle.shape;
</span></span></code></pre></div><p>“Is a,” from a record type point of view, is just syntactic sugar for
“has a.” If you want to do something similar in Rust, just make a has-a
relationship, rather than creating an implicit field with no name.
Rust doesn’t like implicit nameless things anyway.</p>
<p>This will also save on arguing about whether two types have an “is a”
or a “has a” relationship. I regret all the time I’ve spent splitting hairs
about that distinction, when really, it’s just a matter of whether we want
a field to be implicit or not.</p>
<p>OK, so that covers what inheritance does to the record types, but
what about the rest of the class, the module? What happens to
the methods?</p>
<p>Well, for non-virtual methods, it’s also straight-forward. Instead of
doing inheritance, you can still just use has-a instead, and do a field
access. Instead of calling, say, <code>circle.get_color()</code>, we could always
call <code>circle.shape.get_color()</code>.</p>
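<p>As a minimal sketch of what that looks like in Rust, here is the same has-a arrangement with the parent as an ordinary named field. The <code>Color</code> variants and the body of <code>get_color</code> are assumptions filled in for illustration; only the overall shape of the types comes from the C++ above.</p>

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Color {
    Red,
    Blue,
}

struct Shape {
    color: Color,
}

impl Shape {
    fn get_color(&self) -> Color {
        self.color
    }
}

struct Point {
    x: i32,
    y: i32,
}

struct Circle {
    // The "base class" is an ordinary field with an explicit name.
    shape: Shape,
    center: Point,
    radius: i32,
}

fn main() {
    let mut circle = Circle {
        shape: Shape { color: Color::Red },
        center: Point { x: 0, y: 0 },
        radius: 1,
    };

    // Counterpart of `Shape &shape = circle;`: borrow the field explicitly.
    let shape: &mut Shape = &mut circle.shape;
    shape.color = Color::Blue;

    // Counterpart of `circle.get_color()`: go through the field.
    assert_eq!(circle.shape.get_color(), Color::Blue);
    println!(
        "circle at ({}, {}) with radius {} is {:?}",
        circle.center.x,
        circle.center.y,
        circle.radius,
        circle.shape.get_color()
    );
}
```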
<p>So far, with the fields and non-virtual methods, inheritance just
seems a bit weird and overrated. Like, we don’t see any reason yet
why a programming language would want to support it, when just having
a field of a superclass type does everything. But on the other hand,
some people like implicit fields and convenient short-hands, so there’s
not much of a downside either.</p>
<p>Inheritance without virtual methods may seem harmless, but it doesn’t
have much to do with the concept of “is a.” Technically, you can use a
field access as an implicit conversion, and think of it as a subtyping
relationship, but it doesn’t actually correspond to how the world
works. Even in the world of shapes, it doesn’t make sense: if a square
is a rectangle, how come it has less state than a rectangle, with only
one field for side length instead of two for width and height?</p>
<p>But we’ve not yet talked about virtual methods. When we do, you will
see why I think inheritance is not just an unnecessary feature, but an
ill-conceived anti-feature.</p>
<h1 id="but-what-about-the-virtual-methods">But what about the virtual methods?</h1>
<p>So, earlier we discussed a class as being two things, a record type (with
fields) and a module (with methods and visibility restrictions). But
once we consider virtual methods, a class is actually three things with
the same name:</p>
<ul>
<li>A <strong>record type</strong>: each object has the fields</li>
<li>A <strong>module</strong>: the type, trait, and other methods, are all in an encapsulated
module</li>
<li>A <strong>trait</strong> or <strong>interface</strong>: the virtual methods form an interface</li>
</ul>
<blockquote>
<p><strong>Side note:</strong> some programming languages consider all methods to be
virtual for some reason. For these programming languages, everything I
say still applies, but all methods are in the trait as they’re all virtual.</p>
<p>Given that most methods aren’t self-consciously written with the intent
to be virtual, making methods implicitly virtual seems like a good way
to set the programmer up for surprise – that is, a horrible idea. But
nevertheless having all virtual methods was for a long time considered
the more ideological, more purely OOP way to do things, and so languages
which strove to be purely OOP (like the original Java) did it.</p>
</blockquote>
<p>Up until now, we have ignored this additional conflation,
this additional role that a class plays. In discussing
<a href="https://www.thecodedmessage.com/posts/oop-1-encapsulation/">encapsulation</a>, we were discussing simply
how classes conflate the two distinct concepts of record types and
modules. In discussing <a href="https://www.thecodedmessage.com/posts/oop-2-polymorphism/">polymorphism</a>, we
were assuming interfaces, and discussing how OOP’s version of interfaces
were constrained by insisting on a specific dynamic implementation. Only
now, now that we discuss inheritance, do we see that OOP not only
conflates record types and modules, but it also conflates record types
and interfaces.</p>
<p>When a class has virtual functions, that constitutes an interface,
implemented by dynamic polymorphism. But the only way you are allowed
to implement the interface is by inheriting from the class – that is,
by also having a (secret, unnamed, implicit) field of the record type.</p>
<p>See, as discussed above, inheriting from a class without virtual methods,
a class with just fields and regular methods, is no biggie. It’s just
a weird way of writing a has-a relationship that comes with some syntactic
sugar and automatic conversions – things I’m not a fan of and wouldn’t
put in my programming language, but not that bad.</p>
<p>Similarly, inheriting from a class without fields, a class with just
virtual methods (and perhaps regular methods, it turns out they barely
matter) is also no biggie. It has all the downsides of OOP-style
<a href="https://www.thecodedmessage.com/posts/oop-2-polymorphism/">polymorphism</a>, but is fundamentally just a
way to indicate that you’re implementing an interface. In languages like
C++, inheritance is the mechanism by which you implement interfaces,
and in languages like Java, a methods-only class should probably be an
interface.</p>
<p>(To round out all the possibilities, I will mention that a class with
neither virtual methods nor fields is just a traditional module.)</p>
<p>But if you have both fields and virtual methods, then you have true
OOP-style inheritance, with all of its problems. You have an interface
that you can only implement if you inherit from the class. If you did
not intend this, perhaps because you are writing in a language like
Java where allowing inheritance is the default for classes and virtual
is the default for methods, you are setting yourself up for surprises
when someone inherits from your class and starts overriding methods.</p>
<p>If you did intend this, however, why? Why make implementing an interface
contingent on having certain state, on having a special unnamed field?
Why conflate these two fundamentally different concepts of containing
another record type’s state and having the new record implement an
interface?</p>
<p>There are a number of problems with this conflation. Why would we assume
that in order to implement the methods, you need that state? What if that
state is represented differently, like on a disk, or over a network, or
computed on demand by a formula? This conflation of implementation
and interface means that there is no sane way to implement proxy objects.</p>
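<p>To make the proxy problem concrete, here is a small sketch of an interface that does not dictate state. Everything in it (the <code>Colored</code> trait, the <code>Stored</code> and <code>Heat</code> types, the <code>describe</code> helper) is hypothetical and only meant to illustrate the point: one implementor keeps the value in a field, the other derives it on demand, and neither is forced to carry the other’s fields.</p>

```rust
/// Hypothetical interface: anything that can report a color as an RGB triple.
trait Colored {
    fn color(&self) -> (u8, u8, u8);
}

/// Stores its color directly, the way a base class with a field would.
struct Stored {
    rgb: (u8, u8, u8),
}

impl Colored for Stored {
    fn color(&self) -> (u8, u8, u8) {
        self.rgb
    }
}

/// Computes its color from other data instead of storing it.
struct Heat {
    temperature: u8,
}

impl Colored for Heat {
    fn color(&self) -> (u8, u8, u8) {
        // Hotter means more red, cooler means more blue.
        (self.temperature, 0, 255 - self.temperature)
    }
}

/// Generic code only sees the interface, never the representation.
fn describe(item: &dyn Colored) -> String {
    let (r, g, b) = item.color();
    format!("#{:02X}{:02X}{:02X}", r, g, b)
}

fn main() {
    println!("{}", describe(&Stored { rgb: (0, 0, 255) }));
    println!("{}", describe(&Heat { temperature: 200 }));
}
```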
<p>But more importantly than that, I’m not entirely sure what the upside
of this conflation is. It seems to make programming simpler in one
particular scenario, a scenario that I rarely see come up in real life,
a scenario that frankly seems like a code smell.</p>
<h1 id="so-what-can-we-do-instead">So what can we do instead?</h1>
<p>There is no inheritance in Rust. There are no fields in traits. There is
simply no way of saying that in order to implement a trait, your type
must have certain fields. Rather than conflate the concepts of record
types, modules, and traits in this God-concept of “class,” Rust keeps
these three concepts quite separate.</p>
<p>So if we have a design that requires inheritance (either because we think
in OOP or because we’re translating from an OOP programming language),
how would we represent that in Rust?</p>
<p>Well, the most straight-forward way would be to separate out the
different parts of the base class. Such a refactor would allow us to
express our design in Rust, as literally as possible. This is just meant
as a starting point, a proof of concept that our design can survive
in a language without inheritance. Alternative, often better ways of
replacing inheritance will follow subsequently.</p>
<p>But here’s the straight-forward method: If the base class has just fields,
or just virtual methods, that’s easy: it becomes a <code>struct</code> or a <code>trait</code>,
respectively. Instead of inheriting from the class, a type would have that
<code>struct</code> as a field, or implement that <code>trait</code>. Actually, in this case,
the straight-forward method might just be perfect – you weren’t actually
using inheritance per se, just an odd syntax for a field or for implementing
an interface.</p>
<p>If it has both, we’d have to extract both a <code>struct</code> and a <code>trait</code>.
The fields would become a <code>struct</code>, of its own type. The interface of the
virtual methods would become a trait. The implementation of the virtual
methods would become the implementation of that trait for that <code>struct</code>,
or provided methods on the trait, depending on what makes more sense. Any
non-virtual methods would then become methods of the <code>struct</code> or provided
methods on the trait, again depending on what makes more sense in context.</p>
<p>At this point, it might make sense to consider some of the alternatives
that Rust provides to run-time polymorphism, as discussed in the
<a href="https://www.thecodedmessage.com/posts/oop-2-polymorphism/">polymorphism</a> post. Is a trait, especially
an OOP-style, object-safe trait, really what we want here? We’ve opened
up alternative designs now, and perhaps one of the alternatives makes
more sense.</p>
<p>Assuming we do want a trait, we can then go to all the “child” classes
and make them implement the trait. They also get a new field to contain
the parent, which I’ll call <code>super</code> here (in real Rust code it needs
another name, such as <code>parent</code>, because <code>super</code> is a reserved keyword).
Their trait implementations would then do a mix of implementing new
methods, calling the same method on <code>super</code>, and defaulting to the
provided method.</p>
<p>And again, at this point it would be appropriate to consider whether we
even need the <code>super</code> field, or if perhaps we can get away with not
having it.</p>
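<p>A minimal sketch of this literal translation might look like the following. All the names (<code>WidgetBase</code>, <code>Widget</code>, <code>Button</code>, <code>Spacer</code>) are hypothetical, and the <code>base</code> accessor is an assumption added so that the provided method can reach the shared fields. The parent field is called <code>parent</code> because <code>super</code> is a reserved keyword in Rust.</p>

```rust
// The fields of the hypothetical base class become a plain struct.
struct WidgetBase {
    label: String,
}

// Its virtual methods become a trait. The provided method plays the
// role of the base class's default implementation.
trait Widget {
    // Assumed helper: gives provided methods access to the shared fields.
    fn base(&self) -> &WidgetBase;

    fn describe(&self) -> String {
        format!("widget '{}'", self.base().label)
    }
}

impl Widget for WidgetBase {
    fn base(&self) -> &WidgetBase {
        self
    }
}

// A "child class" that overrides `describe` and calls the parent's version.
struct Button {
    parent: WidgetBase,
    clicks: u32,
}

impl Widget for Button {
    fn base(&self) -> &WidgetBase {
        &self.parent
    }

    fn describe(&self) -> String {
        format!("{} (button, {} clicks)", self.parent.describe(), self.clicks)
    }
}

// A "child class" that simply defaults to the provided method.
struct Spacer {
    parent: WidgetBase,
}

impl Widget for Spacer {
    fn base(&self) -> &WidgetBase {
        &self.parent
    }
}

fn main() {
    let widgets: Vec<Box<dyn Widget>> = vec![
        Box::new(Button { parent: WidgetBase { label: "ok".to_string() }, clicks: 3 }),
        Box::new(Spacer { parent: WidgetBase { label: "gap".to_string() } }),
    ];
    for widget in &widgets {
        println!("{}", widget.describe());
    }
}
```

<p>Notice how much of this is forwarding. That is a hint that the literal translation is rarely the design to settle on.</p>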
<p>After this transformation, we have valid Rust code out of our
inheritance-based OOP-style design pattern. But there’s nothing requiring
us to use Rust to do it: you could do the same refactor of inheritance
structures in an OOP language.</p>
<p>If we were to do this transformation, we’ve paid a small cost of having
to potentially write <code>.super</code> (or whatever name we’ve given the parent
field) every once in a while, as well as writing trait implementations
that forward some method calls to the <code>super</code> field. In return, we’ve
deconflated the two very different concepts of interface and fields,
and opened ourselves up to more possibilities.</p>
<h1 id="what-should-i-actually-do-in-rust-instead-of-inheritance">What should I actually do in Rust instead of inheritance?</h1>
<p>But notice that in discussing this transformation, I encouraged you to
consider alternatives at two points. Rarely does this transformation make
sense literally, which is to say, rarely does a literal translation of
inheritance into Rust make sense. I find this quite telling, as it implies
to me that inheritance itself only rarely makes sense – and indeed, I
only tend to use inheritance in OOP languages where a framework requires
me to, or as an <em>ersatz</em><sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> replacement of sum types (i.e. Rust <code>enum</code>).</p>
<p>Here are some other patterns that replace inheritance hierarchies, that
you might find yourself considering instead:</p>
<ul>
<li>A regular <code>enum</code>. This actually covers most situations for me. Methods that
would be overridden just do a <code>match</code> on the <code>enum</code> contents, and methods
that would not, do not.</li>
<li><code>struct</code> types that contain a field with an <code>enum</code> type. The <code>enum</code>
type represents all the different options, but the <code>struct</code> type contains
the fields that are always the same.</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">struct</span> <span style="color:#a6e22e">MessageHeader</span> {
</span></span><span style="display:flex;"><span> source: <span style="color:#a6e22e">Address</span>,
</span></span><span style="display:flex;"><span> destination: <span style="color:#a6e22e">Address</span>,
</span></span><span style="display:flex;"><span> seqnum: <span style="color:#66d9ef">u32</span>,
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">enum</span> <span style="color:#a6e22e">MessageBody</span> {
</span></span><span style="display:flex;"><span> Ping(PingMessage),
</span></span><span style="display:flex;"><span> Pong(PongMessage),
</span></span><span style="display:flex;"><span> Request(RequestMessage),
</span></span><span style="display:flex;"><span> Response(ResponseMessage),
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">struct</span> <span style="color:#a6e22e">Message</span> {
</span></span><span style="display:flex;"><span> header: <span style="color:#a6e22e">MessageHeader</span>,
</span></span><span style="display:flex;"><span> body: <span style="color:#a6e22e">MessageBody</span>,
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Isn’t this so much nicer than putting <code>source</code>, <code>destination</code>, and
<code>seqnum</code> in the base class?</p>
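<p>As a rough sketch of how methods fall out of this layout: a method that would have been non-virtual only touches the header, and a method that would have been overridden does a <code>match</code> on the body. The payload types, the <code>Address</code> alias, and the <code>route</code> and <code>summary</code> methods are simplified stand-ins invented for this example.</p>

```rust
// Stand-in payload types; real ones would carry protocol details.
type Address = u32;
struct PingMessage;
struct PongMessage;
struct RequestMessage {
    path: String,
}
struct ResponseMessage {
    status: u16,
}

struct MessageHeader {
    source: Address,
    destination: Address,
    seqnum: u32,
}

enum MessageBody {
    Ping(PingMessage),
    Pong(PongMessage),
    Request(RequestMessage),
    Response(ResponseMessage),
}

struct Message {
    header: MessageHeader,
    body: MessageBody,
}

impl Message {
    // Would have been a non-virtual base-class method: shared fields only.
    fn route(&self) -> String {
        format!(
            "{} -> {} (#{})",
            self.header.source, self.header.destination, self.header.seqnum
        )
    }

    // Would have been an overridden virtual method: one arm per "subclass".
    fn summary(&self) -> String {
        match &self.body {
            MessageBody::Ping(_) => "ping".to_string(),
            MessageBody::Pong(_) => "pong".to_string(),
            MessageBody::Request(req) => format!("request for {}", req.path),
            MessageBody::Response(resp) => format!("response with status {}", resp.status),
        }
    }
}

fn main() {
    let bodies = vec![
        MessageBody::Ping(PingMessage),
        MessageBody::Pong(PongMessage),
        MessageBody::Request(RequestMessage { path: "/index".to_string() }),
        MessageBody::Response(ResponseMessage { status: 200 }),
    ];
    for (seqnum, body) in bodies.into_iter().enumerate() {
        let header = MessageHeader { source: 1, destination: 2, seqnum: seqnum as u32 };
        let message = Message { header, body };
        println!("{}: {}", message.route(), message.summary());
    }
}
```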
<ul>
<li><code>enum</code> variants that themselves contain <code>enum</code> types.</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">enum</span> <span style="color:#a6e22e">Message</span> {
</span></span><span style="display:flex;"><span> Client(ClientMessage),
</span></span><span style="display:flex;"><span> Server(ServerMessage),
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">enum</span> <span style="color:#a6e22e">ClientMessage</span> {
</span></span><span style="display:flex;"><span> Ping(PingMessage),
</span></span><span style="display:flex;"><span> Request(RequestMessage),
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">enum</span> <span style="color:#a6e22e">ServerMessage</span> {
</span></span><span style="display:flex;"><span> Pong(PongMessage),
</span></span><span style="display:flex;"><span> Response(ResponseMessage),
</span></span><span style="display:flex;"><span> Error(ErrorMessage),
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Now, if you want any message, your type is <code>Message</code>. If you know
for sure you have a client message, you can say <code>ClientMessage</code>.
Or if you know for sure it’s specifically a ping, you can say
<code>PingMessage</code>. It’s like a class hierarchy!</p>
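<p>One way to make moving up and down this hierarchy convenient is sketched below. The payload types are empty stand-ins, and the <code>From</code> implementations and the <code>as_client</code> helper are assumptions added for illustration: widening goes through <code>From</code>/<code>into</code>, and narrowing is a <code>match</code> that can fail.</p>

```rust
// Empty stand-ins for the real payload types.
struct PingMessage;
struct PongMessage;
struct RequestMessage;
struct ResponseMessage;
struct ErrorMessage;

enum Message {
    Client(ClientMessage),
    Server(ServerMessage),
}

enum ClientMessage {
    Ping(PingMessage),
    Request(RequestMessage),
}

enum ServerMessage {
    Pong(PongMessage),
    Response(ResponseMessage),
    Error(ErrorMessage),
}

// "Upcasts" are explicit conversions that wrap the narrower type.
impl From<PingMessage> for ClientMessage {
    fn from(ping: PingMessage) -> Self {
        ClientMessage::Ping(ping)
    }
}

impl From<ClientMessage> for Message {
    fn from(msg: ClientMessage) -> Self {
        Message::Client(msg)
    }
}

impl From<ServerMessage> for Message {
    fn from(msg: ServerMessage) -> Self {
        Message::Server(msg)
    }
}

impl Message {
    // A "downcast" is a match that returns None for the other branch.
    fn as_client(&self) -> Option<&ClientMessage> {
        match self {
            Message::Client(client) => Some(client),
            Message::Server(_) => None,
        }
    }
}

fn main() {
    let client: ClientMessage = PingMessage.into();
    let any: Message = client.into();
    assert!(matches!(any, Message::Client(ClientMessage::Ping(_))));
    assert!(any.as_client().is_some());

    let reply: Message = ServerMessage::Pong(PongMessage).into();
    assert!(reply.as_client().is_none());
}
```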
<ul>
<li>A <code>struct</code> with a template-parameterized member to set a policy.</li>
</ul>
<p>This is perhaps the most sophisticated replacement. Imagine you
have a class <code>SocketHandler</code> that handles reading from a socket.
Imagine it looks like this:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">SocketHandler</span> {
</span></span><span style="display:flex;"><span> CircularBuffer socket_data;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">public</span><span style="color:#f92672">:</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">void</span> data_available(<span style="color:#66d9ef">int</span> fd);
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">protected</span><span style="color:#f92672">:</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">virtual</span> size_t message_size(<span style="color:#66d9ef">const</span> <span style="color:#66d9ef">char</span> <span style="color:#f92672">*</span>data, size_t size) <span style="color:#f92672">=</span> <span style="color:#ae81ff">0</span>;
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">virtual</span> <span style="color:#66d9ef">void</span> <span style="color:#a6e22e">process_message</span>(<span style="color:#66d9ef">const</span> <span style="color:#66d9ef">char</span> <span style="color:#f92672">*</span>data, size_t size) <span style="color:#f92672">=</span> <span style="color:#ae81ff">0</span>;
</span></span><span style="display:flex;"><span>};
</span></span></code></pre></div><p>How this is going to work is, <code>data_available</code> is going to grab more and
more data from the socket <code>fd</code> until <code>message_size</code> returns a non-zero
value. Then, it’ll call <code>process_message</code> with that data. During this
time, it’ll store the data in <code>socket_data</code>. All of that work is being
done by <code>data_available</code>, in the parent class, and you can imagine
that the socket dispatching library has a collection of these
socket handlers, something like <code>std::vector<std::unique_ptr<SocketHandler>></code>
(or perhaps a map indexed by file descriptor).</p>
<p>The child class is responsible for overriding <code>message_size</code> and
<code>process_message</code> to actually interpret incoming data for a specific
protocol. You’d have a child class for each <code>SocketHandler</code> protocol,
and it would include internal state like sequence numbers, etc.</p>
<p>But rather than have these methods overridden by a child class, the
right way to do it is to have just those methods in a trait that a
<code>SocketHandler</code> has. You can see this when you extract the implicit
trait for <code>SocketHandler</code> for the Rust version:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">trait</span> SocketProtocol {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">message_size</span>(<span style="color:#f92672">&</span>self, data: <span style="color:#66d9ef">&</span>[<span style="color:#66d9ef">u8</span>]) -> <span style="color:#66d9ef">usize</span>;
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">process_message</span>(<span style="color:#f92672">&</span><span style="color:#66d9ef">mut</span> self, data: <span style="color:#66d9ef">&</span>[<span style="color:#66d9ef">u8</span>]) -> Result<span style="color:#f92672"><</span>()<span style="color:#f92672">></span>;
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">struct</span> <span style="color:#a6e22e">SocketHandler</span><span style="color:#f92672"><</span>P: <span style="color:#a6e22e">SocketProtocol</span><span style="color:#f92672">></span> {
</span></span><span style="display:flex;"><span> buffer: <span style="color:#a6e22e">CircularBuffer</span>,
</span></span><span style="display:flex;"><span> protocol: <span style="color:#a6e22e">P</span>,
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">trait</span> SocketHandlerTrait {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">data_available</span>(<span style="color:#f92672">&</span><span style="color:#66d9ef">mut</span> self, fd: <span style="color:#66d9ef">u32</span>) -> Result<span style="color:#f92672"><</span>()<span style="color:#f92672">></span>;
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">impl</span><span style="color:#f92672"><</span>P: <span style="color:#a6e22e">SocketProtocol</span><span style="color:#f92672">></span> SocketHandlerTrait <span style="color:#66d9ef">for</span> SocketHandler<span style="color:#f92672"><</span>P<span style="color:#f92672">></span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">data_available</span>(<span style="color:#f92672">&</span><span style="color:#66d9ef">mut</span> self, fd: <span style="color:#66d9ef">u32</span>) -> Result<span style="color:#f92672"><</span>()<span style="color:#f92672">></span> {
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// Call `self.protocol.message_size/process_message`
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>So, rather than each socket protocol inheriting from socket handler,
with its common state, the socket handler <em>has a</em> socket protocol, as
a policy. The <code>SocketProtocol</code> trait here can then be a compile-time,
static trait and <code>SocketHandlerTrait</code> can be the object-safe, dynamic
one, and the <code>std::vector<std::unique_ptr<SocketHandler>></code> can
be replaced with <code>Vec<Box<dyn SocketHandlerTrait>></code>.</p>
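<p>Here is a runnable sketch of that design under some loudly stated assumptions: a plain <code>Vec</code> of bytes stands in for <code>CircularBuffer</code>, <code>Result</code> is taken to be <code>std::io::Result</code>, <code>data_available</code> receives already-read bytes instead of a file descriptor so that no real socket is needed, and <code>LengthPrefixed</code> is a made-up toy protocol in which the first byte gives the payload length.</p>

```rust
use std::io;

trait SocketProtocol {
    /// Size of the first complete message in `data`, or 0 if more data is
    /// needed. Must never exceed `data.len()`.
    fn message_size(&self, data: &[u8]) -> usize;
    fn process_message(&mut self, data: &[u8]) -> io::Result<()>;
}

struct SocketHandler<P: SocketProtocol> {
    buffer: Vec<u8>, // stand-in for CircularBuffer
    protocol: P,
}

trait SocketHandlerTrait {
    /// Assumption: takes the freshly read bytes rather than an fd.
    fn data_available(&mut self, incoming: &[u8]) -> io::Result<()>;
}

impl<P: SocketProtocol> SocketHandlerTrait for SocketHandler<P> {
    fn data_available(&mut self, incoming: &[u8]) -> io::Result<()> {
        self.buffer.extend_from_slice(incoming);
        // Hand over every complete message currently in the buffer.
        loop {
            let size = self.protocol.message_size(&self.buffer);
            if size == 0 {
                return Ok(());
            }
            let message: Vec<u8> = self.buffer.drain(..size).collect();
            self.protocol.process_message(&message)?;
        }
    }
}

/// Toy protocol: one length byte followed by that many payload bytes.
struct LengthPrefixed {
    received: Vec<String>,
}

impl SocketProtocol for LengthPrefixed {
    fn message_size(&self, data: &[u8]) -> usize {
        match data.first() {
            Some(&len) if data.len() >= 1 + len as usize => 1 + len as usize,
            _ => 0,
        }
    }

    fn process_message(&mut self, data: &[u8]) -> io::Result<()> {
        let text = String::from_utf8_lossy(&data[1..]).into_owned();
        println!("got message: {}", text);
        self.received.push(text);
        Ok(())
    }
}

fn main() {
    // The dispatcher only knows about the dynamic trait.
    let mut handlers: Vec<Box<dyn SocketHandlerTrait>> = vec![Box::new(SocketHandler {
        buffer: Vec::new(),
        protocol: LengthPrefixed { received: Vec::new() },
    })];

    for handler in handlers.iter_mut() {
        // The first chunk ends mid-message; the second completes it and
        // carries a whole second message.
        handler.data_available(&[2, b'h']).expect("first chunk");
        handler.data_available(&[b'i', 3, b'y', b'o', b'u']).expect("second chunk");
    }
}
```

<p>The protocol’s own state (here just <code>received</code>) lives in the policy type, while the buffering logic is written once, generically, in the handler.</p>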
<p>This last refactor can be generalized. Instead of inheriting from
a base class to implement specific functionality, inject that
functionality using policies<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>, and parameterize the <code>struct</code> with members that implement policy traits.
Then, if need be (and need might not be) write a separate dynamic trait
for the overall <code>struct</code>.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>I know my <a href="https://www.thecodedmessage.com/posts/oop-2-polymorphism">last post</a> hasn’t been
since February. I’ve been procrastinating this one for a long time,
mostly because my life has been so gosh-darn busy, and also mostly
because I don’t really instinctively remember what I (or anyone else)
really liked about inheritance to begin with. <a href="#fnref:1" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>
<li id="fn:2">
<p>Isn’t it weird that <em>ersatz</em> means replacement in German, but
means mediocre as a replacement in English, so that “ersatz replacement”
doesn’t mean “replacement replacement” but “mediocre replacement”?
Or am I using the English word wrong? <a href="#fnref:2" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>
<li id="fn:3">
<p>Policies are known in Gang of Four terminology as
<a href="https://en.wikipedia.org/wiki/Strategy_pattern">strategies</a>. I’ve touched
on the policy pattern in some <a href="https://www.thecodedmessage.com/posts/endian_polymorphism/">previous</a>
<a href="https://www.thecodedmessage.com/posts/multiparadigm/">posts</a>, and at some point should write a full
post about it, as policies are my favorite thing. <a href="#fnref:3" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>
</ol>
</div>
Are You Sure? (Revised)https://www.thecodedmessage.com/posts/are-you-sure-v2/2023-10-24T00:00:00+00:00<p><em>This is a revision of a flash fiction piece first
<a href="https://www.thecodedmessage.com/posts/are-you-sure/">published in 2018</a>.</em></p>
<p>After a year of talking, and another year of planning, the project was
complete. Mothers Against Drunk Driving, the local clergy, and the town
council had finally done it: Right in the town square, they installed a
giant loudspeaker. From thenceforth, every two minutes, a booming voice
would spread all over town, announcing:</p>
<p><strong>ARE YOU SURE?</strong></p>
<p>Foolhardy decisions, they had decreed, would soon be a thing of the past.</p>
<p>The locals seemed to adapt pretty readily. Sales of noise-canceling
headphones boomed for a bit, and people’s sleeping habits were
surprisingly unaffected – who notices slightly inferior sleep? And
drunk driving statistics were immediately better, which the local paper
celebrated triumphantly.</p>
<p>The clergy were the first to notice the downsides. Weddings were being
canceled during the vows a full 25% of the time – brides and grooms
would take back their “I do”s in response to the booming speaker of
skepticism. Adult baptisms were fully cut in half. Divorces, on the other
hand, were also cut in half – though some of the rescued marriages
maybe shouldn’t have been.</p>
<p>At a town council meeting, one of the proponents of the loudspeaker said,
confidently, “This is a good idea,” only to cringe when the timing worked
out that in the next second, the entire room boomed:</p>
<p><strong>ARE YOU SURE?</strong></p>
<p>No one was starting new relationships – and no one was exiting them
either. New job postings languished unfilled, as both candidate and
interviewer expressed their doubt. Slowly, but surely, the social and
economic life of the town started to grind to a halt, as it became
the norm to cancel even casual plans like going out for a drink (and
certainly having another once there), or going to church on Sunday…or
work or school on Monday.</p>
<p>Over the days and months and years, the town developed a culture
of its own. Only necessities were bought, and only emergencies were
handled. Trash piled up on the streets as no one collected it, and then
stopped piling up as no one threw anything out. No one remembered what
life was like before, and fewer and fewer visitors passed through to
challenge it.</p>
<p>It wasn’t just the loudspeaker: people repeated its eternal mantra to
each other, having had it etched into their dreams. “We should take
down the loudspeaker,” said an occasional rebellious teen, only to
hear all their friends in unison say back, “Are you sure?”, echoed,
a moment later, by the loudspeaker itself:</p>
<p><strong>ARE YOU SURE?</strong></p>
<p>Eventually the loudspeaker broke. The mayor told his deputy to fix it,
but all the deputy could do was respond, “Are you sure?” The rest
of the town council waited for the booming voice to agree, but even
without the voice, no one was bold enough to fix the loudspeaker. It
wasn’t enough of an emergency, and besides, it felt like hubris or
even blasphemy to presume to be able to fix what seemed like such a
fundamental part of their world.</p>
<p>Without the loudspeaker, slowly but surely, the town returned to normal. A
year later, people would only say “Are you sure” as a joke – one that
many found vulgar and tasteless. And today, the teenagers wonder why, in
the middle of the town square, there is a looming hulk of a loudspeaker
system, never turned on or used, never cleaned up or put away. And of
course, it is now only the old who endlessly repeat what was once a
mantra, as their adult children shake their heads.</p>
Endianness, and why I don't like htons(3) and friendshttps://www.thecodedmessage.com/posts/endianness/2023-10-19T00:00:00+00:00<p>Endianness is a long-standing headache for many a computer science
student, and a thorn in the side of practitioners. I have already
<a href="https://www.thecodedmessage.com/posts/endian_polymorphism/">written some about it</a> in a different
context. Today, I’d like to talk more about how to deal with endianness
in programming languages and APIs, especially how to deal with it
in a principled, type-safe way.</p>
<p>Before we get to that, I want to make some preliminary clarifications
about endianness, which will help inform our API design.</p>
<h1 id="why-little-endian-bugs-us">Why Little Endian Bugs Us</h1>
<p>New students often are more confused by little endian (where the
least-significant component of an integer is stored first), and until
they are told about it, they tend to assume computers are big endian
(where the most-significant component is stored first) even if they don’t
know that word. This is due primarily to the fact that big endian is what
they’re used to: We write numbers with the most significant digit on the
left, and in languages that write from left to right (including English,
the <em>lingua franca</em> of programming among other things), this means that
we live our day to day lives in big endian. But that doesn’t mean that
big endian is more logical in any way, just that it is more conventional.</p>
<p>It doesn’t help that many learners first encounter little endian in a
setting where it is confusing and demands extra cognitive work: reading
little endian numbers from a hex dump. Take, for example,
this code, which displays a 32-bit number in hexadecimal, and then displays
the individual bytes of the same number as a hex dump:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c" data-lang="c"><span style="display:flex;"><span><span style="color:#66d9ef">uint32_t</span> number <span style="color:#f92672">=</span> <span style="color:#ae81ff">0x12345678</span>;
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">printf</span>(<span style="color:#e6db74">"%08X</span><span style="color:#ae81ff">\n</span><span style="color:#e6db74">"</span>, number);
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">uint8_t</span> bytes[<span style="color:#ae81ff">4</span>];
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">memcpy</span>(bytes, <span style="color:#f92672">&</span>number, <span style="color:#ae81ff">4</span>);
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">printf</span>(<span style="color:#e6db74">"%02X %02X %02X %02X</span><span style="color:#ae81ff">\n</span><span style="color:#e6db74">"</span>, bytes[<span style="color:#ae81ff">0</span>], bytes[<span style="color:#ae81ff">1</span>], bytes[<span style="color:#ae81ff">2</span>], bytes[<span style="color:#ae81ff">3</span>]);
</span></span></code></pre></div><p>This results in this befuddling output:</p>
<pre tabindex="0"><code>12345678
78 56 34 12
</code></pre><p>When read as a number, we can just read the number normally.
However, when read as a series of bytes, we find ourselves having
to read the number from right to left to read the number as big
endian, as we are accustomed to doing. We can’t even just read
backwards, however, as each byte is still printed internally
according to our big endian convention: the higher-order hex digit
is still printed first, followed by the lower-order hex digit.</p>
<p>The problem here isn’t little endian. The problem is that the printing
functionality accommodates our big endian preference in printing, but
only at the level of printing an individual number, either as a byte
or as a 32-bit word. The word printed as a whole is printed big
endian, to accommodate us. The individual bytes are also printed
big endian, to accommodate us. However, the hex dump as a whole
is printed with the lower values on the left, and the higher values
on the right, to similarly accommodate our preference that lower-indexed
memory, memory that comes earlier, should be on the left. On a little
endian system, this desire to print each number with the most
significant digit on the left, but to print a sequence of numbers
from left to right, leads to a contradiction. The resulting
last line, <code>78 56 34 12</code>, isn’t, properly speaking, little endian.
The print-out is an odd type of mixed endian, due to our awkward
conventions.</p>
<p>There is actually a relatively easy fix: if we insist on reading
numbers with the most significant digit on the left (which we do),
and the computer insists on storing less significant components
first (which it does), these two desires can be reconciled by printing
the hex dump from right to left:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c" data-lang="c"><span style="display:flex;"><span><span style="color:#66d9ef">uint32_t</span> number <span style="color:#f92672">=</span> <span style="color:#ae81ff">0x12345678</span>;
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">printf</span>(<span style="color:#e6db74">"%08X</span><span style="color:#ae81ff">\n</span><span style="color:#e6db74">"</span>, number);
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">uint8_t</span> bytes[<span style="color:#ae81ff">4</span>];
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">memcpy</span>(bytes, <span style="color:#f92672">&</span>number, <span style="color:#ae81ff">4</span>);
</span></span><span style="display:flex;"><span><span style="color:#a6e22e">printf</span>(<span style="color:#e6db74">"%02X %02X %02X %02X</span><span style="color:#ae81ff">\n</span><span style="color:#e6db74">"</span>, bytes[<span style="color:#ae81ff">3</span>], bytes[<span style="color:#ae81ff">2</span>], bytes[<span style="color:#ae81ff">1</span>], bytes[<span style="color:#ae81ff">0</span>]);
</span></span></code></pre></div><p>This results in a much cleaner print-out:</p>
<pre tabindex="0"><code>12345678
12 34 56 78
</code></pre><p>This should make clear that the weirdness of little endian is
entirely due to our preference for big endian, and our preference for
listing the lower-indexed values to the left, and how these preferences
interact. It is because of human conventions, not because of any intrinsic
problem with little endian. I would argue that, on little endian systems,
all hex dumps should be printed right to left, which would help, but there
is little I can do to change this convention.</p>
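<p>The right-to-left dump can be generalized beyond four hard-coded indices.
What follows is a minimal sketch in C++, not part of any standard tool:
<code>hex_dump_right_to_left</code> is a name invented for illustration, and it
simply walks the bytes from the highest address down to the lowest.</p>

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not from any library): format `len` bytes starting at
// `data` with the highest-addressed byte on the left, so that a little endian
// integer reads in the familiar most-significant-digit-first order.
std::string hex_dump_right_to_left(const void* data, std::size_t len) {
    const auto* bytes = static_cast<const std::uint8_t*>(data);
    std::string out;
    for (std::size_t i = len; i > 0; --i) {
        char buf[4];  // two hex digits plus the terminating NUL
        std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned>(bytes[i - 1]));
        out += buf;
        if (i > 1) out += ' ';  // separator between bytes, none at the end
    }
    return out;
}
```

<p>On a little endian host, passing the address of a <code>uint32_t</code>
holding <code>0x12345678</code> with a length of 4 should reproduce the second
line of the print-out above.</p>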
<p>Now, almost all modern systems are little endian, either because they are
typically configured that way for processors that support either endianness,
or because they only support little endian, like Intel processors. The
few programmers who have to write code for big endian systems find themselves
in the minority, and find themselves doing extra work to deal with other
code that no longer accommodates big endianness.</p>
<p>There is one big exception to this: the Internet. All of the Internet
protocols are designed to use big endian ordering, known in this context
as “network byte ordering.” This is because when the Internet protocols
were developed, big endian was a viable rival to little endian, and
both byte orders were common.</p>
<p>This does make some sense, as well, because hex dumps of packets are very
common, and big endian does make those hex dumps easier to read and reckon
with for us big endian humans.</p>
<h1 id="when-endianness-comes-in">When Endianness Comes In</h1>
<p>I would also like to clarify something about how endianness works.
A 32-bit word in a register in the processor is neither big endian
nor little endian. The processor needs to be designed knowing which
bits are more significant, and which are less, but there is no
intrinsic way in which the less significant bits come “first.”
In a word-based memory system, where only entire words were stored
in memory (as on the PDP-7, with its 18-bit words), and where
it was impossible to address memory in terms of individual bytes,
this would be the end of it.</p>
<p>As an example of this, see the documentation for
<a href="https://en.cppreference.com/w/cpp/types/endian"><code>std::endian</code></a>
on <a href="https://en.cppreference.com/">CppReference.com</a>:</p>
<blockquote>
<p>If all scalar types have <code>sizeof</code> equal to 1, endianness does not matter
and all three values, <code>std::endian::little</code>, <code>std::endian::big</code>, and
<code>std::endian::native</code> are the same.</p>
</blockquote>
<p>However, once we come up with the idea that memory is made up of bytes,
the endianness question arises: How do we split this 32-bit number into
bytes? Which end of it should be byte 0, and which end byte 3? Similarly,
if we read a series of bytes into memory, where should the first byte
(by memory address) go in the register, the most significant (big)
end, or the least significant (little) end?</p>
<p>As a result, types like <code>uint32_t</code> (and <code>uint16_t</code> and <code>uint64_t</code>) have no
intrinsic endianness, so long as they are stored in registers. Only if
they are written to memory, or read from memory, does their endianness
matter. And then, it only matters if the actual byte representation is
important – if we, as in the code above, use <code>memcpy</code> to copy their
representation, byte by byte, into an array of bytes.</p>
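<p>To make the distinction concrete, here is a small sketch that contrasts
the two views of the same value. The function names are my own inventions for
illustration. Shifting asks about the abstract number and gives the same answer
on every machine; only the <code>memcpy</code> exposes how the host lays the
bytes out.</p>

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Byte k of the abstract value, counted from the least significant end.
// This is pure arithmetic, so the result is the same on every host.
constexpr std::uint8_t byte_by_significance(std::uint32_t value, unsigned k) {
    return static_cast<std::uint8_t>(value >> (8 * k));
}

// The bytes as the host actually lays them out in memory.
// This is the only place where the host's endianness becomes visible.
inline std::array<std::uint8_t, 4> bytes_in_memory(std::uint32_t value) {
    std::array<std::uint8_t, 4> out{};
    std::memcpy(out.data(), &value, sizeof value);
    return out;
}

// True when the least significant byte is stored at the lowest address.
inline bool host_is_little_endian() {
    return bytes_in_memory(1)[0] == 1;
}
```

<p>Code that only ever uses the first function never needs to know the
host's byte order at all.</p>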
<p>In general, even if the byte representation does matter, I would argue that
<code>uint32_t</code> should be treated as an abstract 32-bit value, devoid of
endianness. Only when it is transcribed as a series of bytes should
endianness be taken into account – and then the description should
instead have the type of <code>uint8_t[4]</code> in C (or <code>std::array<uint8_t, 4></code>
in C++ or <code>[u8; 4]</code> in Rust).</p>
<h1 id="the-main-argument-why-i-dislike-htons-and-friends">The Main Argument: Why I dislike <code>htons</code> and friends</h1>
<p>In C, however, we do not in fact do this. We instead have functions
like <code>htons</code>, with this signature:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c" data-lang="c"><span style="display:flex;"><span><span style="color:#66d9ef">uint16_t</span> <span style="color:#a6e22e">htons</span>(<span style="color:#66d9ef">uint16_t</span> hostshort);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">uint16_t</span> http_port <span style="color:#f92672">=</span> <span style="color:#a6e22e">htons</span>(<span style="color:#ae81ff">80</span>);
</span></span></code></pre></div><p>This function purports to convert a 16-bit number from host endianness
(typically little) to network endianness (always big). Assuming a little
endian computer, it does a byteswap: It swaps the less significant 8
bits with the more significant 8 bits in the register used to return
the <code>uint16_t</code>.</p>
<p>So what are the properties of the returned <code>uint16_t</code>? If we passed in,
for example, 80 (the port of HTTP), <code>http_port</code>, the new <code>uint16_t</code> is
20480 – because 80 is <code>0x0050</code> in hex, and we’ve swapped the two bytes,
so we now have <code>0x5000</code>. What is this number?</p>
<p>It is not, to be clear, a <code>uint16_t</code> value 80 that is now in “big endian,”
though we might say that as a manner of speaking. It is almost certainly
in a register, and as mentioned before, registers don’t have intrinsic
endianness. It is something far more awkward: It is a value that, if
we were to store it in little endian (the only option), results in a
different number, the 80 we actually wanted, being stored in big endian.</p>
<p>To expand on this: 20480 is not a particularly meaningful number. It is
not actually the port number we want to use. And it has nothing to do
with the actual number 20480. It is simply a number that, if we store
it in memory as bytes, will result in <code>0x00</code> being stored, followed by
<code>0x50</code> – the big endian representation of 80. It is a <code>uint16_t</code> with
a value chosen not for what number we want to store, but what bytes we
will get if we store <code>http_port</code> as bytes.</p>
<p>Since <code>uint16_t</code> is designed to store numbers, not collections of bytes,
I would argue that this type is not being used in a semantically honest
way – it is a lie. What we are really storing is an array of 2 bytes,
2 <code>uint8_t</code>s. We are storing it in a 16-bit register, and implementation-wise
that might be a good decision – but I would argue, if we want that to
be possible, we should create an ABI where <code>uint8_t[2]</code> should be storable
in a single register. The C programming language, by not making arrays
first-class types, is getting in our way here, which explains
the situation.</p>
<p>Am I exaggerating when I say the type is a lie? Well, we expect to be
able to do arithmetic on a <code>uint16_t</code>, to be able to test, for example,
whether it is less than 1024, as listening on a port less than 1024 is
a privileged operation. But in order to do that, we have to convert
it back to a normal <code>uint16_t</code> – all <code>uint16_t</code>’s usual arithmetic
operators are inappropriate for data that’s stored with its bytes
swapped around.</p>
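<p>The failure is easy to demonstrate. In this sketch,
<code>swap_bytes16</code> is a hypothetical stand-in for what
<code>htons</code> does on a little endian host, and
<code>is_privileged_port</code> is an invented helper for the check described
above. The compiler accepts the comparison on the swapped value without
complaint, and it gives the wrong answer in both directions.</p>

```cpp
#include <cstdint>

// Hypothetical stand-in for htons on a little endian host:
// exchange the high and low bytes of a 16-bit value.
constexpr std::uint16_t swap_bytes16(std::uint16_t v) {
    return static_cast<std::uint16_t>((v << 8) | (v >> 8));
}

// Invented helper: ports below 1024 are privileged.
constexpr bool is_privileged_port(std::uint16_t port) {
    return port < 1024;
}

// 80 is 0x0050; swapped, it is 0x5000, which is 20480.
static_assert(swap_bytes16(80) == 20480, "0x0050 becomes 0x5000");

// Port 80 is privileged, but its swapped form does not look privileged.
static_assert(is_privileged_port(80), "80 < 1024");
static_assert(!is_privileged_port(swap_bytes16(80)), "20480 >= 1024");

// Port 7936 (0x1F00) is not privileged, but its swapped form (0x001F, or 31)
// looks as though it were.
static_assert(!is_privileged_port(7936), "7936 >= 1024");
static_assert(is_privileged_port(swap_bytes16(7936)), "31 < 1024");
```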
<p>So what should be done? Well, if we really intend to express a value
in network byte order, i.e. big endian, we are changing the semantics
of the information from “this is a 16-bit integer” to “this is a specific
sequence of two bytes, chosen for a reason.” Therefore, the return
value of <code>htons</code> should be an aggregate of two bytes.</p>
<p>Again, because C arrays decay to pointers and cannot be returned by
value, this is impossible to express
straight-forwardly in C, although a wrapper struct could be used.
C++ takes care of this by having a built-in wrapper struct for
arrays, namely <code>std::array</code>. The equivalent of <code>htons</code> would
not emphasize that the <code>uint16_t</code> is in the host order (which I
think is the wrong way of thinking about it), but would simply
indicate that we’re just storing this short in a big-endian
fashion (as opposed to the hardware-supported default storage
we can access with a <code>memcpy</code>):</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span>std<span style="color:#f92672">::</span>array<span style="color:#f92672"><</span><span style="color:#66d9ef">uint8_t</span>, <span style="color:#ae81ff">2</span><span style="color:#f92672">></span> store_short_as_big_endian(<span style="color:#66d9ef">uint16_t</span> value);
</span></span></code></pre></div><p>Rust already provides this as an alternative:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">impl</span> <span style="color:#66d9ef">u16</span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">pub</span> <span style="color:#66d9ef">const</span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">to_be_bytes</span>(self) -> [<span style="color:#66d9ef">u8</span>; <span style="color:#ae81ff">2</span>] {
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// ...
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Unfortunately for semantics, Rust still has the problematic
signature for <code>to_be</code>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">impl</span> <span style="color:#66d9ef">u16</span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">pub</span> <span style="color:#66d9ef">const</span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">to_be</span>(self) -> <span style="color:#66d9ef">u16</span> {
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// ...
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Perhaps this is due to efficiency reasons, or felt efficiency.
Programmers know that this byteswapped value should, for performance,
be stored in a single register. Programmers can feel more confident
that this is actually done if it remains a <code>u16</code> (or <code>uint16_t</code>) than
if it is transformed into an array of bytes, however semantically
inappropriate the <code>u16</code> is.</p>
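<p>For reference, the array-returning signature from earlier can be given a
body that never byteswaps at all. This is only a sketch of one possible
implementation: it uses shifts on the abstract value, so it does not depend on
the host's byte order. The inverse function,
<code>load_short_from_big_endian</code>, is a name I made up, by analogy with
Rust's <code>u16::from_be_bytes</code>.</p>

```cpp
#include <array>
#include <cstdint>

// One possible body for the store_short_as_big_endian declaration.
// Shifts operate on the abstract value, so no byteswap and no knowledge of
// the host's byte order is needed.
inline std::array<std::uint8_t, 2> store_short_as_big_endian(std::uint16_t value) {
    return {{static_cast<std::uint8_t>(value >> 8),      // most significant byte first
             static_cast<std::uint8_t>(value & 0xFF)}};  // least significant byte second
}

// Hypothetical inverse: rebuild the abstract value from its big endian bytes.
inline std::uint16_t load_short_from_big_endian(const std::array<std::uint8_t, 2>& bytes) {
    return static_cast<std::uint16_t>((bytes[0] << 8) | bytes[1]);
}
```

<p>Whether the returned array ends up in a single register is up to the
compiler and the ABI, which is exactly the uncertainty described above.</p>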
<p>However, if we are using a <code>u16</code> or <code>uint16_t</code> as an implementation layer
for what is in fact a way of storing two bytes in the opposite order
than the one that makes sense for our processor, if we are using it as
an implementation trick to do something semantically different from what
a <code>uint16_t</code> normally does, then we should at least make the type distinct
to give the maintenance programmer and compiler some ability to avoid
letting us do non-sensical things (like comparing the value using
<code>uint16_t</code>’s comparison operator).</p>
<p>Luckily, there is a design pattern for using the implementation of a
type, but applying different semantics to it: the newtype pattern. We
typically think of it as a Haskell or Rust thing, but we can use it in
C++ as well. I would argue that if we’re going to abuse <code>uint16_t</code>s
and friends in such a way, we should at least abstract it using the
newtype pattern. In C++, this would look something like this, assuming
a little endian computer:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span><span style="color:#66d9ef">template</span> <span style="color:#f92672"><</span><span style="color:#66d9ef">typename</span> T<span style="color:#f92672">></span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">big_endian</span> {
</span></span><span style="display:flex;"><span> T value;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">public</span><span style="color:#f92672">:</span>
</span></span><span style="display:flex;"><span> big_endian() <span style="color:#f92672">=</span> <span style="color:#66d9ef">default</span>;
</span></span><span style="display:flex;"><span> big_endian<span style="color:#f92672">&</span> <span style="color:#66d9ef">operator</span><span style="color:#f92672">=</span>(<span style="color:#66d9ef">const</span> big_endian<span style="color:#f92672">&</span>) <span style="color:#f92672">=</span> <span style="color:#66d9ef">default</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> big_endian(T in) {
</span></span><span style="display:flex;"><span> <span style="color:#f92672">*</span><span style="color:#66d9ef">this</span> <span style="color:#f92672">=</span> in;
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> big_endian<span style="color:#f92672">&</span> <span style="color:#66d9ef">operator</span><span style="color:#f92672">=</span>(T in) {
</span></span><span style="display:flex;"><span> value <span style="color:#f92672">=</span> std<span style="color:#f92672">::</span>byteswap(in);
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> <span style="color:#f92672">*</span><span style="color:#66d9ef">this</span>;
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">operator</span> <span style="color:#a6e22e">T</span>() {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> std<span style="color:#f92672">::</span>byteswap(value);
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>};
</span></span></code></pre></div><p>Adding appropriate <code>if constexpr</code> expressions to also support
big endian machines, and defining <code>std::byteswap</code> if you don’t
have it yet on your system is left as an exercise to the reader.</p>
<p>But it works on my (little endian) system:</p>
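<pre tabindex="0"><code>#include <array>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

// Assumes the big_endian class template above is in scope.
int main() {
    big_endian<uint16_t> be = 80;
    std::array<uint8_t, 2> be_bytes;
    memcpy(be_bytes.data(), &be, 2);
    printf("%04X\n", uint16_t(be));                      // prints 0050
    printf("%02X %02X\n", be_bytes[0], be_bytes[1]);     // prints 00 50
    return 0;
}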
</code></pre><p>I would much rather use this to represent “we want to store a value
in a register byte-swapped on some platforms” than a <code>uint16_t</code> with
no additional type information. You cannot accidentally run invalid
<code>uint16_t</code> operators on it, but you can convert it to a normal <code>uint16_t</code>
first and then use those operators. However, it does have a big endian
representation when stored, as indicated by the <code>memcpy</code>, and it can still
be stored in a single register.</p>
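<p>For readers on a toolchain without <code>std::byteswap</code>, here is a
rough sketch of the two missing pieces. Both names are hypothetical. Note that
I have swapped in a runtime check of the host's byte order rather than the
<code>if constexpr</code> approach, since that keeps the sketch compilable on
older compilers; a production version would want a compile-time check.</p>

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for std::byteswap (C++23), for unsigned integers only.
template <typename T>
T byteswap_sketch(T value) {
    static_assert(std::is_unsigned<T>::value, "sketch handles unsigned integers only");
    T result = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        // Move the lowest remaining byte of `value` onto the low end of `result`.
        result = static_cast<T>((result << 8) | (value & 0xFF));
        value = static_cast<T>(value >> 8);
    }
    return result;
}

// Runtime check of the host's byte order, assuming only big and little
// endian hosts exist.
inline bool host_is_little_endian() {
    const std::uint16_t probe = 1;
    std::uint8_t first;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// A value whose in-memory bytes are the big endian representation of `in`.
template <typename T>
T to_big_endian_storage(T in) {
    return host_is_little_endian() ? byteswap_sketch(in) : in;
}
```

<p>The assignment operator of the class above could then call something like
<code>to_big_endian_storage</code> in place of the unconditional byteswap.</p>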
<p>Even so, I would still not prioritize that ability to store it in a
single register in most situations. Using a <code>uint16_t</code> to store the
bytes swapped is still not remotely “storing a big endian value in a
<code>uint16_t</code>,” it is “storing a big endian representation in a <code>uint16_t</code>
so that when the processor writes that <code>uint16_t</code> little endian, we get
a big endian representation of the number we actually want.” It’s still
fundamentally a hack for performance, and while I’m comfortable with
it contained within the encapsulation of this <code>big_endian</code> class,
I would still rather actually write <code>std::array<uint8_t, sizeof(T)></code> as
the underlying storage type, unless the optimization is actually needed.
I actually would use a <code>big_endian</code> class that would look more like this:</p>
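<pre tabindex="0"><code>#include <array>
#include <stdint.h>   // uint8_t
#include <string.h>   // memcpy
#include <utility>    // std::swap

template <typename T>
class big_endian {
    // Bytes exactly as they should appear in memory: most significant first.
    std::array<uint8_t, sizeof(T)> be_representation;

    // Reverse the bytes in place (converts between little and big endian).
    static void swap_array(std::array<uint8_t, sizeof(T)> &arr) {
        for (auto it = arr.begin(), jt = arr.end() - 1;
             it < jt;
             ++it, --jt) {
            std::swap(*it, *jt);
        }
    }
public:
    big_endian() = default;
    big_endian& operator=(const big_endian&) = default;
    big_endian(T in) {
        *this = in;
    }
    // Copy the host (little endian) bytes of `in`, then reverse them.
    big_endian& operator=(T in) {
        memcpy(be_representation.data(), &in, sizeof(T));
        swap_array(be_representation);
        return *this;
    }
    // Reverse a copy back into host order and reinterpret it as a T.
    operator T() const {
        auto bytes_copy = be_representation;
        swap_array(bytes_copy);
        T out;
        memcpy(&out, bytes_copy.data(), sizeof(T));
        return out;
    }
};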
</code></pre><p>This now feels like I’m actually representing accurately what a big
endian representation is: a way of storing a number as a sequence
of bytes, rather than however the processor feels like storing it,
and certainly rather than as a value that the processor will store
as little endian, but which will store the value we actually want
to store as big endian. I won’t lie and say the optimizer will
make it equally performant, and if I needed to actually optimize
I would use the other version, but I feel like this version is hack-free.
(Again, it still only works on little endian platforms – fixing this
is again left as an exercise.)</p>
<p>This version has the added benefit of having an alignment of 1, which I
will argue later is more appropriate than using the underlying alignment
of <code>uint16_t</code>, <code>uint32_t</code>, etc.</p>
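<p>The alignment difference is easy to check. Here is a layout sketch with
two invented structs, one using a byte array for the 32-bit field and one
using a native <code>uint32_t</code>. The <code>static_assert</code>s reflect
what the mainstream implementations I know of do; the standard does not
strictly promise that <code>std::array</code> adds no padding, so treat them
as assumptions.</p>

```cpp
#include <array>
#include <cstdint>

// A byte-array member needs no padding, so the struct is exactly 6 bytes.
struct with_byte_array {
    std::uint8_t from_device;
    std::uint8_t to_device;
    std::array<std::uint8_t, 4> sequence_number;
};

// A native uint32_t member raises the struct's alignment, and compilers
// typically insert padding after the two uint8_t fields to satisfy it.
struct with_native_word {
    std::uint8_t from_device;
    std::uint8_t to_device;
    std::uint32_t sequence_number;
};

static_assert(alignof(with_byte_array) == 1, "byte arrays impose no alignment");
static_assert(sizeof(with_byte_array) == 6, "no padding between the fields");
static_assert(sizeof(with_native_word) >= sizeof(with_byte_array),
              "the native layout can only be the same size or larger");
```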
<h1 id="using-these-big-endian-types">Using These “Big Endian” Types</h1>
<p>This leads to a further question, however: When do we need to support
network byte order? Really, the only time is when generating messages
in wire format to send over the network. In C and C++, we generally
represent messages to be sent over the network as <code>struct</code>s.</p>
<p>For example, one can imagine a packet format with a 32-bit sequence
number. We would want to write <code>uint32_t</code> for this sequence number:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span>__attribute__((packed))
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">struct</span> <span style="color:#a6e22e">packet_wire_format</span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">uint8_t</span> from_device;
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">uint8_t</span> to_device;
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">uint32_t</span> sequence_number;
</span></span><span style="display:flex;"><span>};
</span></span></code></pre></div><p>However, of course, if it is in big endian byte ordering (as many
protocols are), we then have to call <code>htonl</code> when loading this value
in:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span>packet_wire_format packet;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">uint32_t</span> seq_num <span style="color:#f92672">=</span> current_seqnum<span style="color:#f92672">++</span>;
</span></span><span style="display:flex;"><span>packet.sequence_number <span style="color:#f92672">=</span> htonl(seq_num);
</span></span></code></pre></div><p>As I said before, I don’t like <code>htonl</code>. I certainly don’t like
using <code>uint32_t</code> as the type for <code>sequence_number</code>. So, we can do one of
two things:</p>
<ul>
<li>We can use a Rust-style function to convert to byte representation,
and use <code>std::array<uint8_t, 4></code> as the type of <code>sequence_number</code>.
This strikes me as equally awkward. We now know that we need to do
something other than just assign the value, but we don’t know what
that thing is, necessarily.</li>
<li>We can make the type more semantic, and use our <code>big_endian</code>
wrapper. This is the purpose for which I wrote it, and the use case where
its alignment of 1 makes sense: wire format structures
are often packed.</li>
</ul>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span>__attribute__((packed))
</span></span><span style="display:flex;"><span><span style="color:#75715e">// ^^ You may need to add this to `big_endian` as well,
</span></span></span><span style="display:flex;"><span><span style="color:#75715e">// or you may not need it at all now
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">struct</span> <span style="color:#a6e22e">packet_wire_format</span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">uint8_t</span> from_device;
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">uint8_t</span> to_device;
</span></span><span style="display:flex;"><span> big_endian<span style="color:#f92672"><</span><span style="color:#66d9ef">uint32_t</span><span style="color:#f92672">></span> sequence_number;
</span></span><span style="display:flex;"><span>};
</span></span></code></pre></div><p>Now, when we actually send it over the wire, we will cast or copy
this <code>packet_wire_format</code> to get the byte-by-byte representation,
and <code>sequence_number</code> will be in big endian, by the invariants
of our <code>big_endian</code> class. We will not need to remember to call
any function at all, as the class’s interface provides us with
only appropriate options:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span>packet_wire_format packet;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">uint32_t</span> seq_num <span style="color:#f92672">=</span> current_seqnum<span style="color:#f92672">++</span>;
</span></span><span style="display:flex;"><span>packet.sequence_number <span style="color:#f92672">=</span> seq_num; <span style="color:#75715e">// Performs conversion
</span></span></span></code></pre></div><p>The fewer mistakes you can make by accident, the better.
And of course, this has the additional advantage that the
type of the wire format is more self-documenting.</p>
<p>Similarly, if you read or write from the wire format using
read and write methods on a buffer type, those methods should
either be parameterized to take endian information along with
the values, or you can pass objects of type <code>big_endian</code>
as the value to be copied in: <code>big_endian<uint32_t></code> is just
as trivially-copyable as <code>uint32_t</code>.</p>
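<p>As a sketch of the first option, here is what a buffer type with
endian-parameterized methods might look like. Everything here
(<code>wire_buffer</code>, <code>byte_order</code>, <code>write_u32</code>,
<code>read_u32</code>) is hypothetical and not taken from any existing library.
The methods use shifts, so they behave the same on any host.</p>

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical names, for illustration only.
enum class byte_order { big, little };

class wire_buffer {
    std::vector<std::uint8_t> bytes_;
public:
    // Append a 32-bit value in the requested byte order, using shifts so the
    // result does not depend on the host's endianness.
    void write_u32(std::uint32_t value, byte_order order) {
        for (unsigned i = 0; i < 4; ++i) {
            unsigned shift = (order == byte_order::big) ? 8 * (3 - i) : 8 * i;
            bytes_.push_back(static_cast<std::uint8_t>(value >> shift));
        }
    }

    // Read a 32-bit value stored at `offset` in the given byte order.
    // The caller must ensure that offset + 4 does not exceed the buffer size.
    std::uint32_t read_u32(std::size_t offset, byte_order order) const {
        std::uint32_t value = 0;
        for (unsigned i = 0; i < 4; ++i) {
            unsigned shift = (order == byte_order::big) ? 8 * (3 - i) : 8 * i;
            value |= static_cast<std::uint32_t>(bytes_[offset + i]) << shift;
        }
        return value;
    }

    const std::vector<std::uint8_t>& bytes() const { return bytes_; }
};
```

<p>The caller has to name the byte order at every call site, which is
harder to forget than a separate <code>htonl</code> call, though the
<code>big_endian</code> wrapper still strikes me as the more self-documenting
choice.</p>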
<h1 id="conclusions-and-loose-ends">Conclusions and Loose Ends</h1>
<p>It is a little more awkward to write <code>big_endian</code> for Rust.
I would want to use the existing <code>to_be_bytes</code> method in the
implementation, and unfortunately that method is not in any
trait, as I’ve <a href="https://www.thecodedmessage.com/posts/endian_polymorphism/">complained about before</a>.
This can easily be remedied by writing our own trait, however,
or using external crates that already do so.</p>
<p>However, I wonder if maybe all of these languages should define types
that correspond to <code>uint16_t</code>, <code>uint32_t</code> etc, and just are defined to
store themselves in network byte order (and perhaps another one that
guarantees little endian order). After all, most processors support
byteswap instructions, that make writing a value as a byteswap an easy
operation. They could be optimized as normal values unless actually
written to memory – and only the optimizer knows when they’re actually
written to memory. They could even be written to memory in native
endianness unless there’s some defined way to get a byte-by-byte pointer
to them – and really only the optimizer knows that.</p>
<p>Endianness seems more a configuration on the natural types of the
programming language than it does something to be implemented on
top of these natural tools. These loops I’m using to do byteswaps
are surely not the most efficient way to do it (which is why the
non-array based implementation of <code>big_endian</code> is surely more
performant even if it is hackish), because processors have some
support for non-native endianness baked in. If a C++ vendor
provided types like <code>big_endian</code> (and perhaps some do, I’m sure I’ll
find out in the comments) it would surely be more performant.</p>
<p>But again, perhaps they should be primitive types. There’s some built-in
processor support for them, and only the optimizer knows when the
non-native endianness actually should be used.</p>
<p>I am too busy a person to do the research for such a proposal. I don’t
know if such a proposal exists. My interest here is simply in using the
tools I have to be a good programmer. For that, <code>to_be_bytes</code> and
my implementation of <code>big_endian</code> will simply have to suffice.</p>
Operating Systems: What is the command line?https://www.thecodedmessage.com/posts/command-line/2023-10-08T00:00:00+00:00<p>This is my newest post in my series about <a href="https://www.thecodedmessage.com/tags/operating-systems/">operating
systems</a>. Yes, it was last updated in 2019 –
I’m a hobbyist blogger. This is a post about the command line, a computer
topic, but it is for educating a non-technical (but tech-curious)
audience. Most of the programmers in my audience will already know
everything I have to say, and may be bored by some explanation of things
they already know, though I intend to discuss some technical details of
how computers work.</p>
<p>This is <strong>not</strong> a tutorial on how to use the command line on any
particular operating system. Rather, it is a discussion of the role that
a command line plays in a modern operating system and why some people
(including me) still use that kind of interface.</p>
<p>As I’ve <a href="https://www.thecodedmessage.com/posts/org4-desktop/">explained before</a>, I often use my computer
through the command line. It is a major part of but not the entirety of how
I interact with it. I do this so much that people looking at my computer
will assume I’m programming even when I’m not – even when I’m working on
my blog, or another writing project, or even just organizing my pictures.</p>
<p>Here is a screenshot of a command line session:</p>
<p><img src="https://www.thecodedmessage.com/cli.png" alt="Command Line Screenshot"></p>
<h1 id="graphical-user-interfaces">Graphical User Interfaces</h1>
<p>This is (as you likely know since you’re reading this on a website) no
longer the normal way to interact with computers. Nowadays, we usually
interact with computers through <em>graphical user interfaces (GUIs)</em>, and
many people take them for granted. We access applications<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> through
each having their own window – or, for web applications, we can combine
them into one window via browser tabs.</p>
<p>We navigate these applications with the mouse or touchpad, scrolling
and clicking to find our way through the document, right-clicking or
navigating menus to find further options, and occasionally interacting
with a “dialog box” to specify details. All features are expected to
be <em>discoverable</em>, that is to say, we expect to be able to find them
in a menu, a toolbar, a right-click menu, or by navigating the dialog
boxes we reveal through these other things. If we cannot discover
a feature by these mechanisms, we can reasonably assume the application
does not have this feature.</p>
<p>Here is LibreOffice Calc, a (somewhat old-fashioned) GUI program:</p>
<p><img src="https://www.thecodedmessage.com/lo-calc.png" alt="LibreOffice Calc Screenshot"></p>
<p>Nowadays, applications often run inside web browsers. This principle
of discoverability is still considered important. Here is Google Docs,
an application running inside a web browser:</p>
<p><img src="https://www.thecodedmessage.com/google-docs.png" alt="Google Docs Screenshot"></p>
<p>These are both mouse-navigated programs with discoverable features. For
both of these applications, there are many visible ways to interact
with them. If you want to find a feature, looking through what’s right
in front of you is the way to go.</p>
<h1 id="the-command-line-in-brief">The Command Line in Brief</h1>
<p>The command line works differently.</p>
<p>Nowadays, the command line is usually accessed via a window within
the context of a graphical <em>desktop environment</em><sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup>, but in the olden
days, people interacted with computers via dumb terminals that couldn’t
display images, just text<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>:</p>
<p><img src="https://www.thecodedmessage.com/dumb-terminal.png" alt="Dumb Terminal"></p>
<blockquote>
<p>“It was a dummy terminal, and I was a dummy user.”</p>
<ul>
<li>A member of the Baby Boomer generation describing what it was like to
be a person in a non-IT role using Unix in the 80s.</li>
</ul>
</blockquote>
<p>Instead of being able to find various features via menus visible
on the screen, you are instead given a <em>prompt</em>, an indication
of the current state of your session that is, well, prompting you
to tell the computer what to do, to give it a command:</p>
<p><img src="https://www.thecodedmessage.com/prompt.png" alt="Prompt"></p>
<p>You can then type your command, maybe a few more.</p>
<p><img src="https://www.thecodedmessage.com/cli.png" alt="Command Line Screenshot"></p>
<p>As you type commands, the output of the commands displays on the
subsequent lines. When you hit the bottom of the screen, the screen
scrolls up. Most terminal emulators let you scroll the window to see
earlier parts of the transcript. A command might also prompt for additional
input, or take full control of the terminal emulator and provide a different
type of (still text-based) interface entirely.</p>
<p>If you type a bad command, it is not very helpful:</p>
<p><img src="https://www.thecodedmessage.com/bad-command.png" alt="Bad commands"></p>
<p>There is no discoverability. There are no hints as to what commands might
be accepted. You can use the command line to find out more information
about what commands are accepted, but you have to know the commands to do
that. In practice, you have to learn a minimal set of commands from a book
(or nowadays, a website) before you can actually do anything productive.</p>
<p>It’s not intentionally user-unfriendly. For example, on Linux, there
are commands like <code>man</code> (for “manual”) that explain what commands do,
and commands like <code>apropos</code> to search for useful commands. Here is the
manual page for the <code>man</code> command itself:</p>
<p><img src="https://www.thecodedmessage.com/man.png" alt="Man Manual"></p>
<p>Additionally, once you know the name of a command or utility, you
can generally find out more about how to use it by passing <code>-?</code> or
<code>--help</code>:</p>
<p><img src="https://www.thecodedmessage.com/apt-help.png" alt="Apt Help"></p>
<p>Command lines are available on all modern operating systems for personal
computing: Windows, macOS, Linux, and certainly any other Unix you might
have running. They tend not to be available on mobile OSes.</p>
<h1 id="what-is-the-command-line-not">What is the command line not?</h1>
<p>Before we talk about what this is for, and why modern operating systems
still support this decidedly old-fashioned way of interacting with them,
I want to dispel some myths and misconceptions about the command line,
specifically two opposite misconceptions that seem to still be common
amongst the computer laity.</p>
<blockquote>
<p><strong>Misconception One:</strong> The command line is literally DOS, the
Microsoft operating system from the 80’s and early 90’s.
It is there to support old programs from the 80’s and early 90’s,
and exists solely for the support of obsolete and obsolescent software.</p>
</blockquote>
<p>This misconception is common among Windows users, because it used to be
true. Until Windows XP, Windows still came bundled and intertwined with
a version of Microsoft’s older, fully command-line operating system,
DOS. Old DOS programs were still in common use, and people needed a way
to run them, so they could run a copy of DOS inside a window.</p>
<p>It’s not true anymore, however. Windows is no longer a chimera of DOS
and more modern components. Since Windows XP, both the consumer and
business versions of the Windows brand have been versions of <a href="https://en.wikipedia.org/wiki/Windows_NT"><em>Windows
NT</em></a>, a different operating
system from earlier consumer versions of Windows, one originally targeted
at business users, with no DOS code in it at all.</p>
<p>On a modern Windows computer, the command line is not primarily for
DOS programs. The ability to run DOS programs isn’t even shipped with
Windows by default anymore, but the command line still is. The confusion
is understandable, because the command line still <em>looks like</em> the DOS
command line. The prompt is still a form of DOS’s famous <code>C:\></code>.</p>
<p>What is the command line for, then? It is for running modern Windows
programs that happen to be designed to be used from the command
line. Windows comes with a bunch of such programs, for things like systems
and network administration.</p>
<p>There are a bunch more that you can download and install, usually tools
written by computer professionals for other computer professionals. Many
of these command line programs were written primarily for Linux and
other Unix OSes, but also have Windows versions.</p>
<p>We will go into specific examples of command line programs in a later
section, but the important thing to know is that a command line program
has access to all the same system libraries and capabilities that any
Windows (or Linux, or macOS) program can access. It can play audio,
connect to the Internet, and do pretty much anything – anything except
draw a new window on the screen, not because it can’t, but because that
would make it not a command line program anymore.</p>
<p>But I don’t want to go too over-the-top rebutting this first misconception,
because then I might lead you to believe the second misconception.</p>
<blockquote>
<p><strong>Misconception Two:</strong> Not only can you do anything from the command
line that you can do from a graphical user interface, but the command
line is fundamentally closer to the operating system. When graphical
programs run, they are using the command line under the hood.</p>
</blockquote>
<p>This is not true.</p>
<p>It should be obvious that there is at least one thing you can do
from a graphical user interface that you can’t do from the command
line, which is to display graphics. The command line is an interface
based fundamentally on displaying a grid of text. Thanks to <a href="https://xkcd.com/1953/">modern
Unicode</a>, “text” now includes “emojis,” but it
does not include images or high-quality charts and graphs.</p>
<p>But even with that overly-obvious caveat aside, yes, it is true that
anything a graphical program can do besides show graphics could be done
by a command line program as well. There are command line programs that
manipulate images, they just don’t show the images as they manipulate
them. There are command line programs that pretend to be web browsers
and scrape data off of the websites when they load. All the operating
system features and computer resources that graphical programs have at
their disposal, command line programs will generally have too, besides
(by definition) actually doing graphical displays and interactions.</p>
<p>However – and this is a big however – just because a command line
program could exist to do everything a graphical program does,
doesn’t mean that you have that program installed on your system,
or that someone’s even ever written that program. The capabilities
of your computer depend on what software you have installed, and what
software you can install depends on what software people have written.
If someone creates a file format, but only writes a GUI program
to edit it, well, then, until someone reverse-engineers it, that file
format will only be editable via GUI. Similarly if they only create
command line tools – that file format will then only be accessible
by command line.</p>
<p>For example, someone with <a href="https://imagemagick.org/index.php">ImageMagick</a>
installed on their computer but not Photoshop may only be able to do image
manipulation from the command line. Someone with Photoshop installed but
not ImageMagick may only be able to do image manipulation from the GUI.
There is nothing intrinsically more powerful about either interface.</p>
<p>Specifically, GUI programs are decidedly not wrappers around command
line utilities. You could write a GUI program that way (and there
are a couple that are), but the vast majority do not in fact do this.
Just as command line programs have access to all the same computer resources
and operating system functionality that GUI programs do, it also works
the other way around. GUI programs and command line programs both are
written in programming languages that allow the program to invoke
operating system functionality through system libraries and system calls.
These calls are not at all the same as command line commands, and
the GUI doesn’t need to use the command line as an intermediate layer.</p>
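<p>For the programmers in the audience, here is a minimal Rust sketch of
what “invoking operating system functionality through system libraries”
looks like. The function name <code>list_directory</code> is made up for
this example. It lists a folder by asking the operating system directly
through the standard library, without running any command line command such
as <code>ls</code> or <code>dir</code>. A GUI file manager and a command line
tool could both be built on a call like this.</p>

```rust
use std::fs;
use std::io;
use std::path::Path;

/// List the names in a directory by asking the operating system directly,
/// through the standard library (which wraps the underlying system calls).
/// No shell and no `ls` or `dir` command is involved.
fn list_directory(dir: &Path) -> io::Result<Vec<String>> {
    let mut names = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        names.push(entry.file_name().to_string_lossy().into_owned());
    }
    // Directory order is unspecified, so sort for stable output.
    names.sort();
    Ok(names)
}

fn main() -> io::Result<()> {
    // Print the contents of the current directory, one name per line.
    for name in list_directory(Path::new("."))? {
        println!("{name}");
    }
    Ok(())
}
```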
<p>If there is a GUI version and a command line version of the same
functionality, maybe this is implemented as the GUI version launching the
command line version under the hood – that is certainly something GUI
programs <em>can</em> do, and it might make sense if the command line version is
the interface most people use and that most maintainers are interested
in. But it is just as likely if not more likely to be implemented by
the GUI program and the command line both using the same common library.</p>
<p>And certainly, GUI-only programs like web browsers, e-mail clients,
and office suites do not by any means implement their functionality by
wrapping command line programs. There is no command line version of
or interface to Photoshop, nor of Microsoft Word<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup>.</p>
<p>And just like it’s possible to have an operating system with
a command line and no graphical user interface, it is possible
to have an operating system with a graphical user interface and no
command line, not even internal analogues of it.</p>
<h1 id="history-of-the-command-line">History of the command line</h1>
<p>As I said before, computers used to be frequently accessed
via dumb terminals. Before this, they were accessed by
<a href="https://en.wikipedia.org/wiki/Teleprinter">teletypewriters</a>. This was
literally a typewriter, where the keys you entered went to the computer,
and the computer’s responses were typed on the paper.</p>
<p>Modern command lines mostly follow that pattern – new input goes in
at the bottom of the window, and the window scrolls like a piece of paper
receding from the typewriter. But on a modern command line, the program
can also take over the entire terminal emulator window, as long as what
it wants to draw can be expressed as text. They even support multiple
colors.</p>
<p>Most command line systems used today, like most operating systems used
today, descend from the Unix tradition, which dates back to around 1970. The exception
is Windows – even though the Windows command line is not DOS, it takes
many of its aesthetic principles from DOS, not only the famous prompt
<code>C:\></code>, but also its habit of taking options with <code>/</code>, where Unix and
friends use <code>-</code>.</p>
<h1 id="what-are-some-modern-command-line-programs">What are some modern command line programs?</h1>
<ul>
<li><a href="https://git-scm.com/"><code>git</code></a> keeps track of different versions of
a large folder (called a <em>repository</em>) full of code or other forms of
(mostly) text, and allows changes to be merged and reconciled between
different authors. While there are GUI and web wrappers around it,
the flagship program is a command line utility.</li>
<li><a href="https://www.openssh.com/"><code>ssh</code></a> lets you log into a command line
interface of another computer, usually a server. This is often the only
way to log into and administrate the server, as Linux servers generally
don’t have any GUI capabilities or GUI programs installed.</li>
<li><a href="https://imagemagick.org/index.php">ImageMagick</a> lets you manipulate
images.</li>
<li>Last but not least, there are many small programs that let you do
basic file management, searching, and editing. Two of my favorite
new ones are <a href="https://github.com/BurntSushi/ripgrep">RipGrep</a>
by <a href="https://blog.burntsushi.net/">Andrew Gallant</a> (which lets you
search for strings or patterns in text files)
and <a href="https://github.com/sharkdp/fd">fd</a> by <a href="https://david-peter.de/">David
Peter</a> (which lets you search for files by
name or other properties).</li>
</ul>
<h1 id="why-use-the-command-line">Why use the command line?</h1>
<p>If you are new to a tool, discoverability is an important
feature. If you are experienced with a tool, all the hints of
where to find things are more distractions than they are useful.</p>
<p>As someone who needs all the focus that I can get<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup>, distractions
are bad. And so are extra steps: Why spend the time moving the mouse
around to access one menu, then another, when on the command line,
I can just type the command I already know for what I need to do?</p>
<p>Additionally, the command line is designed to save on extra typing.
Generally, most modern command lines support “tab completion,” where you
can type the beginning of the command, or a file that it’s operating on,
and press the <strong>[TAB]</strong> key, and the command line interpreter will complete
the word for you – or list the possibilities if there are multiple.</p>
<p>For a newbie, it might be intimidating, but for someone who’s used to
it, it stays out of your way and lets you get stuff done – while showing
you a detailed transcript of what you’ve been doing, in case you forget
what exactly it was you were trying to do.</p>
<p>Command lines are even more important on the server. While Windows servers
come with a graphical user interface you can remotely log into, Unix<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup>
servers generally don’t. It’s more efficient to just allow administrators
a command line interface – and for most server administrators, it’s
quite enough.</p>
<p>And while command lines are not closer to the operating system in a
deep technological sense, they are closer to the operating system
by convention. They tend to have all the options that a power user
would want – and easy ways to specify them, rather than hiding them
behind multiple warning signs and buttons labelled “Advanced….”</p>
<p>Last but not least, if you have a series of GUI actions that you often
do, you usually have to just keep doing them, even if it’s very tedious.
Precious few programs let you do something like write a shortcut key
for five menu commands. On the command line, however, you can use
aliases or scripts, where a short command stands for a long command,
or a single command stands for a whole sequence of commands. You just
put into a file the same text you would type at the prompt.</p>
<h1 id="how-does-the-command-line-actually-work">How does the command line actually work?</h1>
<p>Generally, a terminal emulator or command line window has a process
running in it, called a <em>shell</em>, that presents the prompt (<code>C:\></code> or similar on Windows,
normally something ending with <code>$</code> on Unix). It then takes in the
command, takes the first word, and runs that as a program.
This program is launched as a separate process, just like clicking
on a program icon launches a separate process in a graphical user
interface. The shell waits in the background for the process to finish,
and then presents a new prompt. On a modern multitasking operating
system, the shell generally also allows you to run commands in the
background, and use key combinations (Ctrl-Z on Unix) to suspend a
running process, and commands like <code>bg</code> and <code>fg</code> to resume it in the
background or bring it back to the foreground. This allows you to run
multiple programs at once within the terminal.</p>
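<p>The loop described above can be sketched in a few lines of Rust. This is
a toy, not how any real shell is implemented: it assumes commands are split
on whitespace only, with no quoting, pipes, or background jobs. The
<code>parse_command</code> helper is a made-up name for this example.</p>

```rust
use std::io::{self, BufRead, Write};
use std::process::Command;

/// Split a command line into the program name (the first word) and its
/// arguments. Real shells also handle quoting, variables, pipes, and so on;
/// this sketch only splits on whitespace.
fn parse_command(line: &str) -> Option<(&str, Vec<&str>)> {
    let mut words = line.split_whitespace();
    let program = words.next()?;
    Some((program, words.collect()))
}

fn main() -> io::Result<()> {
    let stdin = io::stdin();
    loop {
        // Present the prompt.
        print!("$ ");
        io::stdout().flush()?;

        // Read one line; stop at end of input (Ctrl-D on Unix).
        let mut line = String::new();
        if stdin.lock().read_line(&mut line)? == 0 {
            break;
        }

        // An empty line just gets a fresh prompt.
        let Some((program, args)) = parse_command(&line) else {
            continue;
        };
        if program == "exit" {
            break;
        }

        // Launch the program as a separate process and wait for it to
        // finish before showing the next prompt.
        match Command::new(program).args(&args).status() {
            Ok(status) if !status.success() => eprintln!("{program}: {status}"),
            Ok(_) => {}
            Err(err) => eprintln!("{program}: {err}"),
        }
    }
    Ok(())
}
```

<p>Typing a command that does not exist makes <code>Command::new</code> fail,
and the sketch prints the error and prompts again, much like the unhelpful
response to a bad command in a real shell.</p>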
<p>On Linux, when a program starts, it conventionally has three
open files, 0, 1, and 2, for input, output, and error, respectively.
On the command line, by default (for it is configurable), these
all correspond to the terminal: input is read in from the keyboard
on the terminal (by default line by line), and output and errors
are outputted to the terminal. GUI programs will have these three
files open when they start too, but unless they’re started from the
terminal, the output will normally just silently be ignored.</p>
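<p>Here is a small Rust sketch of a program that uses all three of these
files. The function name <code>shout</code> and its behavior (upper-casing
lines and warning about empty ones) are invented purely for illustration.
The streams are passed in as parameters, so the same code works whether they
point at the terminal, a file, or something else.</p>

```rust
use std::io::{self, BufRead, Write};

/// Copy each input line to `output` in upper case, and report empty lines
/// on `errors`. Taking the streams as parameters keeps the function
/// independent of where they actually point.
fn shout<R: BufRead, W: Write, E: Write>(
    input: R,
    mut output: W,
    mut errors: E,
) -> io::Result<()> {
    for line in input.lines() {
        let line = line?;
        if line.trim().is_empty() {
            writeln!(errors, "warning: skipping empty line")?;
        } else {
            writeln!(output, "{}", line.to_uppercase())?;
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // File descriptor 0 is standard input, 1 is standard output, and 2 is
    // standard error; Rust exposes them through these handles.
    shout(io::stdin().lock(), io::stdout().lock(), io::stderr().lock())
}
```

<p>Run from a terminal with no redirection, all three streams are the
terminal itself. A Unix shell can point them elsewhere, for example with
<code>&lt; input.txt</code>, <code>&gt; output.txt</code>, and
<code>2&gt; errors.log</code>, and the program does not need to know the
difference.</p>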
<p>The program can also draw a window, if a graphical environment is
available. On Linux, it is easy for the same program to have a command
line interface, and a graphical interface – sometimes at the same time.
This is useful if it’s mostly used from the command line, but sometimes
also wants to do things like show a chart or graph that can be generated.</p>
<p>macOS and Windows have more complicated GUI frameworks that make a GUI
application more different in structure from a command line application,
but you can still launch GUI applications from the command line.</p>
<h1 id="footnotes">Footnotes</h1>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>An
<a href="https://en.wikipedia.org/wiki/Application_software"><em>application</em></a>
is just a computer program that does a task besides making the computer
system work as a whole, a task interesting to the user. Examples include
word processors, spreadsheets, chat apps, and video games. It’s not
so much a rigorous technical term as an amorphous category of software. <a href="#fnref:1" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>
<li id="fn:2">
<p>A desktop environment, also known as a graphical shell, is
a graphical user interface for managing the windows you have open,
and providing computer-wide menus for launching applications. It also
controls the root window, which is what you see when you have no windows
open, normally used for shortcuts and files you’re currently working
on. Windows and macOS both provide their own desktop environments,
which generally aren’t mentioned by name – they are just part of
the operating system. Linux and most other Unixes, when they have
graphical interfaces at all, can be used with a variety of different
desktop environments. <a href="#fnref:2" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>
<li id="fn:3">
<p>This image is taken from <a href="https://commons.wikimedia.org/wiki/File:DEC_VT100_terminal_transparent.png">Wikimedia
Commons</a>.
It is by Jason Scott, and available under <a href="https://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA
4.0</a>. It was modified
by the Wikimedia poster by removing the background. <a href="#fnref:3" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>
<li id="fn:4">
<p>Oddly enough, most web browsers support running without the browser
window actually being displayed, in a headless mode. This is generally not
usable purely from the command line, but in the context of being wrapped
in a larger program (which might be a command line program). Additionally,
Microsoft Word and Photoshop can be programmatically controlled – they
are both <em>scriptable</em> – but as far as I know neither Microsoft nor Adobe
have chosen to provide a command line interface to this functionality,
even though they could. Again, it’s about what’s actually available on
your computer. <a href="#fnref:4" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>
<li id="fn:5">
<p>It has been said that I have a <a href="https://www.thecodedmessage.com/tags/adhd/">deficit of attention</a>. <a href="#fnref:5" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>
<li id="fn:6">
<p>I use Unix in a broad sense to include Unix-like operating
systems like Linux and the BSDs, even if they aren’t Unix in a
trademark sense. <a href="#fnref:6" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>
</ol>
</div>
Can computers think things?https://www.thecodedmessage.com/posts/can-computers-think-things/2023-09-30T00:00:00+00:00<p>This blog post isn’t about ChatGPT. It
isn’t about machine learning, neural nets, or any
<a href="https://www.smbc-comics.com/comic/conscious-6">mysterious</a> or border-line
<a href="https://www.smbc-comics.com/comic/2009-10-08">spiritual</a> form of
computing. That’s a whole ’nother set of philosophical and metaphysical
conundrums (<a href="https://en.wiktionary.org/wiki/conundra">conundra</a>?).</p>
<p>This is about a way people sometimes speak, informally, about bog-standard
boring non-AI computers and computer programs. You’ve probably
heard people speak this way. You’ve probably spoken this way sometimes
yourself:</p>
<ul>
<li>“The server thinks your password is wrong.”</li>
<li>“The computer thinks you’ve lost the connection.”</li>
<li>“The phone thinks you want to use your headphones. It’s wrong though.”</li>
</ul>
<p>We normally interpret this as a metaphor, but I’m not sure it is.
Is the phone “thinking” you want to use your headphones rather than
your car speaker substantially different from us “thinking”
our friend would rather get a phone call than a text message?</p>
<p>Part of the problem here is that the word “think” in English can
mean different things.</p>
<p>It can mean to cogitate, to go through a rational series of propositions
in our brains, expressed as internalized speech in our mind’s ear or
diagrams in our mind’s eye or pure abstractions. “I am thinking about
how to approach this physics problem.” Computers probably cannot do this,
and certainly are nowhere near as good at it as humans are, not even with
this fancy new AI software everyone’s playing with.</p>
<p>But it can also mean to have a belief, a mental model about reality.
“I think Joe doesn’t like me very much.” Or, “I think the reason
the car won’t start is because the battery is dead.” Computers,
I will argue, can do something remarkably similar to humans in this
category.</p>
<p>Some languages distinguish these two meanings of “think.” English learners
of German often say <em>denken</em> (to cogitate), when they mean <em>glauben</em>
(to believe), in contexts where both would translate as “to think.” And
then, in case that was too simple, there’s also <em>meinen</em>, which means
“to suppose” or “to opine,” also used when English speakers might say
“to think.”</p>
<p>So here’s my thought on this, or rather, my opinion (<em>meine Meinung</em>):</p>
<blockquote>
<p>Computers cannot yet <em>denken</em>, or cogitate, like humans. But computers
can definitely <em>glauben</em>, or internally believe, specific facts, and
they’ve been able to do that since the day they were invented.</p>
</blockquote>
<p>In order to figure out whether this is true, we first need to establish
what it means to believe something, and then see if computers can do it.
What does it mean for humans to think something, to believe something
about the world? Can we extract a definition that can then be applied
to computers, to see whether computers are capable of the same thing?</p>
<p>So, what does it mean for us to think something is true? Well, it means
that we have some internal state, some internal information stored in
the physical arrangement of our brains, that corresponds to that thought
or belief. We then use that internal state to inform our behavior. If we
think our friend would rather get a phone call than a text message, then
we might choose to accommodate that and call them instead of texting them.</p>
<p>This internal state, when all is going well, corresponds to a specific
external reality. The goal is for the internal state to match the
external reality. Sometimes this goal is not met – sometimes we
misapprehend the situation, our belief is wrong, or what we think is true
is not true. But if we are wrong, we have the same internal state as
we would have if we were right, and things were working.</p>
<p>We can therefore define believing or thinking that a proposition X is true
thus:</p>
<blockquote>
<p>A being believes X is true if they have an internal state that,
when the being is functioning correctly, corresponds to X being true,
that then informs their behavior such that it is the behavior that
makes sense if X is true, rather than the behavior that makes sense
if X is not true.</p>
</blockquote>
<p>Applied to the phone example, we have some internal state in our brain that
indicates that “Jill would rather get a call than a text.” How do we
know that the state indicates that proposition? Well, we know that when
our brains are functioning correctly (a hard thing to define, but also
a concept everyone uses all the time), we only have that internal state
when the proposition is true. And, we also know that this internal state
drives behavior consistent with that proposition being true. Assuming we
want to accommodate Jill’s preference, we will call her instead of texting
her, an adaptive decision if the belief is true, and a non-adaptive one
if the belief is false.</p>
<p>With this framework, it seems almost easier to establish that computers
can think something is true than that humans can do this. Humans often
have complicated, ambivalent beliefs and thoughts. Humans will often
believe something for reasons other than an efficient assessment of its
truth value, and act contrary to their own earnestly held beliefs. I think
this definition still works for humans, if you take all the confounding
factors into consideration, but it’s hard: We get into things like
“conscious” or “subconscious” beliefs, or “he says he thinks X, but
his actions show he really thinks Y.” And, of course, it’s extremely
difficult to define whether a human is “functioning correctly.”</p>
<p>With computers, however, they think all sorts of things.
For example, let’s talk about whether a computer thinks a
user has administrator privileges. You might see code like
this:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">let</span> has_admin_privileges: <span style="color:#66d9ef">bool</span> <span style="color:#f92672">=</span> is_admin(conn.get_current_user());
</span></span></code></pre></div><p>Now, we have an internal state in the computer, a boolean (i.e. true
or false) variable that is intended to correspond to whether the user
has administrator privileges. If the code is functioning correctly,
this variable will take on the value <code>true</code> exactly when the user
really does have administrator privileges. We know this, because
the definition of “functioning correctly” is implicit in the way the
programmer wrote the code, and how they named the variable.</p>
<p>Furthermore, the lines of code that follow almost certainly implement
behavior in line with that interpretation of the internal state.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">if</span> has_admin_privileges {
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// Do the thing
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> requested_task.perform()<span style="color:#f92672">?</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// Signal success
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> Ok(())
</span></span><span style="display:flex;"><span>} <span style="color:#66d9ef">else</span> {
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// Signal an error
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> Err(Error::AccessDenied)
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>So, when people say things like “my phone thinks I want to use
my Bluetooth headphones,” it means that there is information encoded
in the silicon of the phone, possibly in an explicitly-named
variable, that corresponds to that belief.</p>
<p>So now that I’ve thought this through properly, I don’t even think
statements like this are metaphorical. I think they are literally true,
and completely appropriate.</p>
Verbal Tics https://www.thecodedmessage.com/posts/verbal-tics/ 2023-08-31T00:00:00+00:00
<p>I remember hearing an idea once – I’d like to cite it, but proper
citation seems difficult, as I heard it from an acquaintance, and
Mr. Google isn’t being his usual helpful self. The idea was, different
politicians have these verbal tics, these filler catch-phrases,
that indicate their deepest conversational anxieties.</p>
<p>For President Obama, it’s “let me be clear.” According to this
thesis, he is really concerned about being unclear, and this
tic is so prominent in his speech that it shows that his
biggest anxiety is coming across as insufficiently clear about something,
as waffling, or as evading the deep issue underlying all the petty
concerns. And as an American paying some amount of attention,
this made sense to me.</p>
<p>For President Trump, the tic under discussion (for he has many)
was “believe me.” President Trump was concerned about being called
out as a liar, because he was.</p>
<p>And when this discussion came up, I realized that my biggest
verbal tic in conversation was “if that makes sense,” or the
question form, “does that make sense?” And I realized that
I did have an anxiety underlying this verbal tic, a deep
suspicion that everything I’m saying is so befuddled and so
indirectly and subtly put that it doesn’t make sense to the
listener.</p>
<p>Does that make sense?</p>
My Dream C++ Additions https://www.thecodedmessage.com/posts/c++-additions/ 2023-08-30T00:00:00+00:00
<blockquote>
<p><strong>UPDATE</strong>: I have updated this post to address C++ features that
address these issues or have been purported to.</p>
</blockquote>
<p>I have long day-dreamed about useful improvements to C++. Some of these
are inspired by Rust, but some of these are ideas I already had before
I learned Rust. Each of these would make programming C++ a better experience,
usually in a minor way.</p>
<h1 id="explicit-self-reference-instead-of-implicit-this-pointer">Explicit <code>self</code> reference instead of implicit <code>this</code> pointer</h1>
<blockquote>
<p><strong>UPDATE:</strong> This is coming out in C++23, and they did it right!
I’m excited! Good job C++!</p>
<p>I admit I haven’t been paying close attention to C++ post C++14.
C++17 was up-and-coming and I hadn’t finished learning everything
I wanted to about it when I left C++ programming.
And I refuse to be embarrassed for not knowing about a feature in
a programming language that is not my favorite before any compiler
even supports it.</p>
<p>But I am indeed excited for them! This is a substantial improvement
I have wanted since well before C++11 came out. They’ve done it
pretty close to how I wished for it here, and they have good
reasons for how they made it.</p>
</blockquote>
<p>There are a few weird parts of <code>this</code>.</p>
<p>For one, it is a pointer, but it is never allowed to be null, and it
cannot be modified to point to a different object. In both of these
ways, it behaves more like a reference than a pointer.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Foo</span> {
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">public</span><span style="color:#f92672">:</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">void</span> bar() {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">this</span> <span style="color:#f92672">=</span> <span style="color:#66d9ef">new</span> Foo{}; <span style="color:#75715e">// Error
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> }
</span></span><span style="display:flex;"><span>};
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">int</span> <span style="color:#a6e22e">main</span>() {
</span></span><span style="display:flex;"><span> Foo <span style="color:#f92672">*</span>foo <span style="color:#f92672">=</span> <span style="color:#66d9ef">nullptr</span>;
</span></span><span style="display:flex;"><span> foo<span style="color:#f92672">-></span>bar(); <span style="color:#75715e">// Undefined behavior
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>}
</span></span></code></pre></div><p>For another, when we want to put a modifier on <code>this</code>, like <code>const</code>
or <code>volatile</code>, there is nowhere obvious in the function signature to
put it. We have to put it awkwardly after the parameters, before the
<code>;</code> or <code>{</code>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Foo</span> {
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">public</span><span style="color:#f92672">:</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">void</span> bar() <span style="color:#66d9ef">const</span> <span style="color:#66d9ef">volatile</span> <span style="color:#f92672">&&</span> {
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// Do stuff
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> }
</span></span><span style="display:flex;"><span>};
</span></span></code></pre></div><p>Oddly enough, whether the object is taken as an lvalue or as an rvalue can also be
specified (the <code>&&</code> above), which would make way more sense for a reference parameter
than for a pointer.</p>
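<p>For reference, here is a minimal sketch of those trailing qualifiers in today’s C++ (C++11 and later). The class <code>Foo</code> and the strings it returns are purely illustrative; the point is only that overload resolution picks a different <code>bar</code> depending on whether the object is an lvalue or an rvalue, just as it would for an ordinary reference parameter:</p>

```cpp
#include <string>
#include <utility>

class Foo {
public:
    // Selected when the object expression is an lvalue (const or not).
    std::string bar() const & { return "called on an lvalue"; }

    // Selected when the object expression is an rvalue.
    std::string bar() && { return "called on an rvalue"; }
};
```

<p>With these overloads, <code>foo.bar()</code> on a named object selects the first, while <code>Foo{}.bar()</code> or <code>std::move(foo).bar()</code> selects the second.</p>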
<p>The modifiers have to go in this odd location because <code>this</code> is
implicit. This is in line with OOP ideology and theory, but in my mind,
it’s just a negative. If you have to think about whether it’s <code>const</code>
or taken by rvalue anyway when writing the signature, why put those
modifiers somewhere you might forget about, instead of right with
the declaration of the parameter?</p>
<p>I would change the syntax to fix both of these issues in one fell swoop:
allow an explicit <code>self</code> as an alternative to implicit <code>this</code>, and
make it a reference:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Foo</span> {
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">public</span><span style="color:#f92672">:</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">void</span> bar(<span style="color:#f92672">&</span>self) {
</span></span><span style="display:flex;"><span> self.baz();
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">void</span> <span style="color:#a6e22e">baz</span>(<span style="color:#66d9ef">volatile</span> <span style="color:#66d9ef">const</span> <span style="color:#f92672">&</span>self) {
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// Do stuff
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> }
</span></span><span style="display:flex;"><span>};
</span></span></code></pre></div><p>The type would still be implicit, but modifiers can be specified where
the type would be. You would also only be able to take by reference or
rvalue reference, and never by value, because implicit copy on method
call would be a new feature of questionable value. It would not conflict
with existing code, as a parameter named <code>self</code> without an explicit type
would be illegal under the current syntax.</p>
<p>Of course, this looks rather similar to Rust’s syntax, but believe it or
not, I had this idea long before I learned that Rust does <code>self</code> in this
way.</p>
<h1 id="a-new-byte-type-for-uint8_t-and-int8_t">A new <code>byte</code> type for <code>uint8_t</code> and <code>int8_t</code></h1>
<p>In C++, the type we use for an individual byte of data, by definition,
is <code>char</code>. This is the definition of <code>char</code> in the standard, and while
the byte length (<code>CHAR_BIT</code>) doesn’t have to be 8 bits, other standard
provisions and practical considerations mean that on a modern platform,
it always is.</p>
<p>We might use <code>uint8_t</code> or <code>int8_t</code> for bytes in practical code, but
these are defined as <code>typedef</code>s to <code>unsigned char</code> and <code>signed char</code> –
I don’t know whether this is required by the standard but it is always
done in practice.</p>
<p>However, <code>char</code> is also the type we use for text data, so it is a type
with two different contrasting (perhaps even contradictory) sets of
semantics.</p>
<p>That leads to many odd results, including the fact that <code>char</code>
cannot represent all Unicode characters because it has to be 1 byte
long. But the one I want to focus on today is a bit weirder.
What does this code print?</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span><span style="color:#75715e">#include</span> <span style="color:#75715e"><cstdint></span><span style="color:#75715e">
</span></span></span><span style="display:flex;"><span><span style="color:#75715e">#include</span> <span style="color:#75715e"><iostream></span><span style="color:#75715e">
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">struct</span> <span style="color:#a6e22e">message_data</span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">uint8_t</span> message_type;
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">uint8_t</span> message_length;
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">uint8_t</span> data[<span style="color:#ae81ff">1</span>];
</span></span><span style="display:flex;"><span>};
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">void</span> <span style="color:#a6e22e">print_message_hdr</span>(message_data <span style="color:#f92672">&</span>mesg) {
</span></span><span style="display:flex;"><span> std<span style="color:#f92672">::</span>cout <span style="color:#f92672"><<</span> <span style="color:#e6db74">"Type: "</span> <span style="color:#f92672"><<</span> mesg.message_type <span style="color:#f92672"><<</span> std<span style="color:#f92672">::</span>endl;
</span></span><span style="display:flex;"><span> std<span style="color:#f92672">::</span>cout <span style="color:#f92672"><<</span> <span style="color:#e6db74">"Length: "</span> <span style="color:#f92672"><<</span> mesg.message_length <span style="color:#f92672"><<</span> std<span style="color:#f92672">::</span>endl;
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">int</span> <span style="color:#a6e22e">main</span>() {
</span></span><span style="display:flex;"><span> message_data data;
</span></span><span style="display:flex;"><span> data.message_type <span style="color:#f92672">=</span> <span style="color:#ae81ff">100</span>;
</span></span><span style="display:flex;"><span> data.message_length <span style="color:#f92672">=</span> <span style="color:#ae81ff">0</span>;
</span></span><span style="display:flex;"><span> print_message_hdr(data);
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> <span style="color:#ae81ff">0</span>;
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Well, if you thought the numbers <code>100</code> and <code>0</code> would show up on the
output, you’d be wrong. <code>std::cout</code>’s <code>operator<<</code>’s <code>char</code> overloads
are triggered, and so these fields, clearly meant as integers,
are printed as text:</p>
<pre tabindex="0"><code>[jim@palatinate:~]$ c++ -std=c++11 test.cpp
[jim@palatinate:~]$ ./a.out
Type: d
Length:
[jim@palatinate:~]$
</code></pre><p>In order to get the integer print-outs we want, we have to override
this strange default behavior, perhaps by casting the values to
<code>uint16_t</code> before printing them:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span><span style="color:#66d9ef">void</span> <span style="color:#a6e22e">print_message_hdr</span>(message_data <span style="color:#f92672">&</span>mesg) {
</span></span><span style="display:flex;"><span> std<span style="color:#f92672">::</span>cout <span style="color:#f92672"><<</span> <span style="color:#e6db74">"Type: "</span> <span style="color:#f92672"><<</span> <span style="color:#66d9ef">uint16_t</span>(mesg.message_type) <span style="color:#f92672"><<</span> std<span style="color:#f92672">::</span>endl;
</span></span><span style="display:flex;"><span> std<span style="color:#f92672">::</span>cout <span style="color:#f92672"><<</span> <span style="color:#e6db74">"Length: "</span> <span style="color:#f92672"><<</span> <span style="color:#66d9ef">uint16_t</span>(mesg.message_length) <span style="color:#f92672"><<</span> std<span style="color:#f92672">::</span>endl;
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>This results in a better output:</p>
<pre tabindex="0"><code>[jim@palatinate:~]$ c++ -std=c++11 test.cpp
[jim@palatinate:~]$ ./a.out
Type: 100
Length: 0
[jim@palatinate:~]$
</code></pre><p>So, how do we make this a little more ergonomic? We introduce a <code>byte</code>
type, that is similar to <code>char</code>, but overloads differently. Like any
other integer type, it defaults to <code>signed</code>, and then we add overloads
to <code>operator<<</code> and others to treat it like an integer, not like a
character. Switching between <code>byte</code> and <code>char</code> would be an implicit
conversion, but for overloading purposes, they would be different types.</p>
<p><code>uint8_t</code> and <code>int8_t</code> could then be defined in terms of <code>byte</code>.</p>
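<p>The overloading behavior described above can be approximated in library code today. What follows is only a rough sketch under stated assumptions: <code>Byte</code> and <code>to_text</code> are hypothetical names invented for illustration, and a wrapper struct lacks the implicit conversions and arithmetic a built-in type would have. It does show the key idea, though: a distinct type lets <code>operator<<</code> pick a numeric overload instead of the <code>char</code> one.</p>

```cpp
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the proposed `byte`: same storage as uint8_t,
// but a distinct type as far as overload resolution is concerned.
struct Byte {
    std::uint8_t value;
};

// Stream the wrapper as a number rather than as a character.
inline std::ostream &operator<<(std::ostream &os, Byte b) {
    return os << static_cast<unsigned int>(b.value);
}

// Demonstration helper: format any streamable value into a string.
template <typename T>
std::string to_text(const T &x) {
    std::ostringstream out;
    out << x;
    return out.str();
}
```

<p>Here <code>to_text(Byte{100})</code> goes through the numeric overload, whereas <code>to_text(std::uint8_t{100})</code> still goes through the character overload on platforms where <code>uint8_t</code> is a <code>typedef</code> for <code>unsigned char</code>.</p>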
<p>I do not know what backwards-compatibility implications it has, but
I do think the decision to make <code>char</code> mean byte as its primary
meaning instead of “character” was a particularly poor one, and anything
we can do to migrate away from it would be good.</p>
<blockquote>
<p><strong>Update:</strong> Someone drew my attention to <code>std::byte</code>. This one I was aware of,
but had not thought about here as I didn’t think it really solves the problem.
As it is, it is not an arithmetic type, and therefore cannot be used as
the underlying type of <code>uint8_t</code>, leaving the confusing behavior in place.</p>
</blockquote>
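<p>To make that concrete, here is a small sketch (C++17 or later) of what “not an arithmetic type” means in practice. The helper name <code>as_number</code> is made up for this example; <code>std::is_arithmetic</code>, <code>std::is_enum</code> and <code>std::to_integer</code> are the standard facilities.</p>

```cpp
#include <cstddef>
#include <type_traits>

// std::byte is a scoped enumeration, so the arithmetic trait is false.
static_assert(!std::is_arithmetic<std::byte>::value,
              "std::byte is not an arithmetic type");
static_assert(std::is_enum<std::byte>::value,
              "std::byte is an enumeration type");

// Getting a number back out requires an explicit conversion.
inline int as_number(std::byte b) {
    return std::to_integer<int>(b);
}
```

<p>Because every numeric use needs a conversion like this, <code>std::byte</code> suits opaque storage well but cannot simply take over the role that <code>unsigned char</code> plays underneath <code>uint8_t</code>.</p>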
<h1 id="real-if-else-expression-syntax">Real if-else Expression Syntax</h1>
<p>Oftentimes, in C++, I find myself writing code like this:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span><span style="color:#66d9ef">int32_t</span> error_code;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">if</span> (setting <span style="color:#f92672">==</span> Setting<span style="color:#f92672">::</span>Socket) {
</span></span><span style="display:flex;"><span> error_code <span style="color:#f92672">=</span> initialize_socket();
</span></span><span style="display:flex;"><span>} <span style="color:#66d9ef">else</span> { <span style="color:#75715e">// setting == Setting::Pipe
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> error_code <span style="color:#f92672">=</span> initialize_pipe();
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">if</span> (error_code <span style="color:#f92672"><</span> <span style="color:#ae81ff">0</span>) {
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// ...
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>}
</span></span></code></pre></div><p>This <code>error_code</code> variable is just one example. I often want to
have a variable get different values depending on which side of the
<code>if</code>-<code>else</code> statement it’s on, without having to declare the variable
without an initializer right ahead of it, and write two assignment
statements. Basically, I want <code>if</code>-<code>else</code> to be an expression.</p>
<p>Now, of course, C++ already has the ternary operator: <code>?:</code>. But it’s so
ugly and unreadable that no one uses it, for good reason. It’s hard
to remember what the precedence is, meaning if we want to be rigorous
and friendly to our readers we need to bracket with <code>(</code> and <code>)</code> even if
strictly unnecessary, and the result looks like garbage and is hard
to format in a way that’s remotely readable:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span><span style="color:#66d9ef">int32_t</span> error_code <span style="color:#f92672">=</span> (setting <span style="color:#f92672">==</span> Setting<span style="color:#f92672">::</span>Socket
</span></span><span style="display:flex;"><span> <span style="color:#f92672">?</span> initialize_socket()
</span></span><span style="display:flex;"><span> <span style="color:#f92672">:</span> initialize_pipe()
</span></span><span style="display:flex;"><span>);
</span></span></code></pre></div><p>What do I want instead? I want <code>if</code>-<code>else</code> to have this role, to
be an expression, where it evaluates to the value of the end of
each block (with no semicolon, to make clear that it’s an expression
not a full statement):</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span><span style="color:#66d9ef">int32_t</span> error_code <span style="color:#f92672">=</span> <span style="color:#66d9ef">if</span> (setting <span style="color:#f92672">==</span> Setting<span style="color:#f92672">::</span>Socket) {
</span></span><span style="display:flex;"><span> initialize_socket()
</span></span><span style="display:flex;"><span>} <span style="color:#66d9ef">else</span> {
</span></span><span style="display:flex;"><span> initialize_pipe()
</span></span><span style="display:flex;"><span>};
</span></span></code></pre></div><p>This is way better than <code>?:</code>. The blocks can be multiple statements
long if necessary. You can add <code>if</code>-<code>else if</code>-<code>else</code> chaining. And,
most importantly, it can be formatted like any other <code>if</code>-<code>else</code>.</p>
<blockquote>
<p><strong>Update:</strong> Someone drew my attention to a lambda-invocation pattern that is,
in my mind, just as ugly as <code>?:</code>, and also leaves you without the ability
to return from the enclosing function within the block. This strikes me
as extremely hackish and not really an improvement, but I suppose that’s
where C++ is going. I am at a loss for why they didn’t just implement
GCC’s statement expressions (expression blocks), followed by <code>if</code> as an expression. It’s clearly
much better in my mind.</p>
<p>I’ve seen the technique from time to time but I guess I figured it
was too hackish to mention. I didn’t realize it was getting officially
recommended in C++ Core Guidelines. I feel like when they were recommending
it, they should’ve simultaneously been trying to get more usable and obvious
features included in the programming language itself. Maybe they are,
and if so I wish them luck in that! Maybe C++30 will be a safe and usable
programming language, equivalent to Rust now.</p>
</blockquote>
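<p>For readers who have not seen it, the lambda-invocation pattern looks roughly like the sketch below. The <code>Setting</code> enum and the two <code>initialize_*</code> functions are placeholder stand-ins with made-up return values, not real APIs; only the shape of <code>initialize</code> matters.</p>

```cpp
#include <cstdint>

enum class Setting { Socket, Pipe };

// Placeholder stand-ins for the real initialization routines.
inline std::int32_t initialize_socket() { return 0; }
inline std::int32_t initialize_pipe() { return -1; }

inline std::int32_t initialize(Setting setting) {
    // The lambda is defined and called on the spot, so `error_code`
    // can be const and is initialized exactly once.
    const std::int32_t error_code = [&]() -> std::int32_t {
        if (setting == Setting::Socket) {
            return initialize_socket();
        } else {
            return initialize_pipe();
        }
    }();  // <- immediate invocation

    return error_code;
}
```

<p>Note that each <code>return</code> inside the lambda only leaves the lambda, which is exactly the limitation mentioned above: there is no way to return from <code>initialize</code> itself from within one of the branches.</p>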
<h1 id="variable-shadowing">Variable Shadowing</h1>
<p>On a related note, I want to have multiple variables with the same name
shadow, rather than resulting in an error message. I want the new variable
with the same name to simply hide the old variable, rather than giving me
a “conflicting declaration” error (or similar).</p>
<p>Why? Well, a lot of production code involves taking the same conceptual
thing, and migrating it through many types. Without shadowing, we have
to use awkward Hungarian notation.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span><span style="color:#66d9ef">void</span> <span style="color:#a6e22e">handle_data</span>(<span style="color:#66d9ef">const</span> <span style="color:#66d9ef">void</span> <span style="color:#f92672">*</span>data_v, size_t size) {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">const</span> <span style="color:#66d9ef">uint8_t</span> <span style="color:#f92672">*</span>data_ch <span style="color:#f92672">=</span> (<span style="color:#66d9ef">const</span> <span style="color:#66d9ef">uint8_t</span> <span style="color:#f92672">*</span>)data_v;
</span></span><span style="display:flex;"><span> std<span style="color:#f92672">::</span>vector<span style="color:#f92672"><</span><span style="color:#66d9ef">uint8_t</span><span style="color:#f92672">></span> data{data_ch, data_ch <span style="color:#f92672">+</span> size};
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// Actually do something with `data`
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>}
</span></span></code></pre></div><p>The new way would look like this:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span><span style="color:#66d9ef">void</span> <span style="color:#a6e22e">handle_data</span>(<span style="color:#66d9ef">const</span> <span style="color:#66d9ef">void</span> <span style="color:#f92672">*</span>data, size_t size) {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">const</span> <span style="color:#66d9ef">uint8_t</span> <span style="color:#f92672">*</span>data <span style="color:#f92672">=</span> (<span style="color:#66d9ef">const</span> <span style="color:#66d9ef">uint8_t</span> <span style="color:#f92672">*</span>)data;
</span></span><span style="display:flex;"><span> std<span style="color:#f92672">::</span>vector<span style="color:#f92672"><</span><span style="color:#66d9ef">uint8_t</span><span style="color:#f92672">></span> data{data, data <span style="color:#f92672">+</span> size};
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>This also cuts down on how many variables are in scope at once.</p>
<p>This bugs people who are new to Rust sometimes, but it’s fairly easy
to learn, and C++ has asked people to learn much, much harder things.
Once learned, it is really useful, as the alternative is to use
Hungarian notation or equivalents. It also helps you use the right
value, as you won’t accidentally go back and use an old one, as it’s
shadowed.</p>
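<p>In current C++, the closest approximation is probably to confine each intermediate form to its own scope, for example with a small helper function. The sketch below is one such workaround, not a recommendation from any standard or guideline; <code>to_bytes</code> is a hypothetical helper, and <code>handle_data</code> just returns the length so that it has something observable to do.</p>

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: the raw-pointer view of the data exists only here.
inline std::vector<std::uint8_t> to_bytes(const void *data, std::size_t size) {
    const auto *bytes = static_cast<const std::uint8_t *>(data);
    return std::vector<std::uint8_t>(bytes, bytes + size);
}

inline std::size_t handle_data(const void *raw, std::size_t size) {
    // Only the final, useful form of the data gets a long-lived name.
    const std::vector<std::uint8_t> data = to_bytes(raw, size);

    // Actually do something with `data`; here we only report its length.
    return data.size();
}
```

<p>C++ does already allow shadowing in a nested block, but it is not a drop-in substitute: in a declaration like <code>T data = f(data);</code>, the <code>data</code> in the initializer refers to the new, not yet initialized variable rather than to the outer one.</p>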
<h1 id="first-class-support-for-sum-types">First-Class Support for Sum Types</h1>
<p><code>std::variant</code> is awful. I know, because few people except die-hards
use it, and people use the Rust equivalent, <code>enum</code>s, all the time. The
weirdest thing about <code>std::variant</code> is that it supposes that all
of the variants hold exactly one value, and one variant per type
is sufficient. In reality, multiple variants might hold values of
the same type, and many variants don’t need a value – both of which
are possible but clumsy to express using <code>std::variant</code>’s semantics.</p>
<p>But C++11 already introduced <code>enum class</code> for more powerful <code>enum</code>s! Let’s
go all the way and add Rust-style values associated with it, for a
compiler-implemented tagged union. The implementation of <code>std::optional</code>’s
fields would be so much simpler.</p>
<pre tabindex="0"><code>template <typename T>
enum class option {
    None,
    Some {
        T value;
    },
    // OK, define some methods
};
</code></pre><p>This interacts with object lifetimes and constructors in a complicated
way, but if there were interest, I know it could be figured out.
If you don’t think this feature is necessary, I suspect you’ve spent
too long programming without it. Once you get used to this, it’s really
hard to go without.</p>
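<p>For comparison, here is roughly what the same shape takes with <code>std::variant</code> today (C++17). This is a sketch for illustration only: <code>None</code>, <code>Some</code>, <code>Option</code> and <code>value_or</code> are names invented here. Since <code>std::variant</code> distinguishes alternatives by type, every variant needs its own wrapper struct, even the one that carries no data.</p>

```cpp
#include <variant>

// One tag type per variant, because std::variant discriminates by type.
struct None {};

template <typename T>
struct Some {
    T value;
};

template <typename T>
using Option = std::variant<None, Some<T>>;

// Return the contained value, or `fallback` if the option is None.
template <typename T>
T value_or(const Option<T> &opt, T fallback) {
    if (const auto *some = std::get_if<Some<T>>(&opt)) {
        return some->value;
    }
    return fallback;
}
```

<p>A value is then built with <code>Option<int> x = Some<int>{7};</code> and inspected with <code>std::get_if</code>, <code>std::holds_alternative</code> or <code>std::visit</code>, none of which the compiler checks for exhaustiveness the way a built-in <code>match</code> would.</p>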
<h1 id="conclusion">Conclusion</h1>
<p>I am not going to do anything to try to make these things happen.
I’m sure I’m not the most popular in the C++ community after my
long write-ups of how Rust is so much better, and it’s not where
my primary interests lie anymore. But, if someone were to make these
features happen, it would make my life much easier, when for good reasons,
projects I’m working on require me to code in C++.</p>
In Defense of 'C/C++' https://www.thecodedmessage.com/posts/c-c++/ 2023-08-28T00:00:00+00:00
<p>One of the minor points I discussed in my
<a href="https://www.thecodedmessage.com/posts/stroustrup-response/">response</a> to Dr. Bjarne Stroustrup’s <a href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2739r0.pdf">memory safety
comments</a>
was the controversial, apparently deeply upsetting term C/C++. It is
controversial and interesting enough that I decided to say a little more
about it here.</p>
<p>A little background: Many people, especially outside the C and C++
communities (which, to be clear, don’t always like each other that
much) use the term C/C++ to talk about the two programming languages
together, as an informal short-hand for “C and C++” or “C or C++.”
Within the <del>C/C++</del> C and C++ communities, it is widely hated.</p>
<p>And now for me to say the thing guaranteed to anger the most possible
people: I see both sides of this debate.</p>
<p>On the one hand, the term “C/C++” is especially jarring because C and C++
fans regularly engage in actual controversy (famously including Linus
Torvalds, of the C-based Linux kernel, <a href="http://harmful.cat-v.org/software/c++/linus">insulting
C++ and its programmers</a>). It
is frustrating to be a C++ programmer, to have strong opinions on what
it means to be a C++ programmer, to think that C programmers are making
a misguided decision, that using C over C++ is technologically backwards
and regressive, and hear people cavalierly implying that the programming
languages are the same. And likewise, of course, for the C programmer
who feels similarly about C++.</p>
<p>On top of that, both C and C++ programmers are regularly exposed to
people who mix up the programming languages in ways that are harmful. They see
bosses and hiring managers who expect you to transition back and forth
between them without any friction, and to enjoy them equally. They see
resources that promise to teach you “C/C++ skills,” and know that they
won’t teach how to use either the way that that language’s particular
community actually prefers. They see people using “C/C++” all the time
to talk about the languages in a way that only would make sense if they
were much more similar than they, in fact, are – or at the very least,
than a die-hard partisan of C or C++ would think they are.</p>
<p>And I do think this is understandable. After all, people don’t tend to
lump together other languages like this. Java and C# are probably equally
related (if not more related), and no one writes that they’re hiring a
“Java/C# programmer.” Why should C and C++ get treated this way?</p>
<p>But, on the other hand, C and C++ are actually extremely closely related
programming languages. I was writing something recently comparing
Rust features to C and C++ features, specifically Rust <code>enum</code>s to the
tagged union idiom which is used in … C and C++, in very similar ways.
I know all the reasons why as a C programmer and as a C++ programmer,
I’m not supposed to write C/C++, and still, I was tired of writing
“C and C++” over and over again to describe this particular thing that
those languages have in common.</p>
<p>It turns out the real problem isn’t the act of writing “C/C++” – it
turns out that just banning a problematic word doesn’t fix the real
problem here at all – if it has ever fixed any problems.
Some people do need to be told that C and C++ are different
programming languages, with different communities and different skillsets, even
though they are still related skillsets and related programming languages.
But some people who don’t need to be told that still find themselves
needing a shorthand sometimes, and don’t feel the cultural need to
be over-accommodating in avoiding it.</p>
<p>Because when two things are similar – and stop me if this is confusing! –
there are some ways in which they’re the same, and some ways in which
they’re different. Sometimes, it makes sense to lump them together, and
sometimes, it doesn’t. But yelling that people shouldn’t write “C/C++”
won’t magically help anyone understand this – especially since those
people are almost certainly not listening, and you’re preaching to
the choir.</p>
<p>In the case of Dr. Stroustrup, he was using the “faux pas” of the
NSA using “C/C++” to avoid having to actually address what they said
and defend C++. He brought this up in his criticism of the <a href="https://media.defense.gov/2022/Nov/10/2003112742/-1/-1/0/CSI_SOFTWARE_MEMORY_SAFETY.PDF">NSA white
paper</a>:</p>
<blockquote>
<p>As is far too common, it lumps C and C++ into the single category C/C++,
ignoring 30+ years of progress.</p>
</blockquote>
<p>I said, among other things, that Dr. Stroustrup was being unnecessarily
exclusionary based on buzz-words:</p>
<blockquote>
<p>He’s reading too much into the orthography and the NSA’s failure to
use insider <em>shibboleths</em> of the programming languages they’re trying
to criticize. Outside of the “C” and “C++” communities, “C/C++” is a
fairly common way to refer to the two related programming languages.</p>
</blockquote>
<p>But also, he was calling them out when they were right. In the specific
category that the NSA was talking about, there actually is no difference,
as I also mention in my post:</p>
<blockquote>
<p>While there might be 30+ years of divergence between C and C++, none
of C++’s so-called “progress” involved removing memory-unsafe C features
from C++, many of which are still in common use, and many of which still
make memory safety in C++ near intractable.</p>
</blockquote>
<p>Perhaps we all should spend more time thinking critically than nit-picking
word choice. And perhaps I should find something better to do than
writing blog posts joining the fray, so that’s all I’ll say on the issue
for now.</p>
C++ Papercuts
https://www.thecodedmessage.com/posts/c++-papercuts/
2023-08-26T00:00:00+00:00
<p><em>UPDATE: Wow, this post has gotten popular! I’ve written
a <a href="https://www.thecodedmessage.com/posts/c++-additions/">new post</a> that adds new papercuts combined
with concrete suggestions for how C++ could improve, if you are
interested. Also, if you want to read more about C++’s
deeper-than-papercut issues, I recommend specifically
my post on its <a href="https://www.thecodedmessage.com/posts/cpp-move/">move semantics</a>. Thank you for reading!</em></p>
<p>My current day job is now again a C++ role. And so, I find myself again
focusing in this blog post on the downsides of C++.</p>
<p>Overall, I have found returning to active C++ dev to be exactly what
I expected: I still have the skills, and can still be effective in
it, but now that I have worked in a more modern programming language
with less legacy cruft, the downsides of C++ sting more. There are so
many features I miss from Rust, not only the obvious safety features,
or even primarily those, but also features that C++ could easily add,
like first-class support for sum types (called <code>enum</code>s in Rust), or
tuples. (<strong>Clarification for C++ Fans:</strong> <code>std::tuple</code> and <code>std::variant</code>
are <em>not</em> first class support, and if you’re used to first class
support, you know how unacceptably clunky they are.)</p>
<p>In this blog post, I will focus on the minor problems of C++ that
have affected me the most, the little usability <em>papercuts</em>, the petty
inconveniences that just waste time. Instead of focusing on comparing
them to Rust or other programming languages, I will focus on why they
don’t make sense from a C++ point of view, with reference to just C++.
I know better than to hope that by doing this, die-hard C++ fans will
accept my criticism, but perhaps it will be relatable to C++ programmers
who don’t have Rust experience.</p>
<p>Before I start getting into the papercuts, though, I want to address
one of the primary defenses I’ve seen of C++, one that I’ve found
particularly baffling. It goes something like this:</p>
<blockquote>
<p>C++ is a great programming language. The complaints are just from
people who aren’t up to it. If they were better programmers,
they’d appreciate the C++ way of doing things, and they wouldn’t
need their hand held. Languages like Rust are not helpful for
such true professionals.</p>
</blockquote>
<p>Obviously, the phrasing is a bit of a parody, but I’ve seen this
sort of attitude so many times. The most charitable view I can take
of it is a claim that C++’s difficulty is a sign of its power, and
the natural cost of using a powerful programming language. What it
reads like to me in many cases, however, is a form of elitism:
a general idea that making things easy for poorer programmers is
pointless, and that good programmers don’t benefit from making
things easier.</p>
<p>As someone who has programmed C++ professionally for a majority of my
career, and who has taught (company-internal) classes in advanced C++,
this is nonsense to me. I do know how to navigate the many papercuts
and foot-guns of C++, and am happy to do so when working on a C++
codebase. But experienced as I am, they still slow me down and
distract me, taking focus away from the actual problems I’m trying
to solve, and resulting in less maintainable code.</p>
<p>And as for the upside, I see very little that C++ gets in exchange for
all of this difficulty. The only ways in which C++ is more performant
or more appropriate than Rust are in terms of platform support, legacy
codebases, optimizations that are only available in specific compilers
that happen to not support Rust, or other concerns irrelevant to the
actual design of the programming language.</p>
<p>While I am proud of my C++ skills, I am not too proud to appreciate
that better technology can render them partially obsolete. I am not
too proud to appreciate having features that make it easier. In most
cases, it’s not a matter of the programming language doing more work
for me, but of C++ creating unnecessary extra make-work, often
due to decisions that made sense when they were made, but have long
since stopped making sense – don’t get me started on header files!</p>
<p>But I also want my programming language to be beginner-friendly. I am
always going to work with other programmers with a variety of skill-sets,
and I would rather not have to clean up my colleagues’ mistakes –
or mistakes of earlier, more foolish versions of myself. If making a
programming language more beginner-friendly sacrifices power, then I
agree that some programming languages should not do it. But many, even
most of C++’s beginner-unfriendly (and expert-annoying) features do not
in fact make the language more powerful.</p>
<p>So, without further ado, here are the biggest papercuts I’ve noticed
in the past month of returning to C++ development.</p>
<h1 id="const-is-not-the-default"><code>const</code> is not the default</h1>
<p>It is very easy to forget to mark a parameter <code>const</code> when it can be.
You can just forget to type the keyword. This is especially true for
<code>this</code>, which is an implicit parameter: there is no time when you are
typing out the <code>this</code> parameter explicitly, and therefore it won’t sit
there looking funny without the appropriate modifiers.</p>
<p>If C++ had the opposite default, where every value, reference, and pointer
was <code>const</code> unless explicitly declared mutable, then we’d be much more
likely to have every parameter declared correctly based on whether the
function needs to mutate it or not. If someone includes a <code>mutable</code> keyword,
it would be because they know they need it. If they need it and forget it,
the compiler error would remind them.</p>
<p>Now, you might not think this is important, because you can just not
use <code>const</code> and have functions with capabilities they don’t need –
but sometimes you have to take things by <code>const</code> in C++. If you take a
parameter by non-<code>const</code> reference, the caller can only use lvalues to
call your function. But if you take a parameter by <code>const</code> reference,
the caller can use lvalues or rvalues. So some functions, in order to
be used in natural ways, must take their parameters by <code>const</code> reference.</p>
<p>Once you have a <code>const</code> reference, you can only (easily) call functions
with it that accept <code>const</code> references, and so if any of those functions
forgot to declare the parameter <code>const</code>, you have to include a <code>const_cast</code>
– or go change the function later to correctly accept <code>const</code>.</p>
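<p>To make the chain reaction concrete, here is a minimal sketch. The function
names (<code>length_forgot_const</code>, <code>length</code>, <code>describe</code>) are
invented for illustration. One helper that omits <code>const</code> forces a cast on
every caller that holds a <code>const</code> reference:</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper that forgot `const`: it only reads its argument,
// but its signature reserves the right to modify it.
std::size_t length_forgot_const(std::string &s) { return s.size(); }

// The same helper with the parameter declared correctly.
std::size_t length(const std::string &s) { return s.size(); }

// Taking `text` by const reference lets callers pass lvalues or rvalues.
std::size_t describe(const std::string &text) {
    // length_forgot_const(text);  // would not compile: `text` is const
    // The escape hatch: acceptable here only because the callee never writes.
    std::size_t a = length_forgot_const(const_cast<std::string &>(text));
    std::size_t b = length(text);  // no cast needed
    return a + b;
}

// Usage: both an lvalue and a temporary bind to the const reference.
void demo_const_reference() {
    std::string name = "papercut";
    assert(describe(name) == 16);               // lvalue
    assert(describe(std::string("cut")) == 6);  // rvalue (temporary)
}
```

<p>If <code>describe</code> took a plain <code>std::string &</code> instead, the second
call in the demo would be rejected, because a temporary cannot bind to a
non-<code>const</code> lvalue reference.</p>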
<p>Lest you think this is just a sloppy newbie error, note that
many functions in the standard library had to be updated to take
<code>const_iterator</code> instead of or in addition to <code>iterator</code> when it was
realized that they made sense with a <code>const_iterator</code>:
functions like <code>erase</code>. It turns out that for functions like <code>erase</code>,
the collection is what has to be mutable, not the iterator – a fact
that the maintainers of the C++ library simply got wrong at first.</p>
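<p>As a small illustration of the corrected signature, the sketch below erases
through a <code>const_iterator</code>, which <code>std::vector::erase</code> has accepted
since C++11. The function name <code>without_first_even</code> is made up for the
example:</p>

```cpp
#include <cassert>
#include <vector>

// Remove the first even element, if any. `without_first_even` is a made-up name.
std::vector<int> without_first_even(std::vector<int> values) {
    // cbegin()/cend() return const_iterators: nothing is written through them.
    for (auto it = values.cbegin(); it != values.cend(); ++it) {
        if (*it % 2 == 0) {
            // Accepted since C++11: the mutation goes through `values`,
            // which is non-const, not through the iterator.
            values.erase(it);
            break;
        }
    }
    return values;
}

void demo_erase() {
    const std::vector<int> result = without_first_even({1, 2, 3, 4});
    assert((result == std::vector<int>{1, 3, 4}));
}
```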
<h1 id="obligatory-copying">Obligatory Copying</h1>
<p>In C++, for an object to be copyable is the default, privileged way
for an object to behave. If you don’t want your object to be copyable,
and all its fields are copyable, you often have to mark the copy constructor
and copy assignment operator as <code>= delete</code>. The default is for the compiler
to write code for you – code that can be incorrect.</p>
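<p>A minimal sketch of that opt-out, using a hypothetical <code>FileHandle</code>
wrapper (the class and its members are assumptions for illustration, and a real
version would also close the descriptor in a destructor):</p>

```cpp
#include <type_traits>

// Hypothetical handle type. Its only field is an int, so without any
// intervention the compiler would generate copies that duplicate the descriptor.
class FileHandle {
public:
    explicit FileHandle(int fd) : fd_(fd) {}

    // Opt out of the default. (Declaring the move operations below would
    // already suppress the implicit copies; spelling out `= delete`
    // documents the intent.)
    FileHandle(const FileHandle &) = delete;
    FileHandle &operator=(const FileHandle &) = delete;

    // Moves transfer the descriptor and leave the source empty (-1).
    FileHandle(FileHandle &&other) noexcept : fd_(other.fd_) { other.fd_ = -1; }
    FileHandle &operator=(FileHandle &&other) noexcept {
        if (this != &other) {
            fd_ = other.fd_;
            other.fd_ = -1;
        }
        return *this;
    }

    int fd() const { return fd_; }

private:
    int fd_;
};

static_assert(!std::is_copy_constructible<FileHandle>::value,
              "copying a FileHandle is a compile error");
static_assert(std::is_move_constructible<FileHandle>::value,
              "moving a FileHandle is still allowed");
```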
<p>If you do make your class move-only, however, beware, because
that means that there are situations where you can’t use
it. In C++11, there was no ergonomic way to do a lambda capture
by move – which is usually how I want to capture variables
into a closure. This was “fixed” in C++14 – for when you want what
should have been the default from the beginning, you can now use
extremely clunky move-capture syntax.</p>
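<p>For readers who have not run into it, this is roughly what the C++14
init-capture looks like. The function name <code>make_counter</code> is made up
for this sketch:</p>

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Returns a closure that owns its counter state.
auto make_counter(std::unique_ptr<int> start) {
    // `[start]` would try to copy the unique_ptr and fail to compile.
    // C++14 init-capture: declare a new closure member and move into it.
    return [value = std::move(start)] { return (*value)++; };
}

void demo_move_capture() {
    auto next = make_counter(std::make_unique<int>(5));
    assert(next() == 5);
    assert(next() == 6);
}
```

<p>Note that the capture has to name the variable twice, once as the new member
and once inside <code>std::move</code>, to get the behavior that a plain
<code>[start]</code> gives you for copies.</p>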
<p>However, even then, good luck using the lambda. If you
want to put it in a <code>std::function</code>, you’re still out
of luck to this day. <code>std::function</code> expects the object
it manages to be copyable, and will fail to compile if your
closure object is move-only. This is going to be addressed in C++23, with
<a href="https://en.cppreference.com/w/cpp/utility/functional/move_only_function/move_only_function"><code>std::move_only_function</code></a>
– but in the meantime, I have been forced to write classes with a copy
constructor that throws some sort of run-time logic exception. And even
in C++23, copyable functions will be the default, assumed situation.</p>
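<p>The sketch below shows the restriction and one common way around it. This
workaround, parking the closure behind a <code>std::shared_ptr</code>, is a different
technique from the throwing copy constructor described above, and the name
<code>wrap_move_only</code> is invented for the example:</p>

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <type_traits>
#include <utility>

std::function<int()> wrap_move_only(std::unique_ptr<int> state) {
    auto closure = [s = std::move(state)] { return *s; };
    static_assert(!std::is_copy_constructible<decltype(closure)>::value,
                  "a closure that captures a unique_ptr by move is move-only");
    // std::function<int()> f = std::move(closure);  // rejected: needs a copyable target

    // Workaround: put the closure behind shared ownership. The outer lambda
    // only copies the shared_ptr, so std::function accepts it.
    auto shared = std::make_shared<decltype(closure)>(std::move(closure));
    return [shared] { return (*shared)(); };
}

void demo_function_wrapper() {
    std::function<int()> f = wrap_move_only(std::make_unique<int>(7));
    std::function<int()> g = f;  // copies share the same underlying closure
    assert(f() == 7);
    assert(g() == 7);
}
```

<p>The cost is an extra allocation and shared state between copies, which is
exactly the kind of overhead you take on only to satisfy the copyable default.</p>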
<p>This is strange, because most complicated objects, especially closures,
are never, and should never be, copied. Generally, copying a complicated
data structure is a mistake – a missing <code>&</code>, or a missing <code>std::move</code>.
But it is a mistake that carries no warning with it, and no visible
sign in the code that a complex, allocation-heavy action is being
undertaken. This is an early lesson to new C++ devs – don’t pass
non-primitive types by value – but it’s possible for even advanced devs
to mess up from time to time, and once it’s in the codebase, it’s easy
to miss.</p>
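<p>Here is a small sketch of how quiet that mistake is. The
<code>Tracked</code> type and the global counter are assumptions made up for the
example. The only difference between the expensive function and the cheap one
is a single <code>&</code>:</p>

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

int tracked_copies = 0;  // counts copy-constructor calls

struct Tracked {
    std::vector<int> data;
    explicit Tracked(std::size_t n) : data(n) {}
    Tracked(const Tracked &other) : data(other.data) { ++tracked_copies; }
    Tracked(Tracked &&) noexcept = default;
};

std::size_t size_by_value(Tracked t) { return t.data.size(); }       // missing `&`
std::size_t size_by_ref(const Tracked &t) { return t.data.size(); }

// Returns how many copies the three calls below made.
int copies_in_demo() {
    tracked_copies = 0;
    Tracked big(1000);
    size_by_value(big);             // copies all 1000 ints, with no warning
    size_by_ref(big);               // no copy
    size_by_value(std::move(big));  // no copy: the argument is moved
    return tracked_copies;
}
```

<p>All three call sites look almost identical, and nothing at the first one
hints that it allocates.</p>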
<h1 id="by-reference-parameter-papercuts">By-Reference Parameter Papercuts</h1>
<p>It is unergonomic to return multiple values by tuple in C++. It can be
done, but the calls to <code>std::tie</code> and <code>std::make_tuple</code> are long-winded
and distracting, not to mention that you’ll be writing unidiomatically,
which is always bad for people who are reading and debugging your code.</p>
<blockquote>
<p><strong>Side note:</strong>
Someone brought up structured bindings in a comment, as if this fixed
the issue. Structured bindings are a great example of the half-way fixes
that proponents of modern C++ love to cite. Structured bindings help
some, but if you think they make returning by tuple ergonomic, you’re
mistaken. You still need to either write <code>std::pair</code> or
<code>std::make_tuple</code> in the function return statement, or <code>std::tuple</code>
in the function’s return type. This isn’t the worst, but it’s
still not as light-weight as full first-class tuple support, and
it’s not enough to have convinced people to not use out parameters,
which are my real complaint.</p>
<p>And even at that, it’s not that out parameters (or in-out parameters)
are bad, but that they’re bad in C++, as there is no good way to
express them.</p>
</blockquote>
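<p>For comparison, here is a sketch of both styles side by side, assuming
C++17. <code>parse_header</code> and the two callers are hypothetical names:</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>

// Hypothetical parser returning (success flag, bytes consumed).
std::tuple<bool, std::size_t> parse_header(const std::string &buf) {
    if (buf.size() < 4) {
        return std::make_tuple(false, std::size_t{0});
    }
    return std::make_tuple(true, std::size_t{4});
}

// Pre-C++17 style: declare the outputs first, then std::tie them.
std::size_t consumed_with_tie(const std::string &buf) {
    bool ok;
    std::size_t used;
    std::tie(ok, used) = parse_header(buf);
    return ok ? used : 0;
}

// C++17 structured bindings: shorter at the call site, but the
// std::tuple and std::make_tuple above are still there.
std::size_t consumed_with_bindings(const std::string &buf) {
    auto [ok, used] = parse_header(buf);
    return ok ? used : 0;
}

void demo_tuple_return() {
    assert(consumed_with_tie("abcdef") == 4);
    assert(consumed_with_bindings("abcdef") == 4);
    assert(consumed_with_bindings("ab") == 0);
}
```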
<p>So what do we do instead? The clunkiness of tuples leads people to
instead use out parameters. To use an out parameter, you end up taking
a parameter by non-<code>const</code> reference, meaning the function is supposed
to modify the parameter.</p>
<p>The problem is, this is only marked in the function signature. If you
have a function that takes a parameter by reference, the parameter looks
the same as a by-value parameter at the call site:</p>
<pre tabindex="0"><code>// Return false on failure. Modify size with actual message size,
// decreasing it if it contains more than one message.
bool got_message(const char *mesg, size_t &size);
size_t size = buff.size();
got_message(buff.data(), size);
buff.resize(size);
</code></pre><p>If you’re reading the calling code quickly, it might look like the
<code>resize</code> call is redundant, but it is not. <code>size</code> is being modified by
<code>got_message</code>, and the only way to know that it is being modified is to
look at the function signature, which is usually in another file.</p>
<p>Some people prefer out parameters and in-out parameters to be passed
by pointer for this very reason:</p>
<pre tabindex="0"><code>bool got_message(const char *mesg, size_t *size);
size_t size = buff.size();
got_message(buff.data(), &size);
buff.resize(size);
</code></pre><p>This is great – or would be, if pointers weren’t nullable. What does
a <code>nullptr</code> parameter mean in this context? Is it going to trigger
undefined behavior? What if you pass a pointer from a caller into it?
People often forget to document what functions do with a null pointer.</p>
<p>This can be addressed with a non-nullable smart pointer, but very
few programmers actually do this in practice. When something isn’t
the default, it tends to not be used everywhere where appropriate.
The sustainable answer to this is changing the default, not heroic
attempts to fight human nature.</p>
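<p>A rough sketch of what such a non-owning, non-nullable wrapper could look
like follows. <code>NonNull</code> is a name invented here (the Guidelines Support
Library offers <code>gsl::not_null</code> in the same spirit), and the body of
<code>got_message</code> is an assumption: it treats a newline as the end of a
message.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Minimal non-owning pointer wrapper that rejects null.
template <typename T>
class NonNull {
public:
    explicit NonNull(T *p) : ptr_(p) {
        if (p == nullptr) {
            throw std::invalid_argument("NonNull constructed from a null pointer");
        }
    }
    NonNull(std::nullptr_t) = delete;  // a literal nullptr is a compile error

    T &operator*() const { return *ptr_; }
    T *operator->() const { return ptr_; }

private:
    T *ptr_;
};

// Hypothetical body: a message ends at the first '\n'. On success, shrink
// *size to the length of that first message.
bool got_message(const char *mesg, NonNull<std::size_t> size) {
    for (std::size_t i = 0; i < *size; ++i) {
        if (mesg[i] == '\n') {
            *size = i + 1;
            return true;
        }
    }
    return false;
}

void demo_non_null() {
    const char buff[] = "hi\nthere\n";
    std::size_t size = sizeof(buff) - 1;
    // The `&size` still marks the out parameter at the call site.
    assert(got_message(buff, NonNull<std::size_t>(&size)));
    assert(size == 3);
}
```

<p>The call site is more verbose than either the reference or the raw pointer
version, which goes some way toward explaining why so few codebases bother.</p>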
<p>Obligatory side-gripe:
At least in non-owning situations like this, it is possible to write
such a smart pointer. However, if you want to write the obvious companion,
a non-nullable owning smart pointer, a companion version of <code>std::unique_ptr</code>,
then it cannot be done in a useful way, because such a pointer cannot
then <a href="https://www.thecodedmessage.com/posts/cpp-move/">be moveable</a>.</p>
<h1 id="method-implementations-can-contradict">Method Implementations Can Contradict</h1>
<p>In C++, every time you write a class, especially a lower-level one,
you have a responsibility to make decisions about certain methods with
special semantic importance in the programming language:</p>
<ul>
<li><strong>Constructor (Copy):</strong> <code>X(const X&)</code></li>
<li><strong>Constructor (Move):</strong> <code>X(X&&)</code></li>
<li><strong>Assignment (Copy):</strong> <code>operator=(const X&)</code></li>
<li><strong>Assignment (Move):</strong> <code>operator=(X&&)</code></li>
<li><strong>Destructor:</strong> <code>~X()</code></li>
</ul>
<p>For many classes, the default implementations are enough, and if
possible you should rely on them. Whether or not this is possible
depends on whether naively copying all of the fields is a sensible
way to copy the entire object, which is surprisingly easy to forget
to consider.</p>
<p>But if you need a custom implementation of one of these, you are on
the hook to write all of them. This is known as the “rule of 5.”
You have to write all of them, even though the correct behavior of the
two assignment operators can be completely determined by the
appropriate constructor combined with the destructor. The compiler could
make default implementations of the assignment operators that refer to
those other functions, and therefore would always be correct, but it
does not. Implementing them correctly is tricky, requiring techniques
like either explicitly protecting against self-assignment, or swapping
with a by-value parameter. In any case, they are boilerplate, and yet
another thing that can go wrong in a programming language that has many
such things.</p>
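<p>To show how much ceremony is involved, here is a minimal sketch of a class
with a custom copy constructor, using the by-value-parameter swap technique
mentioned above so that one assignment operator covers both copy and move.
The <code>Buffer</code> class is made up for illustration:</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

class Buffer {
public:
    explicit Buffer(std::size_t n) : size_(n), data_(new int[n]()) {}
    ~Buffer() { delete[] data_; }

    // Deep copy: the reason the defaults are not good enough here.
    Buffer(const Buffer &other) : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + other.size_, data_);
    }
    // Move: steal the allocation and leave the source empty.
    Buffer(Buffer &&other) noexcept : size_(other.size_), data_(other.data_) {
        other.size_ = 0;
        other.data_ = nullptr;
    }

    // By-value parameter: the argument is copy- or move-constructed into
    // `other`, then swapped in. The old contents die with `other`.
    // Self-assignment is safe because `other` is always a separate object.
    Buffer &operator=(Buffer other) noexcept {
        swap(other);
        return *this;
    }

    void swap(Buffer &other) noexcept {
        std::swap(size_, other.size_);
        std::swap(data_, other.data_);
    }

    std::size_t size() const { return size_; }
    int &operator[](std::size_t i) { return data_[i]; }

private:
    std::size_t size_;
    int *data_;
};

void demo_rule_of_five() {
    Buffer a(3);
    a[0] = 7;
    Buffer b(1);
    b = a;             // copy-constructs the parameter, then swaps
    assert(b.size() == 3 && b[0] == 7 && a.size() == 3);
    Buffer c(2);
    c = std::move(a);  // move-constructs the parameter, then swaps
    assert(c.size() == 3 && c[0] == 7);
}
```

<p>Everything in <code>operator=</code> is mechanical. It is written purely in
terms of the constructors, the destructor, and a swap, which is the sense in
which the compiler could have generated it.</p>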
<blockquote>
<p><strong>Side note:</strong>
One commentator did not understand what I meant. It is true
that many classes can use <code>= default</code> for all these methods.
However, IF you customize the copy constructor or move constructor,
you must THEN also customize the assignment operator to match,
even though the default implementation could have been correct,
if the language was defined more intelligently.</p>
<p>I thought this was clear by citing the rule of 5, which essentially says this.</p>
<p>The full rule is explained on <a href="https://en.cppreference.com/w/cpp/language/rule_of_three">CPP Reference</a>.
If you customize the copy or move constructor, the corresponding <code>= default</code>
assignment operator will be wrong. Be careful! Note how the
example code does not use <code>= default</code> for the assignment operators,
even though the assignment operators contain no logic.</p>
</blockquote>
<h1 id="modern-c">“Modern” C++</h1>
<p>After seeing comments on Hacker News, I felt compelled to add this
section. Every time someone complains about anything in C++, someone will
mention a newer version of C++ that fixes it. These “fixes” are usually
not that good, and only feel like fixes if you’re used to everything
being kind of clunky.</p>
<p>Here’s why:</p>
<ul>
<li>The default way still is the old, bad way. For example, capturing
lambdas by move should be the default, and <code>std::move_only_function</code>,
coming soon in C++23, should have been the default <code>std::function</code>.</li>
<li>For that reason, and because there are never warnings enabled on the old,
bad way, even new coders keep doing things the bad way.</li>
</ul>
<p>Of course, I understand that this is important for backwards-compatibility.
But that is the entire problem: C++ has too many bad decisions accumulated.
Why was copying the default for passing collections as parameters, let alone
for lambda capture? I know the historical reasons, but that doesn’t
mean that a modern programming language should work that way.</p>
<p>Even C++11 couldn’t clean up the fact that raw pointers and
C-style arrays get nice syntax, while smart pointers and <code>std::array</code>
look terrible. Even C++11 couldn’t clean up that it was working
around a language designed without moves.</p>
<h1 id="conclusion">Conclusion</h1>
<p>Unfortunately, I am all too well aware of why these decisions were
made, and it is exactly one reason: Compatibility with legacy code.
C++ has no editions system, no way to deprecate core language
features. If a new edition of C++ was made, it would <a href="https://github.com/carbon-language/carbon-lang">cease to be
C++</a> – though I
support the efforts of people to transition C++ to new syntax
and clean some of this stuff up.</p>
<p>However, if you ignore backwards-compatibility and the large existing
codebases, none of these papercuts make the programming language more
powerful or better, just harder to use. I’ve seen good-faith arguments
in favor of human-maintained header files, surprising as that is to me, but I
challenge my readers to tell me what is beneficial about C++’s design
choices in these matters.</p>
<p>You might find these things trivial, but these all slow programmers
down, while simultaneously annoying them. If you are experienced enough,
your subconscious might be adept at navigating it, but imagine what your
subconscious could do if it didn’t have to. But how adept are you at seeing
these mistakes in a code review from your junior colleagues? If you are a
rigorous reviewer, how much more time does it take? How adept are you at
finding these issues quickly when a bug arises?</p>
<p>We’d be more effective, more efficient, and happier if these issues were
resolved. Programming would be both more enjoyable and faster to do.
What’s the downside? The only upside of the status quo is continuity with history.
And while I can see the value in that, it is a very limited value,
with very limited scope.</p>
New Link: Technical Only RSS
https://www.thecodedmessage.com/posts/technical-only-rss/
2023-08-06T00:00:00+00:00
<p>TLDR: I am adding a new link for RSS subscribers who just want
to subscribe to technical posts. The RSS feed has always been available,
but it is now explicitly one of the links across the top, for those
who want their RSS feed to only give them my new technical posts.</p>
<p>I am writing this post primarily to let people know about this new
link, but I also want to muse on it a little.</p>
<p>I realize that I have, in some ways, two blogs here in one website.</p>
<p><a href="https://www.thecodedmessage.com/">The Coded Message</a> is primarily read for its <a href="https://www.thecodedmessage.com/tags/computers/">technical
content</a>, especially for the <a href="https://www.thecodedmessage.com/tags/rust/">posts
about Rust</a>. But I also write about <a href="https://www.thecodedmessage.com/tags/nontechnical/">other
topics</a> that interest me, and those posts are
generally much less popular.</p>
<p>I combine them on the same website for a few reasons.</p>
<p>For one, it’s easier for me to have one blog. Blogging is a hobby for
me, and so it has to play second fiddle to other life obligations,
which is most of why I’ve been slow to finish some blog series and some
promised future posts – I have not forgotten. This also means that
anything that would make blogging harder for me, including separating
out these blogs into two fully separate websites, is likely to make me
blog substantially less. Laziness might not always be a virtue,
whatever Larry Wall <a href="https://thethreevirtues.com/">might say</a>, but
some amount of it is essential to actually accomplishing goals, especially
in the hobby space.</p>
<p>But there is also a reason besides laziness, that is a little harder to
articulate. As much as this blog largely concerns my professional work,
it is my personal blog. All of the programming posts are laden with my
personal opinions about programming, and this website is about everything
I personally have to say publicly on any topic, not just programming.
A separation between my professional and personal blogs would lead,
in my own mind, to a sense of obligation to make the professional blog
a polished resource for programmers, with more organization and possibly
even a regular schedule, as opposed to merely being a forum where I hold
forth on whatever topics interest me, which often but not always happens
to be programming.</p>
<p>That said, I do make all my posts in the hopes that people read them,
and find them useful in some way (even if that use is, as for my
<a href="https://www.thecodedmessage.com/tags/fiction/">fiction</a> posts, primarily entertainment). And I am aware
that a large portion of my readership primarily, or even exclusively,
finds my technical posts useful. As much as I may wish that all of my
readers who are here for Rust content also care about my <a href="https://www.thecodedmessage.com/tags/nontechnical/">musings on
other topics</a>, I know that many of them do not,
or even seriously disagree with me on these topics.</p>
<p>I try already to accommodate this. If you sign up for
my newsletter, by default, you are only subscribed to
technical posts, and you have to follow an additional link
and explicitly subscribe if you want other topics. If you go to
<a href="https://www.thecodedmessage.com/">www.thecodedmessage.com</a> in your web
browser, you can click the link at the top labelled <a href="https://www.thecodedmessage.com/tags/computers/">Computers/Programming
Posts</a>. And now, if you want to subscribe to just the
technical posts via RSS, there is also a link at the top for that purpose.</p>
<p>I still encourage people who are interested in my other posts to read
them, and I still plan on having this website combined for at least
the medium-term future, but I wanted people to know that a technical-posts-only
RSS feed is available, if they so choose.</p>
<p>As always, I welcome feedback on my blog in the form of comments and
e-mails (jah259 at cornell dot edu). Thank you so much for reading!</p>
The Curse of Coffee
https://www.thecodedmessage.com/posts/coffeeless/
2023-07-08T00:00:00+00:00
<blockquote>
<p><strong>TRIBUNAL PROCEEDING TRANSCRIPT</strong><br>
SUB LEGIBUS ORDINIS SACROSANCTI IMMORTALIUM<br>
PROVISIONAL PROOF TEXT</p>
<p>IN THE CASE OF:<br>
<em><strong>ŌRDŌ SACROSANCTUS</strong> VERSUS THE NAMELESS DAUGHTER OF MUŠMAḪḪU
THE SEVEN-HEADED SERPENT,
SHE WHO IS KNOWN TO THE MORTALS AS <strong>EUNICE</strong></em></p>
<p>LORD JUSTICE <strong>MEPHISTO</strong>, PRESIDING<br>
LORD JUSTICE <strong>DRACHENMILCH</strong>, LORD JUSTICE <strong>BA’AL-HA-KHUMUS</strong>, AND
<del>LORD</del> LADY JUSTICE <strong>XYXXYZ</strong></p>
<p>MR. <strong>AZAXAZALIA</strong>, ESQ., PROSECUTOR<br>
MS. “<strong>EUNICE</strong>”, DEFENDANT</p>
<p>A RECORD OF <strong>EUNICE</strong>’S TESTIMONY<br>
TRANSCRIBED BY <strong>GEORGE SMITH</strong>, HUMAN, JUNIOR APPRENTICE CLERK<br>
COURTROOM 31B, NO OTHERS IN ATTENDANCE</p>
</blockquote>
<p><strong>EUNICE, DEFENDANT:</strong> My lady. My lords.</p>
<p>I have been called upon by this most ancient, most esteemed, most noble
tribunal to give a reckoning of my behavior. You have already heard
the prosecutor’s speech, and now it is time for me to defend myself. And
defend myself I shall, with pleasure. To be frank, the story – the entire
story, without the prosecutor’s dishonest gaps and distortions – speaks
for itself. So, rather than try to wrangle a creative interpretation of
some of the more arcane and ancient laws of our <em>Ōrdō</em>, as many before
you have done, including this sly prosecutor, I will simply tell the
whole truth of what happened, and you shall see that, far from being
criminal, it upholds – nay, epitomizes – all of our finest traditions.</p>
<p>Let me set the scene for you.</p>
<blockquote>
<p>LET THE RECORD STATE THAT AT THIS POINT THE DEFENDANT BEGINS TO SPEAK
IN A LOUD, INTENTIONALLY MUFFLED VOICE –<strong>GEORGE</strong>, CLERK</p>
</blockquote>
<p><em>This is a reminder that all F trains are running on the D line and all
D trains are running on the F line from Broadway Lafayette throughout
Brooklyn. This is, to repeat, an F train running on the D line from
Broadway Lafayette throughout Brooklyn. Thank you for riding MTA New –</em></p>
<p><strong>LORD JUSTICE BA’AL-HA-KHUMUS:</strong> The defendant is reminded to stick to
the facts, the legal facts. This is a tribunal, and we are interested in
the facts and the law, not your acting or story-telling skills.</p>
<p><strong>EUNICE, DEFENDANT:</strong> Of course, my lord. I’m sorry.</p>
<blockquote>
<p>LET THE RECORD STATE THAT THE DEFENDANT DOES NOT SEEM AT ALL SORRY.
–<strong>GEORGE</strong>, CLERK</p>
</blockquote>
<p>Kevin had no excuse. This is not my opinion, but a legal fact. He wasn’t
late for work – he wasn’t even going to work. He wasn’t even late for
brunch: that wasn’t for another two hours, and Peter, the friend he was
meeting up with, was always late. Even if he just completely went to the
wrong place, and had to walk across Brooklyn, he still would have had
time to go home – he had spent the night with a girl he was seeing –
to not just go home, but also to shower and change, and still make it
to brunch on time.</p>
<p>The walk might’ve even done him some good, not in terms of exercise –
exercise he got plenty of, efficiently and perfunctorily, at his office gym
– but in terms of fresh air and a change of pace and scenery. Maybe he
would’ve been able to relax and enjoy life some, to slow down and calm
down. Maybe, just maybe, he would even have been able to avoid his fate.</p>
<p>But in spite of these low stakes, when he heard this announcement,
when he was reminded of this deeply absurd but ultimately trivial
inconvenience, he treated it like an emergency. Carelessly, recklessly,
he leapt up from his seat, too quickly to pay attention to where he was
going, but somehow still slowly enough that his Apple AirPods™ (Pro,
3rd generation) were at no risk of falling out of his ears.</p>
<p>Have I been unfair towards Kevin? I concede, I really do, that it is not
a sin – not a sin <em>per se</em> – to be anxious and rushed. And sometimes I
do question how harshly I judged him for the expensive headphones. I knew
he could well afford them – I knew everything about him just by looking
at him, and when you reach my age you know far more than most people even
think possible. And besides, everyone has the right to listen to music,
the right to ignore strangers on the train who are trying to talk to
you, even innocent, completely harmless old women who are really just
trying to explain to you that the lid on your Starbucks coffee cup is
slightly askew, that – drip, drip, drip – you’re leaking all over the
(admittedly already quite unsanitary) subway seat.</p>
<p>But given what happened next, I think you’ll forgive me all this
judgmentalism. I certainly do not regret any of it.</p>
<blockquote>
<p>LET THE RECORD STATE THAT AT THIS POINT THE DEFENDANT BEGINS TO VIOLENTLY
WAG HER FINGER, AND HER VOICE BECOMES SOMEWHAT CREAKY –<strong>GEORGE</strong>, CLERK</p>
</blockquote>
<p>I regret many things in life. I do rue and lament many of the paths
I’ve walked – and many more that I walked right past. Anyone as old
as myself who claims to have no regrets is fooling themselves or lying,
probably both.</p>
<p>But concerning Kevin I regret absolutely nothing.</p>
<p>Because if you are as careless, as reckless, and as rushed as Kevin –
and again, with no urgency or occasion to justify it – and if you pay
no attention to your surroundings, not even to rudimentary courtesy, not
to mention basic safety, and you cause a poor old woman, who was at the
time using her cane to ever so slowly hoist herself out of her seat, to
not only fall face-first to the floor but to then find herself drenched
and soaked in your still quite hot coffee, well, the least you could
do – the very least – would be to resign yourself to a slightly more
inconvenient trip, to accept a slightly more complicated day, and check
(for more than a split second) whether she’s OK – maybe even help her
back up onto her feet.</p>
<p>But nothing like that from Kevin. Just a split second’s glance, just
enough that anyone could see that he knew what he had done, and no more.
A glimpse, and then out the door he went. Other passengers helped me to
my feet and offered me a handkerchief (from the older Italian gentleman
from the Bronx in a suit) and napkins (from the young generically-white
lesbian transplant from Minnesota) to clean myself. They did this out
of common courtesy. They did this, for me, even though I was old and
strange, even though they had all just heard me scream, like a crazy
person, from the bottom of my gut through the top of my lungs, “MAY
YOU NEVER DRINK COFFEE AGAIN! MAY IT NEVER EVEN TOUCH YOUR LIPS! MAY IT
ENTICE AND ALSO STYMIE YOU!”</p>
<blockquote>
<p>LET THE RECORD STATE THAT THE CAPITAL LETTERS ABOVE INDICATE THAT THE
DEFENDANT IS THROWING HER HEAD BACK, CLOSING HER EYES, AND SCREAMING IN
THE WITNESS STAND. THE SOUND EMERGING FROM HER INHUMANLY DISTENDED MOUTH
IS THAT OF TEN WOMEN SCREAMING THE SAME WORDS SLIGHTLY OUT OF SYNC WITH
EACH OTHER. THE SOUND IS ECHOING BEYOND WHAT MAKES SENSE FOR THE ACOUSTICS
OF THIS ROOM. THE STONE WALLS ARE SHAKING. AND YET, THE JUSTICES ARE
SITTING CALMLY, AS IF THIS WERE A COMPLETELY NORMAL OCCURRENCE. –<strong>GEORGE</strong>,
CLERK</p>
</blockquote>
<p>And to be clear, Kevin knew that what he did was wrong. He even was
able to hear the screams of the curse, and thought it was fair – fair
at least that the old woman was angry, angry enough that he could hear
her, that is my, malediction through the closed train doors and over the
sound of the departing train. He even felt some measure of repentance –
or at least a mental act somewhat cognate to repentance, a Kevin version
of it, if you will. The actual words from his mouth were, to be exact,
“fucking trains,” but behind those words, there was an inkling of a
glimmer of a feeling of a spark of responsibility.</p>
<p>But more than guilt, more than any moral regret, Kevin regretted that
he had no coffee left. He threw the now-empty paper cup in an already
overflowing trash bin on the platform, and pretended not to notice as
it bounced out and fell to the floor. On the next train not only
did he not get a seat, but he had nothing to even sip on.</p>
<p>There was a coffee shop near Kevin’s stop. It was a small, independent
place, with black walls and mismatched wooden tables, with knick-knacks
for sale and a disproportionate number of vegan options on the menu. Kevin
didn’t like the vibe; he preferred the standardization of Starbucks like
the one he’d gone to earlier that morning. It was two blocks out of the way,
but the extra time would be more than made up for by getting someone
else to make coffee rather than grind the beans and prepare it at home –
which always took way longer than he thought it would.</p>
<p>As he approached it, there was something off about it. He didn’t
fully process it at first, but the relative dimness of the storefront,
and the general lack of energy, dampened his mood before he solidified
and verbalized his thoughts, and then had them finally consolidated
and confirmed when he arrived at the door.</p>
<p>It was closed. On a Sunday, somehow. A sign out front said that there was
some sort of an important family matter, and gave no further detail. Kevin
sighed, feeling disappointment combine with his pre-existing grogginess
and need for caffeine – and the connected need for the taste of coffee
and the feel of a drink in his hand.</p>
<p>No matter, there was still a Starbucks. After another block or three of
groggy, belabored, un-New Yorker-like slogging, he found it. The line
was a little long, but the end was in sight.</p>
<p>But the line didn’t move. And then, it didn’t move. And then, it still
didn’t move. The baristas poked at a touch screen and occasionally
muttered something about patience or just needing another minute, until
finally one of them (his name tag announcing him as Jason) decided to
simply call it, and told everyone in the line that their POS system
was broken. There was no alternative, so perhaps they should just
disperse. And so, coffeeless, Kevin finally went home.</p>
<p>Of course, as any coffee addict would do in such a situation, which is to
say any office worker in the entire five boroughs, he went to make himself
coffee as soon as he got home. Unlike many office workers, he defined
“as soon as” a little over-strictly – he did this even before feeding
his rightfully and righteously angry, hungry cat, who had done nothing
to deserve this neglect besides being nice to Kevin in the pet shelter.</p>
<p>But I regress! My lords – and lady – none of you are as old as I am,
so you perhaps have not yet gained as rigorous a Sight as I have. I can
see what happened next as though I was actually there – and of course,
in a sense, I was there, through the words of my curse. It brings a
smile to my lips even now to remember, not only seeing but even hearing,
Kevin rushedly shoveling the beans into his coffee grinder, extra
ones clattering on the floor and bouncing in all directions through his
kitchen, Mittens the cat running after them, thinking they were perhaps a form
of long-awaited food, and finally, once it was vaguely close to full,
him pushing the button and hearing, rather than the normal churning
noise, a mere, half-hearted “whirr-click.”</p>
<p>Not at all the correct sound, you could tell from his face. Undeterred,
he pressed the button on the grinder again. “Whirr-click,
whirr-click-SNAP!”</p>
<blockquote>
<p>LET THE RECORD STATE THAT AT THE WORD “SNAP” THE WITNESS STAND
AND THE DEFENDANT ARE LITERALLY STRUCK BY LIGHTNING. HOW CAN THIS
HAPPEN INDOORS? HOW DID IT NOT CATCH FIRE? HOW DO THE JUSTICES
REMAIN SO IMPASSIVE, EVEN BORED-LOOKING? I AM NOT PAID ENOUGH FOR
THIS. –<strong>GEORGE</strong>, CLERK</p>
</blockquote>
<p>The plastic casing of the grinder cracked, then broke. Streams of beans
flew to the ground. Mittens hissed and zoomed away into the bedroom
closet.</p>
<p>Kevin slowly slid down to the floor, and put his face in his hands. The
sound of his curse was echoing in his ears. He wasn’t a superstitious
man. To the contrary, he had registered himself on the website of an
atheistic, anti-supernaturalist movement called “Brights” and therefore
listed “Bright” under his “Religious Views” on Facebook. He was utterly
committed to the Rationalist cause. He even dabbled in Effective Altruism;
the way to truly make the world a better place and help humanity, he
thought, was to ensure AI alignment. In any case, this is all to say,
he didn’t believe in anything as irrational and superstitious as curses,
and he certainly wasn’t about to start now.</p>
<p>After all, and I think this is quite essential to take note of,
everything that had happened is the sort of thing that just happens
sometimes. Family businesses have family emergencies sometimes. Tills
break down sometimes, POS systems even more often than sometimes. And
even the particular way that that coffee grinder broke, believe it or
not, had happened to a full 1.3% of purchasers, particularly those who,
like Kevin, weren’t good about cleaning it properly. Things do break
sometimes.</p>
<p>And by the way, my esteemed lady and lords, I invite you to investigate if
you don’t believe me. I have nothing to hide.</p>
<p>And yet, Kevin cried, feeling himself slip in his faith in no faith,
his face twitching in the manner that in my several centuries of life has
always indicated dogmatic struggle and religious doubt. Sure, things go
wrong sometimes, he thought, but a person with money in a first world
city and time to spare will eventually be able to purchase coffee,
whatever weird old ladies on the train might say.</p>
<p>Ultimately, a gear clicked into place, and he returned to some semblance
of spiritual stasis. This must be some sort of statistical effect, he
concluded. Most days, after all, don’t have coincidences, but coincidences
do happen sometimes. Sometimes, even coincidences involving very odd
words from very odd old ladies. And perhaps something about his behavior
is being influenced by her, subconsciously making him go places where the
coffee is not available … but then again, how does that make sense?</p>
<p>In any case, he would proceed according to his beliefs. He believed thus:
Coffee can be bought, and curses weren’t real, old ladies or no old
ladies. Maybe not at any individual place, but in New York City, with
enough money and time, a man can eventually drink coffee. To solidify
these beliefs further in his mind, he nodded furiously, as if agreeing
with himself. Then, he got out his phone, and calmly – he certainly
kept on telling himself he was calm – ordered a new coffee grinder for
delivery off of Amazon.</p>
<p>Kevin then splashed some water on his face, grabbed his bag, and walked
out the door. It is unclear whether he heard, as he was leaving, the
muted resumption of meowing from Mittens, who, of course, remained unfed.</p>
<p>I will now jump ahead to the actual brunch with Kevin’s friend Peter, the
incident that the prosecutor focused on. I will skip over the incident
of the Starbucks barista being fired for not only breaking the till but
leading the customers on with the promise of coffee – my supplementary
submission shows clearly that she was about to be fired for other reasons
regardless. I will also skip over the package thief who would later
be struck by a car crossing the street after stealing the replacement
coffee grinder from Kevin’s stoop. He was going to steal packages anyway;
my curse merely redirected him to Kevin’s house. His death was in any case
a result of his own decision to jaywalk, and his reincarnation as a raccoon
as punishment for package thievery seems to me completely justified.
In any case, the prosecutor failed to articulate a valid legal claim for
those incidents, and as I said, I will skip past them, and let my lady and
lords of the tribunal read about them in my brief.</p>
<p>My lady and lords, as we all know, Kevin did eventually meet his friend
Peter for brunch, albeit half an hour late. Even Peter, who was never
on time for anything, was already seated when Kevin arrived, at a small
table on the restaurant’s cozy rear patio, already sipping his bloody
maria, Peter’s new favorite drink, with tequila instead of vodka.
And there beside the novel brunch cocktail, in a small mug, not yet
touched, there sat hot, steaming, freshly poured black coffee.</p>
<p>Kevin was halfway through greeting Peter, “Hey, good to s–,” when he
saw the coffee. Kevin wasn’t a very emotional man, and he certainly
wasn’t easily moved, so he wasn’t all that familiar with the feeling he
felt upon seeing it. His limbs tingled and he involuntarily sharply
gasped for air, and his shoulders and then his legs shook in an all-body
shudder. Peter was looking at his phone and didn’t notice.</p>
<p>Kevin stood there speechless for a few seconds as Peter continued to
fiddle with something on his phone. “Hey, yeah, good to see you too,
man,” he said, somewhat perfunctorily. “Sorry, just give me one –”</p>
<p>“Can I have a sip of your coffee?” Kevin interrupted. The words rushed out
of Kevin’s mouth before any filter could catch them or any social grace
(of which Kevin somehow had some amount) could interfere. A young
woman from the next table looked up in surprise at Kevin’s abruptness,
and then quickly looked away again, pointedly not eavesdropping.</p>
<p>Peter finally looked up from his phone, opened his mouth to talk, thought
better, and closed it again. A second later, he found his words. “Um,
sorry, I’m still being COVID conscious. You know, the waiter will be
right back and then I’m sure you can order your own.”</p>
<p>“Ah,” said Kevin, simply, slightly embarrassed but not so slightly
disappointed in his friend’s dire, even soulless lack of charity in this
troubling time.</p>
<p>Peter looked up at Kevin and decided everything was normal after
all. “Rough day?” he asked.</p>
<p>“You can say that again,” said Kevin. “Two coffee shops were closed.”</p>
<p>Peter smiled. “Well, I’m sure the waiter will be back soon. I do need to
sneak off to the restroom for a moment, though.” Peter stood up
and walked away from the table, calling back “But no stealing! Just wait
for the waiter.”</p>
<p>Kevin sat patiently for a minute before everything happened. And I would
like to just take this opportunity to point out that really, Kevin is
to blame here. Not only was he doing something morally impermissible,
in stealing his friend’s coffee and spreading his spit and germs,
but he was showing a shocking lack of patience – the waiter was even
then walking towards the table, ready to take this late-comer’s drink
order. That, and, Kevin had to know by now, even if he wouldn’t admit
it, that the more drastically he tried to fight the curse, the more
drastic the consequences would also be – and Kevin was responsible,
and is responsible, for all of them.</p>
<p>But mortals never learn anything, my lady and lords. And we can’t take
responsibility for their mistakes. For my curse to work, this action
could not go unresponded to. And so, as the coffee moved towards his lips,
and his heart rose in his chest as he smelled the familiar smell,
a fire alarm went off: “Whoo-OOP! Whoo-OOP!”</p>
<blockquote>
<p>LET THE RECORD STATE THAT AT THIS POINT THE ENTIRE COURTROOM
BEGINS TO GLOW AS THE SHRILL SOUND OF A FIRE ALARM FILLS IT.
–<strong>GEORGE</strong>, CLERK</p>
</blockquote>
<p>Kevin’s hand twitched and the coffee flowed back to the bottom of the
cup. Should he evacuate? Did it make sense to go into the building, where
there might be a fire, rather than just wait out on the patio?
While he was considering this, however, the young woman who was not
eavesdropping on Kevin rushed past him, fleeing towards the fire,
knocking the cup out of Kevin’s hand and shattering it against the
ground.</p>
<p>Not one to resist peer pressure, Kevin also ran into the building, which
was actually burning, and where the grease fire soon gave all of our
mortal characters severe burns. This sent Kevin to the hospital, where
he finally learned wisdom, and since has only ordered tea. Perhaps all
the mortals involved will learn some manners from this incident.</p>
<p>I do not enjoy mortals, my lady and lords, nor do I sympathize with them.
That, however, does not make me a criminal. The curse was, according to
our customs, reasonable and proportionate. It was only Kevin’s willful
defiance of it that resulted in this mayhem, and therefore, he was the
assailant as well as a victim.</p>
<p>My lady and lords, I rest my case.</p>
<blockquote>
<p>AT THIS POINT THE DEFENDANT VANISHES. THE JUSTICES DO NOT SEEM SURPRISED
OR DISTURBED. HOW COULD I HAVE SO VASTLY MISUNDERSTOOD THE REQUIREMENTS
FOR THIS JOB? –<strong>GEORGE</strong>, CLERK</p>
</blockquote>
<p><strong>LADY JUSTICE XYXXYZ:</strong> Thank you. We will now take a brief recess.</p>
<p>[END TRANSCRIPTION]</p>
On ADHD Medication https://www.thecodedmessage.com/posts/meds/ 2023-07-03T00:00:00+00:00
<p>Here’s a story; stop me if you’ve heard it before.</p>
<p>There’s a child, an energetic, enthusiastic child, perhaps hard to deal
with in some ways, but all around just beautiful. And then they go to
a parochial school – or perhaps they just have a rather strict public
school teacher. In either case, the authority figure makes it their
wicked mission to suppress all the beautiful children’s personalities into
identical, well-behaved zombies in the interest of the idol of order. Only
our heroic child remains with their own personality, constantly getting
in trouble for it but remaining themselves.</p>
<p>In the next stage of the story, the villain makes their move. A teacher,
or a principal or a school nurse can’t handle the child, who admittedly
can be a handful sometimes. They suggest the child has ADHD, and put
the poor child on medication. Now, with the power of the conformity pill,
this child’s beautiful flowering of personhood has been bleached to the
same level as all the other children, “proper” and “well-behaved” –
which is to say, boring. And perhaps that is “just a shame” – or perhaps
a heroic parent removed the medication, or some other “happy ending”
intervenes in the third act.</p>
<p>In any case, the story is concluded with self-assured tsking against
those who would pathologize childhood and good spirits, and maybe
against the overdiagnosis of ADHD … or, if the tellers are bold,
the entire concept of ADHD. The moral is clear: Keep the zombie drugs
away from our amazing, perfectly normal children.</p>
<p>I’ve heard this story many times. I’ve read this story. I’ve heard this
story first-hand or second-hand or as rumor, in in-person conversations
and on Facebook posts, from parents and family and friends. Sometimes,
people tell me some variation of this when I tell them I’ve started taking
ADHD medication – an odd choice, given that few of them are really close
enough for it to be appropriate for them to try to undermine my medical
decisions. I will say, however, that it doesn’t count in my mind when
people tell this story about themselves – that is either second-hand
from parents’ framing of the narrative, or a different (but rarer than
you might think) effect which deserves an entirely different blog post.</p>
<p>But I think that the story as commonly told has some huge gaps. Or
rather, that we’re getting the wrong moral out of it by not thinking
critically about what’s going on. Obviously, from the fact that I take
ADHD medication, I think it is a good thing, often necessary, often
useful. In this I include stimulants (even though for unrelated reasons
I’m not on stimulants). So of course, I get a different take-away from
this story.</p>
<p>This is difficult to explain, because I do know that Adderall and
other ADHD medications do sometimes have unwanted personality-altering
side effects. I am also not sympathetic with the villain in the story
– I am also not a fan of the near-performative overconformity of
parochial schools, nor am I enamoured of “strict” or overly “disciplined”
environments for children, no matter how their brains work. But in spite
of all of these caveats, I still don’t buy into the premise that, in
this story, the school used medication to “turn the child into a zombie.”</p>
<p>Here’s the key point: In this story, generally, all the children,
medicated or not, are eventually turned into zombies. Normally, whether
spelled out or implied, we understand that the school or teacher only
has to resort to medication for its zombification for one child, or
perhaps a few children. What zombifies the other children? Or, from
the school’s perspective, makes them well-behaved? It is not the ADHD
medication that makes the unmedicated children behave “like a zombie,”
or even the medicated children, but rather some form of social pressure.</p>
<p>So why doesn’t the social pressure work on the protagonist of the story
without medication? I do think if they act differently than all the
other children without ADHD medication, and then the same as the other
children with ADHD medication, that probably means they do have ADHD. And
if there’s so much social pressure on these children that all the other
ones behave in an orderly fashion, and the ADHD child does not, that
probably really is bad for the ADHD child. They’re probably not enjoying
their flowery personality in such an environment. They certainly still
get all the downsides of the social pressure – without the upside
of even having the ability to conform to it.</p>
<p>See, most children are capable of behaving at various levels of enthusiasm
and mutedness, chaos and orderliness. The other children in the class
know that, in this school, they are expected to behave a certain way. The
ADHD child surely also knows that, but they find that they cannot. Their
unmedicated behavior isn’t some flowering of their true self or rebellion
in favor of being human – it’s a sign that they can’t do something the
other children can. It’s a sign of their disability.</p>
<p>The fact that the child’s behavior changes with the medication takes on a
new interpretation in this context. The medication doesn’t turn the child
into a zombie; it gives the child self-control. In the social situation
of a “strict” school, the child chooses to use that self-control to
conform and act as a “zombie” – for the same reason the other children
are conforming.</p>
<p>Here’s where that matters: Imagine what the child could do with that
medication, and that improved self-control, in another environment!
ADHD isn’t just about whether a child is frustrating to overly strict
teachers – that’s just one outward effect, and a relatively minor
one at that. In another environment, they will be able to show their
personality (like other kids would), but will also be able to use that
self-control to accomplish their goals. When they’re older, they’ll
be able to finish larger projects, pursue their interests, and live
more satisfying lives, because far from being overblown or made up,
<a href="https://www.thecodedmessage.com/posts/adhd-philosophy/">ADHD is a serious disorder</a> that affects
much more than the ability of children to become obedient in service to
strict adults.</p>
<p>And, of course, if left untreated, many children with ADHD will have
difficulty actually behaving well even in a non-strict environment. If
a school is so strict it makes the children into conformist zombies,
it has gone too far, but children do in fact need some level of
discipline, to prevent them from doing harm to themselves and to
others. In many cases, the medication not only helps the children
conform to overly strict authority figures, but also to reasonable ones,
a goal we should all be on board with.</p>
<p>So, if ADHD medication is called for, for yourself or for your children,
please don’t avoid it because of this trope.</p>
<p>Now, I’m not a doctor, and I don’t mean to say that you shouldn’t be
careful with medication. Zombie-like feeling and behavior can indeed
be a sign of bad dosage or a bad medication match. But if by “zombie”
you mean what the school would call “good behavior,” there is another,
perhaps even more likely explanation: that the social pressures were
such that any child who could behave like that would, and now the
ADHD child can.</p>
<p>And here’s one way I know something about this: Because I had similar
concerns when I started medication myself, as an adult. I asked my friends
to pay careful attention to my personality, and whether anything about
it changed. I was very concerned about inadvertent personality changes,
and wanted my friends to pay special attention to that – while I paid
attention to it in myself as well.</p>
<p>The dreaded personality changes never came. But some non-dreaded
personality changes did come, with the increased self-control. I became
less anxious, and less likely to randomly demand that my friends explain
to me how they don’t hate me. And as I gradually increased the dosage
– as you have to do with Strattera – I noticed other changes, changes
that might have potentially been seen by some as negative or concerning,
but which from an internal perspective were clearly positive.</p>
<p>What do I mean by this? Let me tell you what sort of changes I’m talking
about. For example, I’ve been less outgoing. I’ve been less outgoing in
the literal sense of going out less, and also in the sense of spending
more of my at-home time alone, rather than on the phone. But this isn’t
because I enjoy those things less; rather, it’s because I’m enjoying my
alone time more. It’s because I’m better able to leverage my planning
and self-control skills towards goals, goals like saving money and not
eating too much.</p>
<p>See, I have historically had so much trouble doing things at home on my
own. I have a lot of things I’d love to do more of, things that I clearly
know how to do and can do, but which I only get myself started on if
other people are around. Leveraging the presence of other people is a
common ADHD coping technique known as “body doubling,” and it is one of
many techniques to do the types of tasks which ADHD makes difficult –
which at some level, is most tasks.</p>
<p>So before I went on medication I would spend as much time out and about
as possible. Need to do work? Go to the coffee shop. Need to read a
book? Go to the bar. Need to figure out some thoughts? Discuss them
with someone. Need to clean my house? Invite someone over to clean it,
or even just to hang out with me while I clean it – that works too, and
makes me feel a little less bad. I could do with just the encouragement,
similar to a personal trainer who may be there more to nag you into
exercising than actually educate you about it in any way.</p>
<p>But now I’m medicated, and I’ve finally found a dosage that works
for me. It’s not 100% better, of course, but it’s a vast improvement.
And that means I’m suddenly getting work done at home. That means I’m
occasionally even cleaning my own house. That means that I’m suddenly
actually getting use out of my alone time – and so I am taking much
more of it.</p>
<p>But sometimes, I worry that this may be a personality change, and a
bad one at that. Am I losing my charm? My outgoing nature? I briefly
get trapped by the narrative above, which I have heard so many times,
and I think, “Oh no! My ADHD medication has made me boring.” But then I
realize I could, if I so chose, go out as much as I used to. It’s just
that the other options got better. The medication is just helping me.</p>
<p>And I am so grateful for it. Before, it was like I had a menu of fun
things I could do for no cost in terms of extreme effort (basically all
social), and also a menu of fun things that I’d theoretically like to
be able to do, but would be so difficult to wrangle myself into doing
them that they were out of the question, beyond special occasions or
situations where other people were around. Making myself clean or even
practice piano had become analogous to going to a nice restaurant very
occasionally to treat yourself. I had to build my day-to-day life out
of the easy tasks, which was mostly the social tasks – not exactly fun,
and confusing for the people around me.</p>
<p>But now, the whole menu of activities is available to me all the time
– or at least more of it. The harder tasks still take some wrangling,
but the wrangling is way easier. That means I stay in more, and there are
probably other changes in my personality that in isolation seem negative,
both my apparent personality and the way I approach things. But these
changes are usually because I can actually accomplish my goals.</p>
<p>And so, I am deeply grateful for my medication, and that is why I am so
sad when ill-thought-out narratives perpetuate stigma against medication,
especially for children who can’t make their own medication decisions.
The effects of ADHD medication can be wide-ranging and complicated, so
it’s important to think critically in evaluating them. Anti-medication
narratives are often emotionally compelling but ultimately oversimplified,
ignoring alternative explanations for what happens. So it’s important
to actually think them through, and pull them apart.</p>
<p>Hopefully reading this has provided some practice doing so.</p>
Walk-Through: Prefix Ranges in Rust, a Surprisingly Deep Dive https://www.thecodedmessage.com/posts/prefix-ranges/ 2023-06-24T00:00:00+00:00
<p><em><strong>Update:</strong> <a href="https://vorpal.se/">Arvid Norlander</a> has
gone through the trouble of refactoring this code into a
<a href="https://crates.io/crates/prefix-range">crate</a> and publishing it. Thank
you, Arvid!</em></p>
<p>Rust’s <code>BTreeMap</code> and corresponding <code>BTreeSet</code> are excellent,
B-tree-based sorted map and set types. The map implements
the ergonomic <a href="https://www.thecodedmessage.com/posts/rust-map-entry/">entry API</a>, more flexible
than other map APIs, made possible by the borrow checker.
They are implemented with the more performant but more gnarly
<a href="https://en.wikipedia.org/wiki/B-tree">B-Tree</a> data structure, rather than
the more common <a href="https://en.wikipedia.org/wiki/AVL_tree">AVL trees</a> or
<a href="https://en.wikipedia.org/wiki/Red%E2%80%93black_tree">red-black trees</a>.
All in all, they are an excellent piece of engineering, and an excellent
standard library feature.</p>
<p>But they aren’t perfect, as I learned recently when I had a very specific
operation that I needed to perform on one. I scanned the method lists
diligently, trying to find the one I needed, but it was not there. <code>range</code>
was close, but not quite there, and so I would simply have to implement
the operation by hand. <code>range</code> is defined based on a start key (where,
at our option, it includes keys that are greater than or equal to that
key, or strictly greater than that key) and an end key (where the keys in
the range are either less than or equal, or strictly less than that key).</p>
<p>Here is an example of the use of <code>range</code>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">let</span> set <span style="color:#f92672">=</span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> <span style="color:#66d9ef">mut</span> set <span style="color:#f92672">=</span> BTreeSet::new();
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> set.insert(<span style="color:#e6db74">"ABC"</span>);
</span></span><span style="display:flex;"><span> set.insert(<span style="color:#e6db74">"DEF"</span>);
</span></span><span style="display:flex;"><span> set.insert(<span style="color:#e6db74">"DEG"</span>);
</span></span><span style="display:flex;"><span> set.insert(<span style="color:#e6db74">"HIJ"</span>);
</span></span><span style="display:flex;"><span> set.insert(<span style="color:#e6db74">"KLM"</span>);
</span></span><span style="display:flex;"><span> set.insert(<span style="color:#e6db74">"NOP"</span>);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> set
</span></span><span style="display:flex;"><span>};
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">for</span> elem <span style="color:#66d9ef">in</span> set.range(<span style="color:#e6db74">"DEF"</span><span style="color:#f92672">..</span><span style="color:#e6db74">"N"</span>) {
</span></span><span style="display:flex;"><span> println!(<span style="color:#e6db74">"</span><span style="color:#e6db74">{elem}</span><span style="color:#e6db74">"</span>);
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>It outputs starting with <code>"DEF"</code>, continuing in order
through the set, but not including <code>"NOP"</code>, as that is
greater than <code>"N"</code> (lexicographically and therefore according to <code>&str</code>’s
<code>Ord</code> instance). If <code>"N"</code> were in the set, it would not be printed,
as <code>..</code> is exclusive on the right side. <code>..=</code> would include it.</p>
<blockquote>
<p><strong>Maps and sets: A brief aside</strong></p>
<p>This discussion only concerns the keys of a map.
For simplicity’s sake, throughout the discussion, I’ll be using
<code>BTreeSet</code>, a wrapper around <code>BTreeMap</code> for when there are
just keys (that are still unique and sorted) and no values. Internally,
it contains a <code>BTreeMap</code> with the zero-sized struct <code>SetValZST</code> as its
value type.</p>
</blockquote>
<h1 id="the-problem">The Problem</h1>
<p>But that isn’t the exact operation I needed. I needed all of the keys
(which were also <code>String</code>) that started with a certain prefix. So, if
the set was as in the example above, and the prefix was <code>"DE"</code>, this
operation would give me <code>"DEF"</code> and <code>"DEG"</code>. As you can see from the example,
and as is easy to prove in general, when the keys are sorted, all the
keys starting with a prefix form a contiguous range. But it is not
a range that can be expressed with the <code>range</code> operation.</p>
<p>It’s close, tantalizingly close. Due to the definition of <code>Ord</code> on
<code>String</code>, our prefix-based range starts with the first key that is
greater than or equal to the prefix, as strings starting with a prefix
always compare greater than or equal to the prefix. This side of the range
is therefore expressible with the <code>range</code> operation.</p>
<p>It’s the other side that causes the problem. We don’t have a key that
is greater than every key starting with the prefix. We know that once we
hit a key string that doesn’t start with the prefix, it must be greater
than all the keys that do, as must all subsequent ones, but we cannot
express this bound easily in terms of the prefix. We would need an element
that is either the greatest possible key that starts with that prefix,
or else the least possible key that does not.</p>
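<p>Of those two options, the first one cannot work, because there is no greatest possible key with a given prefix: any candidate can be made longer. Here is a small sketch of why; <code>naive_upper_bound</code> and <code>counterexample</code> are hypothetical names used only for this illustration:</p>

```rust
/// A tempting but wrong upper bound: the prefix followed by the largest `char`.
fn naive_upper_bound(prefix: &str) -> String {
    format!("{prefix}{}", char::MAX)
}

/// A key that still starts with `prefix`, yet sorts after `naive_upper_bound(prefix)`.
fn counterexample(prefix: &str) -> String {
    // Appending anything to a string yields a strictly greater string.
    format!("{}A", naive_upper_bound(prefix))
}
```

<p>That leaves the second option, the least possible key that does not start with the prefix.</p>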
<p>There is a lot of efficiency to be gained by taking advantage of the fact
that the range we want is contiguous, which is why the <code>range</code> method exists.
But there is no operation that covers this scenario, because of the
narrowness of how the <code>range</code> operation is defined.</p>
<p>On the one hand, this is frustrating. We are so close to being able to
do this straightforwardly with the provided operations. It also seems
like it would be more performant to determine the bounds of that range
by doing a tree search, rather than trying to implement this operation by
hand. Without this operation being available, we seem doomed to slowness.</p>
<p>On the other hand, it’s understandable. The key type of a map is only
really expected to implement the <code>Ord</code> trait, and nothing about <code>Ord</code> has
anything to do with prefixes. Creating ranges with <code>range</code> was allowed,
but based on inclusive and exclusive bounds, which is to say, purely
based on ordering of opaque elements. Evaluating a prefix as a range,
on the other hand (or even merely proving that the keys forming a prefix
do indeed constitute a contiguous range) would be outside of the scope
of the operations represented by the <code>Ord</code> trait.</p>
<p>So I needed a way of getting keys that start with a specific prefix.
So what did I do? I simply coded a manual form of the operation,
looping starting from the beginning of the range, and checking each
iteration whether we’d left the range yet:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">for</span> key <span style="color:#66d9ef">in</span> keys.range::<span style="color:#f92672"><</span>String, _<span style="color:#f92672">></span>((Bound::Included(prefix), Bound::Unbounded)) {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> <span style="color:#f92672">!</span>key.starts_with(prefix) {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">break</span>; <span style="color:#75715e">// We've gone past the end of the range
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> }
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// ... Actually do something with the key
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>}
</span></span></code></pre></div><p>This seemed reasonable enough. My colleagues asked me to put in a comment
to clarify that, since the map was sorted, all the items with a prefix
would be contiguous, and therefore <code>break</code> was correct and not <code>continue</code>.
It worked, and was performant enough for my purposes in writing the code,
but perhaps not as performant as it could ideally be. I couldn’t help
but wonder if it could be made a little more performant if it were part of the
standard library, if we had insight into and ability to access the inner
structure of how a <code>BTreeSet</code> is laid out. Obviously, in such a case
the code would also be more concise, and (more importantly) obviously
correct, without need for a comment.</p>
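<p>One way to make the early exit more self-explanatory is to express the same loop with <code>take_while</code>. This is only a sketch: <code>with_prefix</code> is a hypothetical helper, and it assumes the keys live in a <code>BTreeSet&lt;String&gt;</code> and the prefix is a <code>&amp;str</code>:</p>

```rust
use std::collections::BTreeSet;
use std::ops::Bound;

/// Hypothetical helper: iterates over the keys that start with `prefix`.
/// It relies on the set being sorted, so the matching keys are contiguous
/// and iteration can stop at the first key that does not match.
fn with_prefix<'a>(
    keys: &'a BTreeSet<String>,
    prefix: &'a str,
) -> impl Iterator<Item = &'a String> + 'a {
    keys.range::<str, _>((Bound::Included(prefix), Bound::Unbounded))
        .take_while(move |key| key.starts_with(prefix))
}
```

<p>This does the same work as the explicit loop; it only changes how the stopping condition is spelled.</p>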
<p>The performance considerations, if present, however, would be
minimal. Looping through a <code>BTreeSet</code> is a reasonable operation, and I
took advantage of the fact that my range was contiguous to stop once we’d
gone past the last item. At best, explicit library support for prefixes
would simply detect this condition slightly sooner, further up in the
tree, without having to actually find the node with the offending item.</p>
<p>The next bit of code I wrote was for a closely related operation: dropping
values outside of the prefix. What I wrote seemed like it definitely
would be substantially less performant than a specially coded operation
from the standard library would be. It certainly was harder to prove correct:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">prefixed</span>(<span style="color:#66d9ef">mut</span> set: <span style="color:#a6e22e">BTreeSet</span><span style="color:#f92672"><</span>String<span style="color:#f92672">></span>, prefix: <span style="color:#66d9ef">&</span><span style="color:#66d9ef">str</span>) -> <span style="color:#a6e22e">BTreeSet</span><span style="color:#f92672"><</span>String<span style="color:#f92672">></span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> <span style="color:#66d9ef">mut</span> set <span style="color:#f92672">=</span> set.split_off(prefix);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> not_in_prefix <span style="color:#f92672">=</span> (<span style="color:#f92672">&</span>set).iter().find(<span style="color:#f92672">|</span>s<span style="color:#f92672">|</span> <span style="color:#f92672">!</span>s.starts_with(prefix));
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> not_in_prefix <span style="color:#f92672">=</span> not_in_prefix.map(<span style="color:#f92672">|</span>s<span style="color:#f92672">|</span> s.to_owned());
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> <span style="color:#66d9ef">let</span> Some(not_in_prefix) <span style="color:#f92672">=</span> not_in_prefix {
</span></span><span style="display:flex;"><span> set.split_off(<span style="color:#f92672">&</span>not_in_prefix);
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> set
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>This uses two calls to <code>split_off</code>, which like <code>range</code> needs a concrete
<code>T</code>, a concrete <code>String</code>, to serve as a comparison-point for where to
split. And it is certainly less performant than a dedicated method would
have been, as it also uses a call to <code>find</code> to find a concrete <code>String</code>
for the end of the range, which constitutes an additional loop through
all the strings in the range.</p>
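<p>For readers unfamiliar with <code>split_off</code>, here is a minimal sketch of its behavior: it leaves the elements less than the given key in the original set and returns the elements greater than or equal to it. The wrapper <code>split_at_key</code> is a hypothetical name for this illustration:</p>

```rust
use std::collections::BTreeSet;

/// Hypothetical wrapper: returns (elements < key, elements >= key).
fn split_at_key(
    mut set: BTreeSet<String>,
    key: &str,
) -> (BTreeSet<String>, BTreeSet<String>) {
    // `split_off` mutates `set` in place and returns the upper part.
    let upper = set.split_off(key);
    (set, upper)
}
```

<p>The key does not have to be present in the set, which is why <code>prefixed</code> can pass the bare prefix to the first call.</p>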
<h1 id="questions">Questions</h1>
<p>This raised two questions in my mind:</p>
<ol>
<li>
<p>Is there a way to convert a prefix into a range that can be used with
<code>range</code> and <code>split_off</code>? More concretely, is there a way to construct
a <code>String</code> such that it is the least possible <code>String</code> that is still
greater than all the possible strings that start with our prefix,
but less than or equal to all strings that do not? Would doing so
in fact improve performance?</p>
</li>
<li>
<p>How hard would it be to add this feature to the standard library,
both for iterating and for splitting the set?</p>
</li>
</ol>
<p>In this blog post, we will focus on the first question. The second
question is reserved for a future blog post.</p>
<h1 id="testing-prefixed">Testing <code>prefixed</code></h1>
<p>The <code>prefixed</code> function needs the optimization more than the loop,
so we’ll focus on that in our discussion. And as we’re discussing an
optimization of the <code>prefixed</code> function, and as it is in any case a
gnarly function, we will want to write some unit tests for it.</p>
<p>Here’s one example:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#75715e">#[test]</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">it_works</span>() {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> set <span style="color:#f92672">=</span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> <span style="color:#66d9ef">mut</span> set <span style="color:#f92672">=</span> BTreeSet::new();
</span></span><span style="display:flex;"><span> set.insert(<span style="color:#e6db74">"Hi"</span>.to_string());
</span></span><span style="display:flex;"><span> set.insert(<span style="color:#e6db74">"Hey"</span>.to_string());
</span></span><span style="display:flex;"><span> set.insert(<span style="color:#e6db74">"Hello"</span>.to_string());
</span></span><span style="display:flex;"><span> set.insert(<span style="color:#e6db74">"heyyy"</span>.to_string());
</span></span><span style="display:flex;"><span> set.insert(<span style="color:#e6db74">""</span>.to_string());
</span></span><span style="display:flex;"><span> set.insert(<span style="color:#e6db74">"H"</span>.to_string());
</span></span><span style="display:flex;"><span> set
</span></span><span style="display:flex;"><span> };
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> set <span style="color:#f92672">=</span> prefixed(set, <span style="color:#e6db74">"H"</span>);
</span></span><span style="display:flex;"><span> assert_eq!(set.len(), <span style="color:#ae81ff">4</span>);
</span></span><span style="display:flex;"><span> assert!(<span style="color:#f92672">!</span>set.contains(<span style="color:#e6db74">"heyyy"</span>));
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>This probably isn’t enough. Additional unit tests will be left as an
exercise to the reader.</p>
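<p>As a starting point for that exercise, here is a sketch of a few more cases: an empty prefix, a prefix that matches nothing, and a multi-byte prefix. The <code>prefixed</code> function is repeated (lightly condensed) so the block compiles on its own, and <code>set_of</code> is a hypothetical test helper:</p>

```rust
use std::collections::BTreeSet;

// Repeated from above, lightly condensed, so this sketch is self-contained.
fn prefixed(mut set: BTreeSet<String>, prefix: &str) -> BTreeSet<String> {
    let mut set = set.split_off(prefix);
    let not_in_prefix = set.iter().find(|s| !s.starts_with(prefix)).cloned();
    if let Some(not_in_prefix) = not_in_prefix {
        set.split_off(&not_in_prefix);
    }
    set
}

/// Hypothetical test helper: builds a set of owned strings.
fn set_of(items: &[&str]) -> BTreeSet<String> {
    items.iter().map(|s| s.to_string()).collect()
}

#[test]
fn empty_prefix_keeps_everything() {
    assert_eq!(prefixed(set_of(&["a", "b"]), "").len(), 2);
}

#[test]
fn unmatched_prefix_yields_empty_set() {
    assert!(prefixed(set_of(&["Hi", "heyyy"]), "x").is_empty());
}

#[test]
fn multi_byte_prefix() {
    let kept = prefixed(set_of(&["f", "é", "éa", "ê"]), "é");
    assert_eq!(kept, set_of(&["é", "éa"]));
}
```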
<h1 id="constructing-an-upper-bound">Constructing an upper bound</h1>
<p>So, let us return to our example. In our example, the prefix was <code>"DE"</code>.
As discussed, the lower bound is easy: Everything that starts with
<code>"DE"</code> is greater than or equal to <code>"DE"</code>. Strings outside of the range
to the left are not:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span>println!(<span style="color:#e6db74">"</span><span style="color:#e6db74">{}</span><span style="color:#e6db74">"</span>, <span style="color:#e6db74">"DD"</span> <span style="color:#f92672">>=</span> <span style="color:#e6db74">"DE"</span>); <span style="color:#75715e">// Prints "false"
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>println!(<span style="color:#e6db74">"</span><span style="color:#e6db74">{}</span><span style="color:#e6db74">"</span>, <span style="color:#e6db74">"DE"</span> <span style="color:#f92672">>=</span> <span style="color:#e6db74">"DE"</span>); <span style="color:#75715e">// Prints "true"
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>println!(<span style="color:#e6db74">"</span><span style="color:#e6db74">{}</span><span style="color:#e6db74">"</span>, <span style="color:#e6db74">"DEF"</span> <span style="color:#f92672">>=</span> <span style="color:#e6db74">"DE"</span>); <span style="color:#75715e">// Prints "true"
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>println!(<span style="color:#e6db74">"</span><span style="color:#e6db74">{}</span><span style="color:#e6db74">"</span>, <span style="color:#e6db74">"DEG"</span> <span style="color:#f92672">>=</span> <span style="color:#e6db74">"DE"</span>); <span style="color:#75715e">// Prints "true"
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>println!(<span style="color:#e6db74">"</span><span style="color:#e6db74">{}</span><span style="color:#e6db74">"</span>, <span style="color:#e6db74">"DF"</span> <span style="color:#f92672">>=</span> <span style="color:#e6db74">"DE"</span>); <span style="color:#75715e">// Still prints "true" -- need something
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>println!(<span style="color:#e6db74">"</span><span style="color:#e6db74">{}</span><span style="color:#e6db74">"</span>, <span style="color:#e6db74">"NOP"</span> <span style="color:#f92672">>=</span> <span style="color:#e6db74">"DE"</span>); <span style="color:#75715e">// Still prints "true" -- need something
</span></span></span></code></pre></div><p>The upper bound is also easy enough, actually – we just need to
increment the last character. Anything that starts with <code>"DE"</code>
will also compare strictly less than <code>"DF"</code>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span>println!(<span style="color:#e6db74">"</span><span style="color:#e6db74">{}</span><span style="color:#e6db74">"</span>, <span style="color:#e6db74">"DE"</span> <span style="color:#f92672"><</span> <span style="color:#e6db74">"DF"</span>); <span style="color:#75715e">// Prints "true"
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>println!(<span style="color:#e6db74">"</span><span style="color:#e6db74">{}</span><span style="color:#e6db74">"</span>, <span style="color:#e6db74">"DEF"</span> <span style="color:#f92672"><</span> <span style="color:#e6db74">"DF"</span>); <span style="color:#75715e">// Prints "true"
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>println!(<span style="color:#e6db74">"</span><span style="color:#e6db74">{}</span><span style="color:#e6db74">"</span>, <span style="color:#e6db74">"DEG"</span> <span style="color:#f92672"><</span> <span style="color:#e6db74">"DF"</span>); <span style="color:#75715e">// Prints "true"
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>println!(<span style="color:#e6db74">"</span><span style="color:#e6db74">{}</span><span style="color:#e6db74">"</span>, <span style="color:#e6db74">"DF"</span> <span style="color:#f92672"><</span> <span style="color:#e6db74">"DF"</span>); <span style="color:#75715e">// Prints "false"
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>println!(<span style="color:#e6db74">"</span><span style="color:#e6db74">{}</span><span style="color:#e6db74">"</span>, <span style="color:#e6db74">"NOP"</span> <span style="color:#f92672"><</span> <span style="color:#e6db74">"DF"</span>); <span style="color:#75715e">// Prints "false"
</span></span></span></code></pre></div><p>This seems easy enough to handle. We just need to write a function that
increments the last character in a string, something with this
signature:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">upper_bound_from_prefix</span>(prefix: <span style="color:#66d9ef">&</span><span style="color:#66d9ef">str</span>) -> String;
</span></span></code></pre></div><p>Incrementing the last character in a string seems like it’s
just a matter of incrementing the last byte, so let’s see
what that looks like:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">upper_bound_from_prefix</span>(prefix: <span style="color:#66d9ef">&</span><span style="color:#66d9ef">str</span>) -> String {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> <span style="color:#66d9ef">mut</span> prefix <span style="color:#f92672">=</span> prefix.to_string();
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">unsafe</span> {
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// SAFETY: It is not. ☹️. XXX
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#66d9ef">let</span> prefix_bytes <span style="color:#f92672">=</span> prefix.as_bytes_mut();
</span></span><span style="display:flex;"><span> prefix_bytes[prefix_bytes.len() <span style="color:#f92672">-</span> <span style="color:#ae81ff">1</span>] <span style="color:#f92672">+=</span> <span style="color:#ae81ff">1</span>;
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span> prefix
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Well, that’s not good. It passes the unit test I wrote, but that’s because
we need to write more unit tests. Unfortunately, like many programmers
before us, we have forgotten about UTF-8. Rust requires all its strings
to be stored as valid UTF-8 as a safety invariant. Fortunately, because
we’re using Rust, we notice that we’re violating this invariant when an
operation we have to invoke is marked as <code>unsafe</code>.</p>
<p>In order to capture this failure, we would have to write a unit
test where the prefix ends in a multi-byte Unicode character.
Unfortunately, because this is a safety issue, the test might not
even fail (but it might be worth doing as an exercise anyway).</p>
<p>That isn’t even to mention the possibility that the prefix is empty,
which would result in a panic in this code!</p>
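<p>The UTF-8 problem can be observed without any <code>unsafe</code> by doing the byte increment on a copy of the bytes and asking <code>String::from_utf8</code> to validate the result. This is a sketch for experimentation only; <code>increment_last_byte</code> is a hypothetical name:</p>

```rust
use std::string::FromUtf8Error;

/// Increments the last byte of `s` on a copy and validates the result.
/// Returns `None` if `s` is empty (or if the byte would overflow).
fn increment_last_byte(s: &str) -> Option<Result<String, FromUtf8Error>> {
    let mut bytes = s.as_bytes().to_vec();
    let last = bytes.last_mut()?;
    *last = last.checked_add(1)?;
    // Validation fails if the incremented byte no longer forms valid UTF-8.
    Some(String::from_utf8(bytes))
}
```

<p>For an ASCII prefix like <code>"DE"</code> this produces <code>"DF"</code>, but for <code>"ÿ"</code> (bytes <code>0xC3 0xBF</code>) the incremented bytes are not valid UTF-8, and for an empty prefix there is no byte to increment at all.</p>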
<p>So, how can we get the last character of a string? <code>get</code> allows us
to do substrings with byte indexes, but returns <code>None</code> if it is not a
valid substring. We can loop backwards until we find an index that
works for the split, and we can return an option in case the string
is empty:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">upper_bound_from_prefix</span>(prefix: <span style="color:#66d9ef">&</span><span style="color:#66d9ef">str</span>) -> Option<span style="color:#f92672"><</span>String<span style="color:#f92672">></span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">for</span> i <span style="color:#66d9ef">in</span> (<span style="color:#ae81ff">0</span> <span style="color:#f92672">..</span> prefix.len()).rev() {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> <span style="color:#66d9ef">let</span> Some(last_char_str) <span style="color:#f92672">=</span> prefix.get(i<span style="color:#f92672">..</span>) {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> rest_of_prefix <span style="color:#f92672">=</span> {
</span></span><span style="display:flex;"><span> debug_assert!(prefix.is_char_boundary(i));
</span></span><span style="display:flex;"><span> <span style="color:#f92672">&</span>prefix[<span style="color:#ae81ff">0</span><span style="color:#f92672">..</span>i]
</span></span><span style="display:flex;"><span> };
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// ???
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> }
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> None
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>But that gives us two <code>str</code>s, and we want to increment a char.
So we have to extract the singular <code>char</code> from the <code>last_char_str</code>,
which we know to have exactly one <code>char</code> in it. Looking over the
operations of <code>str</code>, we have only one real option:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">let</span> last_char <span style="color:#f92672">=</span> last_char_str
</span></span><span style="display:flex;"><span> .chars()
</span></span><span style="display:flex;"><span> .next()
</span></span><span style="display:flex;"><span> .expect(<span style="color:#e6db74">"last_char_str will contain exactly one char"</span>);
</span></span></code></pre></div><h1 id="walking-through-chars">Walking Through <code>char</code>s</h1>
<p>But once we do have a <code>char</code>, we cannot simply do <code>+ 1</code> on it. This
operation isn’t defined on a <code>char</code>. And before you say that we should
convert it to <code>u32</code> and back, you should know that the operation is
left undefined on <code>char</code> for a reason. <code>char</code>s are supposed to remain
valid Unicode scalar values, which excludes the surrogate range <code>0xD800..=0xDFFF</code>.</p>
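<p>A quick sketch shows where the naive round trip through <code>u32</code> goes wrong. The function name <code>naive_next</code> is hypothetical; <code>char::from_u32</code> is the checked conversion from the standard library:</p>

```rust
/// Naive "+ 1" on a `char` by round-tripping through `u32`.
/// Returns `None` when the resulting number is not a valid `char`.
fn naive_next(c: char) -> Option<char> {
    // `c as u32` is at most 0x10FFFF, so the addition cannot overflow.
    char::from_u32(c as u32 + 1)
}
```

<p>This works for <code>'E'</code>, but for <code>'\u{D7FF}'</code> it returns <code>None</code>, because <code>0xD800</code> is a surrogate, even though there are plenty of valid <code>char</code>s after the surrogate range.</p>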
<p>So, we must do something else that will skip over invalid code points.
There is no obvious operation in <code>char</code> that will do it, but if
we look in the “Trait Implementations” section, we find something
that looks potentially relevant: <code>Step</code>. And looking at
<code>char</code>’s implementation of <code>Step</code>, we see the
<a href="https://github.com/rust-lang/rust/blob/ee8b035faba3728644d36fbb689feb8047b965e6/library/core/src/iter/range.rs#L426-L440">exact function</a>
we want:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">forward_checked</span>(start: <span style="color:#66d9ef">char</span>, count: <span style="color:#66d9ef">usize</span>) -> Option<span style="color:#f92672"><</span><span style="color:#66d9ef">char</span><span style="color:#f92672">></span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> start <span style="color:#f92672">=</span> start <span style="color:#66d9ef">as</span> <span style="color:#66d9ef">u32</span>;
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> <span style="color:#66d9ef">mut</span> res <span style="color:#f92672">=</span> Step::forward_checked(start, count)<span style="color:#f92672">?</span>;
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> start <span style="color:#f92672"><</span> <span style="color:#ae81ff">0xD800</span> <span style="color:#f92672">&&</span> <span style="color:#ae81ff">0xD800</span> <span style="color:#f92672"><=</span> res {
</span></span><span style="display:flex;"><span> res <span style="color:#f92672">=</span> Step::forward_checked(res, <span style="color:#ae81ff">0x800</span>)<span style="color:#f92672">?</span>;
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> res <span style="color:#f92672"><=</span> <span style="color:#66d9ef">char</span>::<span style="color:#66d9ef">MAX</span> <span style="color:#66d9ef">as</span> <span style="color:#66d9ef">u32</span> {
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// SAFETY: res is a valid unicode scalar
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#75715e">// (below 0x110000 and not in 0xD800..0xE000)
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> Some(<span style="color:#66d9ef">unsafe</span> { <span style="color:#66d9ef">char</span>::from_u32_unchecked(res) })
</span></span><span style="display:flex;"><span> } <span style="color:#66d9ef">else</span> {
</span></span><span style="display:flex;"><span> None
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Unfortunately, this gives us an <code>Option</code>. Why? Well, you can see that from
the code: What if <code>last_char</code> is the highest possible Unicode code point,
<code>0x10FFFF</code>, also known as <code>char::MAX</code>? We’re going to procrastinate
handling this (admittedly rare) situation, and panic for now. Spoiler:
Fortunately, there is a solution, which we will discuss later.</p>
<p>This is a great example of why Rust is great. Because this operation
is defined to return an <code>Option</code>, we have to explicitly say what
we’re doing in case it returns <code>None</code>. We don’t even have to have
a unit test for <code>0x10FFFF</code> code-points in our prefix to realize
that we have to cover this case (although now would be
a great time to write one).</p>
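<p>To see both behaviors (the surrogate skip and the <code>None</code> at the top) in isolation, here is a hand-rolled, single-step stand-in written for stable Rust. It is not the standard library's code, and <code>next_char</code> is a hypothetical name:</p>

```rust
/// Hand-rolled stand-in for a single forward step on `char`.
/// Skips the surrogate gap and returns `None` after `char::MAX`.
fn next_char(c: char) -> Option<char> {
    match c {
        // The last scalar value before the surrogate range jumps past it.
        '\u{D7FF}' => Some('\u{E000}'),
        // Everywhere else, a checked conversion is enough.
        _ => char::from_u32(c as u32 + 1),
    }
}
```

<p>A caller is forced to decide what to do with the <code>None</code>, which is exactly the case we are postponing.</p>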
<p>Also unfortunately, we can’t directly call <code>forward_checked</code> … not
if we want to use stable Rust, in any case. It’s marked as a nightly-only
“unstable API.” Fortunately, however, we can access it indirectly,
through the <code>Range</code> API. Some rooting around in the standard library
reveals that <code>nth</code>, on an iterator on a closed range, calls
<code>forward_checked</code>, yielding:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">let</span> last_char_incr <span style="color:#f92672">=</span> (last_char <span style="color:#f92672">..=</span> <span style="color:#66d9ef">char</span>::<span style="color:#66d9ef">MAX</span>)
</span></span><span style="display:flex;"><span> .nth(<span style="color:#ae81ff">1</span>)
</span></span><span style="display:flex;"><span> .expect(<span style="color:#e6db74">"XXX fixme: can't handle highest possible codepoint"</span>);
</span></span></code></pre></div><p>This actually works, with the caveat of handling <code>char::MAX</code> set
aside. All my unit tests pass except the <code>0x10FFFF</code> one. Altogether,
here is the state of things: we have a <code>prefixed</code> function that uses this to call
<code>split_off</code> with an appropriate value, without iterating
through all the strings in range in the set:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">upper_bound_from_prefix</span>(prefix: <span style="color:#66d9ef">&</span><span style="color:#66d9ef">str</span>) -> Option<span style="color:#f92672"><</span>String<span style="color:#f92672">></span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">for</span> i <span style="color:#66d9ef">in</span> (<span style="color:#ae81ff">0</span><span style="color:#f92672">..</span>prefix.len()).rev() {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> <span style="color:#66d9ef">let</span> Some(last_char_str) <span style="color:#f92672">=</span> prefix.get(i<span style="color:#f92672">..</span>) {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> rest_of_prefix <span style="color:#f92672">=</span> {
</span></span><span style="display:flex;"><span> debug_assert!(prefix.is_char_boundary(i));
</span></span><span style="display:flex;"><span> <span style="color:#f92672">&</span>prefix[<span style="color:#ae81ff">0</span><span style="color:#f92672">..</span>i]
</span></span><span style="display:flex;"><span> };
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> last_char <span style="color:#f92672">=</span> last_char_str
</span></span><span style="display:flex;"><span> .chars()
</span></span><span style="display:flex;"><span> .next()
</span></span><span style="display:flex;"><span> .expect(<span style="color:#e6db74">"last_char_str will contain exactly one char"</span>);
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> last_char_incr <span style="color:#f92672">=</span> (last_char<span style="color:#f92672">..</span>)
</span></span><span style="display:flex;"><span> .nth(<span style="color:#ae81ff">1</span>)
</span></span><span style="display:flex;"><span> .expect(<span style="color:#e6db74">"XXX fixme used highest possible codepoint"</span>);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> new_string <span style="color:#f92672">=</span> format!(<span style="color:#e6db74">"</span><span style="color:#e6db74">{rest_of_prefix}{last_char_incr}</span><span style="color:#e6db74">"</span>);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> Some(new_string);
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> None
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">pub</span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">prefixed</span>(<span style="color:#66d9ef">mut</span> set: <span style="color:#a6e22e">BTreeSet</span><span style="color:#f92672"><</span>String<span style="color:#f92672">></span>, prefix: <span style="color:#66d9ef">&</span><span style="color:#66d9ef">str</span>) -> <span style="color:#a6e22e">BTreeSet</span><span style="color:#f92672"><</span>String<span style="color:#f92672">></span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> <span style="color:#66d9ef">mut</span> set <span style="color:#f92672">=</span> set.split_off(prefix);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> <span style="color:#66d9ef">let</span> Some(not_in_prefix) <span style="color:#f92672">=</span> upper_bound_from_prefix(prefix) {
</span></span><span style="display:flex;"><span> set.split_off(<span style="color:#f92672">&</span>not_in_prefix);
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> set
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><h1 id="cleaning-up-the-edge-case">Cleaning Up the Edge Case</h1>
<p>OK, now that we’ve got something that (kind of) works, it’s time
to do some clean-up.</p>
<p>So, first, of course, we should address the <code>XXX fixme</code>,
the <code>0x10FFFF</code> case. So what do we do in that case? Well, if we
use <code>X</code> to stand in for this “highest code point character”,
we can reason about it a little.</p>
<p>Let’s say the prefix is <code>"deX"</code>. In order for something to be
out of the range of the prefix, it can’t start with <code>"deY"</code>, as
there is no <code>'Y'</code> character greater than <code>'X'</code>. So, it would
have to differ on the previous character. It would have to start
with <code>"df"</code> or greater.</p>
<p>So, if our prefix ends with this special character, we can
simply drop it, and move one character back, and increment
that character instead. Strangely enough, that just means going
through our <code>for</code> loop again (and no, I did not plan this). See,
if we keep going backwards to find another character to increment,
we’ll get the previous character. Our way of extracting characters
from the suffix works even if there’s more than one character in
the second substring – it’ll just get the first character, which is
exactly what we want.</p>
<p>So we can actually write:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">let</span> Some(last_char_incr) <span style="color:#f92672">=</span> (last_char <span style="color:#f92672">..=</span> <span style="color:#66d9ef">char</span>::<span style="color:#66d9ef">MAX</span>).nth(<span style="color:#ae81ff">1</span>) <span style="color:#66d9ef">else</span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">continue</span>;
</span></span><span style="display:flex;"><span>};
</span></span></code></pre></div><p>Adding some comments to explain, and adjusting existing code to no
longer lie to the reader (<code>last_char_str</code> might now contain more than
one character) we get this:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">upper_bound_from_prefix</span>(prefix: <span style="color:#66d9ef">&</span><span style="color:#66d9ef">str</span>) -> Option<span style="color:#f92672"><</span>String<span style="color:#f92672">></span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">for</span> i <span style="color:#66d9ef">in</span> (<span style="color:#ae81ff">0</span><span style="color:#f92672">..</span>prefix.len()).rev() {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> <span style="color:#66d9ef">let</span> Some(last_char_str) <span style="color:#f92672">=</span> prefix.get(i<span style="color:#f92672">..</span>) {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> rest_of_prefix <span style="color:#f92672">=</span> {
</span></span><span style="display:flex;"><span> debug_assert!(prefix.is_char_boundary(i));
</span></span><span style="display:flex;"><span> <span style="color:#f92672">&</span>prefix[<span style="color:#ae81ff">0</span><span style="color:#f92672">..</span>i]
</span></span><span style="display:flex;"><span> };
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> last_char <span style="color:#f92672">=</span> last_char_str
</span></span><span style="display:flex;"><span> .chars()
</span></span><span style="display:flex;"><span> .next()
</span></span><span style="display:flex;"><span> .expect(<span style="color:#e6db74">"last_char_str will contain at least one char"</span>);
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> Some(last_char_incr) <span style="color:#f92672">=</span> (last_char <span style="color:#f92672">..=</span> <span style="color:#66d9ef">char</span>::<span style="color:#66d9ef">MAX</span>).nth(<span style="color:#ae81ff">1</span>) <span style="color:#66d9ef">else</span> {
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// Last character is highest possible code point.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#75715e">// Go to second-to-last character instead.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#66d9ef">continue</span>;
</span></span><span style="display:flex;"><span> };
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> new_string <span style="color:#f92672">=</span> format!(<span style="color:#e6db74">"</span><span style="color:#e6db74">{rest_of_prefix}{last_char_incr}</span><span style="color:#e6db74">"</span>);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> Some(new_string);
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> None
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>If our string contains only copies of this highest possible code point,
this returns <code>None</code>, which is appropriate because there will be no
strings greater than the strings prefixed with these characters, just like
there’s nothing that comes after names that start with “Z” in alphabetical
order, nor anything that comes after names that start with “Zz”.</p>
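<p>To make these edge cases concrete, here is a small sketch that spot-checks them. It repeats <code>upper_bound_from_prefix</code> from above in condensed form so that it compiles on its own. The <code>check_edge_cases</code> helper is a hypothetical name I’m introducing purely for illustration; it is not part of the final code.</p>

```rust
/// Condensed copy of `upper_bound_from_prefix` from above, repeated so that
/// this snippet compiles on its own.
fn upper_bound_from_prefix(prefix: &str) -> Option<String> {
    for i in (0..prefix.len()).rev() {
        // `get` returns `None` when `i` is not on a char boundary.
        if let Some(last_char_str) = prefix.get(i..) {
            let rest_of_prefix = &prefix[0..i];
            let last_char = last_char_str
                .chars()
                .next()
                .expect("last_char_str will contain at least one char");
            let Some(last_char_incr) = (last_char..=char::MAX).nth(1) else {
                // Highest possible code point: try the previous character.
                continue;
            };
            return Some(format!("{rest_of_prefix}{last_char_incr}"));
        }
    }
    None
}

/// Hypothetical helper (illustration only): spot-checks the edge cases.
fn check_edge_cases() {
    // Ordinary case: increment the last character.
    assert_eq!(upper_bound_from_prefix("de"), Some("df".to_string()));
    // Trailing highest code point: drop it and increment the 'e' instead.
    assert_eq!(upper_bound_from_prefix("de\u{10FFFF}"), Some("df".to_string()));
    // Only highest code points: nothing can sort after this prefix range.
    assert_eq!(upper_bound_from_prefix("\u{10FFFF}\u{10FFFF}"), None);
    // The empty prefix matches every string, so there is no upper bound.
    assert_eq!(upper_bound_from_prefix(""), None);
    // Stepping a `char` skips the surrogate gap (U+D800 through U+DFFF).
    assert_eq!(
        upper_bound_from_prefix("a\u{D7FF}"),
        Some("a\u{E000}".to_string())
    );
}
```

<p>The last assertion is there because <code>char</code> ranges step over the surrogate code points, which are not valid <code>char</code> values, so the increment never produces an invalid string.</p>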
<p>Note that if we want to save the other sets that are created by
<code>split_off</code>, we can. We can easily modify this function to return all
three sets: the set of keys that come lexicographically before the prefix, the
set that starts with the prefix, and the set of keys that come after the
keys that start with the prefix.</p>
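<p>As a rough sketch of that three-way variant: the name <code>split_by_prefix</code> and the tuple return type are my own assumptions for illustration, not something from the final code, and <code>upper_bound_from_prefix</code> is repeated in condensed form so the snippet stands alone.</p>

```rust
use std::collections::BTreeSet;

/// Condensed copy of `upper_bound_from_prefix` from above.
fn upper_bound_from_prefix(prefix: &str) -> Option<String> {
    for i in (0..prefix.len()).rev() {
        if let Some(last_char_str) = prefix.get(i..) {
            let rest_of_prefix = &prefix[0..i];
            let last_char = last_char_str
                .chars()
                .next()
                .expect("last_char_str will contain at least one char");
            let Some(last_char_incr) = (last_char..=char::MAX).nth(1) else {
                continue;
            };
            return Some(format!("{rest_of_prefix}{last_char_incr}"));
        }
    }
    None
}

/// Hypothetical variant of `prefixed` (name and return shape are assumptions).
/// Returns `(before, matching, after)`:
/// keys sorting before the prefix, keys starting with the prefix,
/// and keys sorting after every key that starts with the prefix.
pub fn split_by_prefix(
    mut set: BTreeSet<String>,
    prefix: &str,
) -> (BTreeSet<String>, BTreeSet<String>, BTreeSet<String>) {
    // After this call, `set` keeps only the keys that sort before `prefix`.
    let mut matching = set.split_off(prefix);

    // Keep what the second `split_off` returns instead of discarding it.
    let after = match upper_bound_from_prefix(prefix) {
        Some(not_in_prefix) => matching.split_off(&not_in_prefix),
        // No upper bound exists, so nothing can come after the prefix range.
        None => BTreeSet::new(),
    };

    (set, matching, after)
}
```

<p>The only real change from <code>prefixed</code> is that both leftover sets are returned rather than dropped, so the cost is the same two <code>split_off</code> calls.</p>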
<h1 id="performance">Performance</h1>
<p>This code certainly hasn’t been optimized to the fullest extent possible.
In such a case, we probably would want to do some more extreme
optimizations, like working with <code>Vec<u8></code> rather than <code>String</code>s, and
check if they were valid UTF-8 only at the point when it is necessary
(if it in fact is necessary for our application). Or, alternatively,
we might want to fork the standard library’s BTree implementation
and actually add this operation. Both of these are gnarly, but if the
absolute best possible performance was truly our goal, they would both be in
scope.</p>
<p>But I am reserving that for a future blog post. Detailed profiling
of different implementations of this operation would require that level
of optimization to be fully interesting and is therefore also reserved for
a future blog post. Instead, here, I will walk through some informal
reasoning about the performance of this new implementation of <code>prefixed</code>,
and whether it is also useful for iteration rather than splitting off
a new set.</p>
<p>So, let’s do some back-of-the-envelope reckoning. In creating this
upper bound, we had to reconstruct the prefix string, which costs us an
allocation as well as a string copy. In exchange, we saved an extra call
to <code>find</code>, which might have had to loop over many, many strings that
start with this prefix. We can expect this implementation of <code>prefixed</code> to
be more performant, therefore, in situations where there are many strings
that start with the prefix (and the prefix is not pathologically long).</p>
<p>For iterating over the range, however, we would be making an allocation,
and only potentially saving ourselves some walking through the tree. Given that
allocations are expensive (and potentially also involve some amount of
walking around memory), it’s probably not going to be worth it unless
the tree is extremely large.</p>
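<p>For reference, here is roughly what the iteration version could look like, using <code>BTreeSet::range</code> with explicit bounds instead of <code>split_off</code>. The name <code>prefixed_range</code> is a hypothetical one I’m using for illustration, and I haven’t profiled this; it is only a sketch of the shape of the code, with <code>upper_bound_from_prefix</code> repeated in condensed form so that it compiles on its own.</p>

```rust
use std::collections::btree_set::Range;
use std::collections::BTreeSet;
use std::ops::Bound;

/// Condensed copy of `upper_bound_from_prefix` from above.
fn upper_bound_from_prefix(prefix: &str) -> Option<String> {
    for i in (0..prefix.len()).rev() {
        if let Some(last_char_str) = prefix.get(i..) {
            let rest_of_prefix = &prefix[0..i];
            let last_char = last_char_str
                .chars()
                .next()
                .expect("last_char_str will contain at least one char");
            let Some(last_char_incr) = (last_char..=char::MAX).nth(1) else {
                continue;
            };
            return Some(format!("{rest_of_prefix}{last_char_incr}"));
        }
    }
    None
}

/// Hypothetical helper (the name is an assumption): iterate over the keys
/// that start with `prefix` without modifying or consuming the set.
pub fn prefixed_range<'a>(set: &'a BTreeSet<String>, prefix: &str) -> Range<'a, String> {
    // This is the allocation discussed above: the upper bound is a new `String`.
    let upper = upper_bound_from_prefix(prefix);

    let end = match upper.as_deref() {
        Some(bound) => Bound::Excluded(bound),
        // No upper bound exists: iterate to the end of the set.
        None => Bound::Unbounded,
    };

    // The bounds are only used to locate the two ends of the range;
    // the returned iterator borrows from `set` alone.
    set.range::<str, _>((Bound::Included(prefix), end))
}
```

<p>A caller would write something like <code>for key in prefixed_range(&amp;set, "He") { ... }</code>. The alternative without any allocation is to start at the prefix and stop at the first key that fails <code>starts_with</code>, which is the trade-off described above.</p>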
<h1 id="a-warning-unto-the-test-shy">A Warning unto the Test-Shy</h1>
<p>In an earlier draft of this post, I had the following code
to increment a <code>char</code> rather than what I wrote above:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span>(last_char <span style="color:#f92672">..</span>).nth(<span style="color:#ae81ff">1</span>)
</span></span></code></pre></div><p>This seems like it should work, in spite of having no upper bound.
It stands to reason that <code>char::MAX</code> would, in such a case, serve
as an implicit upper bound. It does still return an <code>Option<char></code>,
and when would <code>None</code> happen if not in such a situation?</p>
<p>But fortunately, I had a test case:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#75715e">#[test]</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">maxicode</span>() {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> set <span style="color:#f92672">=</span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> <span style="color:#66d9ef">mut</span> set <span style="color:#f92672">=</span> BTreeSet::new();
</span></span><span style="display:flex;"><span> set.insert(<span style="color:#e6db74">"Hi"</span>.to_string());
</span></span><span style="display:flex;"><span> set.insert(<span style="color:#e6db74">"Hey"</span>.to_string());
</span></span><span style="display:flex;"><span> set.insert(<span style="color:#e6db74">"Hello"</span>.to_string());
</span></span><span style="display:flex;"><span> set.insert(<span style="color:#e6db74">"heyyy"</span>.to_string());
</span></span><span style="display:flex;"><span> set.insert(<span style="color:#e6db74">"H</span><span style="color:#ae81ff">\u{10FFFF}</span><span style="color:#e6db74">eyyy"</span>.to_string());
</span></span><span style="display:flex;"><span> set.insert(<span style="color:#e6db74">"H</span><span style="color:#ae81ff">\u{10FFFF}</span><span style="color:#e6db74">"</span>.to_string());
</span></span><span style="display:flex;"><span> set.insert(<span style="color:#e6db74">"I"</span>.to_string());
</span></span><span style="display:flex;"><span> set.insert(<span style="color:#e6db74">""</span>.to_string());
</span></span><span style="display:flex;"><span> set.insert(<span style="color:#e6db74">"H"</span>.to_string());
</span></span><span style="display:flex;"><span> set
</span></span><span style="display:flex;"><span> };
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> set <span style="color:#f92672">=</span> prefixed(set, <span style="color:#e6db74">"H</span><span style="color:#ae81ff">\u{10FFFF}</span><span style="color:#e6db74">"</span>);
</span></span><span style="display:flex;"><span> assert_eq!(set.len(), <span style="color:#ae81ff">2</span>);
</span></span><span style="display:flex;"><span> assert!(<span style="color:#f92672">!</span>set.contains(<span style="color:#e6db74">"I"</span>));
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>This test case, in that earlier code, actually panicked! It turns
out that in the case of an open-ended range like <code>(last_char ..)</code>,
which results in a value of the type <code>RangeFrom</code>, it is simply
assumed that going forward is possible. Instead of calling
<code>forward_checked</code>, its
<a href="https://github.com/rust-lang/rust/blob/ee8b035faba3728644d36fbb689feb8047b965e6/library/core/src/iter/range.rs#L880-L886"><code>nth</code> method</a>
calls <code>forward</code>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#75715e">#[inline]</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">nth</span>(<span style="color:#f92672">&</span><span style="color:#66d9ef">mut</span> self, n: <span style="color:#66d9ef">usize</span>) -> Option<span style="color:#f92672"><</span>A<span style="color:#f92672">></span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> plus_n <span style="color:#f92672">=</span> Step::forward(self.start.clone(), n);
</span></span><span style="display:flex;"><span> self.start <span style="color:#f92672">=</span> Step::forward(plus_n.clone(), <span style="color:#ae81ff">1</span>);
</span></span><span style="display:flex;"><span> Some(plus_n)
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>And in
<a href="https://github.com/rust-lang/rust/blob/ee8b035faba3728644d36fbb689feb8047b965e6/library/core/src/iter/range.rs#L87-L89"><code>forward</code></a>,
every <code>None</code> is converted into a panic:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">forward</span>(start: <span style="color:#a6e22e">Self</span>, count: <span style="color:#66d9ef">usize</span>) -> <span style="color:#a6e22e">Self</span> {
</span></span><span style="display:flex;"><span> Step::forward_checked(start, count).expect(<span style="color:#e6db74">"overflow in `Step::forward`"</span>)
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><h1 id="conclusion">Conclusion</h1>
<p>I hope you enjoyed this walk-through. You can find the final
version of <code>prefixed</code> and two test cases <a href="https://www.thecodedmessage.com/range.rs">here</a>.</p>
<p>Please let me know what you think of this format in the comments. Also
let me know if you have any follow-up topics you want me to explore,
or other problems you would want walk-throughs of.</p>
<p>And, of course, please feel free to provide corrections and even
nit-picks!</p>
Fiction Review: The Long Way to a Small Angry Planethttps://www.thecodedmessage.com/posts/long-way/2023-06-16T00:00:00+00:00I already enjoyed the Monk and Robot series by Becky Chambers (A Psalm for the Wild-Built and A Prayer for the Crown-Shy). It’s now one of my favorite series, so I was excited to also read her earlier work, the Wayfarer series, starting with The Long Way to a Small, Angry Planet, and it did not disappoint me.
Both these series are science fiction. While Monk and Robot is solarpunk, a relatively new sub-genre focused on imagining a world with major environmental (and economic) problems solved, the Wayfarer series much more reminds me of the kind of science fiction I used to read as a kid.<p>I already enjoyed the <em>Monk and Robot</em> series by Becky Chambers (<em>A Psalm
for the Wild-Built</em> and <em>A Prayer for the Crown-Shy</em>). It’s now one of
my favorite series, so I was excited to also read her earlier work, the
<em>Wayfarer</em> series, starting with <em>The Long Way to a Small, Angry Planet</em>,
and it did not disappoint me.</p>
<p>Both these series are science fiction. While <em>Monk and Robot</em> is solarpunk,
a relatively new sub-genre focused on imagining a world with major
environmental (and economic) problems solved, the <em>Wayfarer</em> series
much more reminds me of the kind of science fiction I used to read as
a kid. While it’s described as space opera, it reminds me more of
Heinlein or Arthur C. Clarke or even Niven, who are considered
hard sci fi. I’m not sure whether it gets labeled space opera because it focuses
less on accuracy and logic than those other authors, or if it is because
it does not do so at the expense of character development, or perhaps
because it is written by a woman.</p>
<p>Nevertheless, in contrast to the classic “space opera” clichés, it does
not focus on a war (though war is involved). In general, the stakes
are far lower than that, involving ordinary people with ordinary jobs,
interacting with and influencing events that affect the entire
fictional universe in relatively minor ways – outsized for a
completely normal person, but not “saving the world” or “overthrowing
the evil empire” or other typical space opera fare.</p>
<p>The aliens and the civilization in general are also a lot more
developed than typical space opera, where aliens are typically
humans with one twist each who live in normal human societies. Instead,
it has the full “hard sci fi” range of alien eccentricities, full of
philosophical exploration of how xenobiology might end up
being and an intricate inter-species galactic social balance.</p>
<p>In fact, not only is our trusty space ship crew made up of regular
people just trying to eke out a living, but humans in general
are just a regular species,
sidestepping tropes where humans are the best species or the most
creative – or the most violent or the most evil.
Our space crew isn’t even entirely human (though it mostly is),
and we get a fair amount of explicit alien perspective.</p>
<p>Instead of humans, the privileged species that dominates galactic society,
the equivalent of the colonizing (or post-colonizing but still quite
privileged) white people in our world, are aliens who we would find quite
repugnant, but who until recently dominated large swaths of the galaxy
by force. Other species, including humans, try to emulate their ways and
are proud to learn their prestigious language. Meanwhile, the <em>lingua
franca</em> is not a human language, but most humans are forced to learn it.</p>
<p>This means that our mostly-human and human-led crew are normal
not-particularly-privileged people in a normal not-particularly-privileged
species. The sense of normalcy and “everyday folks” is refreshing in a
genre normally dominated by the powerful and those who become or fight
the powerful, and again, does not strike me as within the stereotype of
“space opera.”</p>
<p>It does in some ways remind me of <em>Serenity</em>, partially because
our trusty crew – with one special perspective character, their newest
member, who is decidedly not a protagonist but merely a window into
an ensemble cast – take on various jobs to make a living, and the
storytelling takes the form of episodic vignettes that take place in the
context of these jobs. While there is an overarching, overall plot, it is more
like the season plot of an episodic TV show than the plot of a more
tightly-woven novel. If anything, it could use a little more
development, as the climax ends up feeling a bit abrupt.</p>
<p>The themes are perhaps another reason why it’s not considered hard sci
fi even though it probably should be. Rather than an old man trying
to evangelize libertarianism or some other weird form of conservatism
(looking at you, Heinlein, though others are guilty) or explain how
humanity will become a transcendent orgy hive mind (shockingly many Arthur
C. Clarke books, and also Heinlein), Becky Chambers explores the meaning
of family … and what to do with family members or colleagues who are
obnoxious but indispensable. It explores issues like medical consent,
or when is it okay to ally with another group, as opposed to when that
is too risky.</p>
<p>All in all, I think it’s a good thing to have more diversity in science
fiction than what I grew up with, namely a weirdly specific flavor
of stodgy conservative white men – a flavor, to be clear, even more
specific than that combination of adjectives alone would imply. It’s
good to have that diversity even if some of that perspective confuses
reviewers and marketers about what genre a book is in. I as my current
self greatly enjoyed it, and I think my teenage self would have too,
and there’s something uplifting about that concurrence.</p>
Debt Ceiling, Reduxhttps://www.thecodedmessage.com/posts/debt-ceiling-2/2023-05-26T00:00:00+00:00So you might or might not be aware of the debt ceiling argument currently taking place in the US.
I’ve already written about this, but President Biden for some reason didn’t listen to me (perhaps because he doesn’t read my blog – which is disappointing). Other, more famous people have written about it too, but the President insists on pretending he has to make a deal with the Republicans.
So, to catch everyone up, here’s how this all works.<p>So you might or might not be aware of the debt ceiling argument
currently taking place in the US.</p>
<p>I’ve already <a href="https://www.thecodedmessage.com/posts/debt-ceiling/">written about this</a>,
but President Biden for some reason didn’t listen to me (perhaps
because he doesn’t read my blog – which is disappointing). Other, <a href="https://www.peoplespolicyproject.org/2023/05/23/the-debt-limit-situation/">more
famous</a>
people have <a href="https://www.employamerica.org/blog/14th-amendment-debt-ceiling-perpetual-bonds-the-treasurys-political-misjudgments-are-hiding-in-technocratic-failure/">written about it
too</a>,
but the President insists on pretending he has to make a deal with
the Republicans.</p>
<p>So, to catch everyone up, here’s how this all works.</p>
<p>Congress passes a budget every year. This budget requires the President
(and the bureaucracy he presides over) to spend money on various
programs. Not “allows,” to be clear, but <em>requires</em> – in the vast majority
of situations, the President does not have the discretion to not spend
the money. The spending is required by law.</p>
<p>The President (or specifically, the Treasury Department)
is also required to pay the US debt as it comes due. This
is also required by law, and specifically, this obligation
is enshrined constitutionally in the <a href="https://constitution.congress.gov/constitution/amendment-14/#amendment-14-section-4">14th amendment, section
4</a>,
which was added to the US Constitution shortly after our Civil War.</p>
<p>The money to do these things typically comes from two places: taxes,
and debt auctions. The IRS can only collect so many taxes. But the
treasury can do debt auctions without any practical limitations.</p>
<p>However, we are now in a situation where the treasury is hitting
a boundary where they supposedly cannot do debt auctions
anymore, because there is also a law that purports to limit
the total outstanding amount of debt the US has: the <a href="https://en.wikipedia.org/wiki/United_States_debt_ceiling">debt
ceiling</a>.</p>
<p>Now, I’m tempted to clarify that government debt is not like household
debt, that it’s not really very much like a debt at all. There is no
reason economically to have a debt ceiling. Too much government spending
or not enough taxes can in some situations cause inflation, but it’s
more complicated than any measure of how much debt there is.</p>
<p>But that’s actually beside the point here, because we can get to the
same conclusion without it, which is that the laws as they stand force
the administration into a contradictory position. The President –
or rather, the Treasury – has a legal obligation to spend money on the
budget and on paying the debt. The Treasury cannot collect more in taxes
than it currently is, not without changing the laws. So it has a legal
obligation to issue more debt to pay the current debt, from the budget
and constitution. And from the debt ceiling, it has a legal obligation
not to issue more debt.</p>
<p>The Treasury legally must issue more debt. The Treasury legally must not
issue more debt. The two laws contradict. Which wins?</p>
<p>Well, if there are any legal options to make the contradiction go
away, the Treasury is bound to try them. These take the form of
issuing debt that, for whatever reason, doesn’t count toward the debt
ceiling. For example, the Treasury could mint a <a href="https://www.theverge.com/2023/5/23/23734654/government-debt-default-trillion-dollar-platinum-coin">$1,000,000,000,000 platinum
coin</a>.
Coins are technically a form of debt, owed by the US Treasury to whoever
holds the coin, but (in what is apparently an oversight) they do not
count towards the debt ceiling. It’s an absurd loophole, but legally,
the situation the Treasury is in otherwise is just as absurd.</p>
<p>Or, less meme-like but equally useful, they could issue <a href="https://wealtheconomics.substack.com/p/the-simple-slam-dunk-scotus-proof">$0 face value
bonds</a>,
which again would not count toward the debt ceiling.</p>
<p>To be clear, I’m not saying the Treasury could consider doing these
things. I’m saying that the Treasury must do these things if the
alternative is default, that it is legally and even constitutionally
obligated to. I’m saying that if President Biden doesn’t choose one of those
things, he is in violation of his oath of office, and not doing his job.</p>
<p>President Biden, however, and his treasury secretary, have ruled out
doing those things. One hopes that they are lying rather than planning
on betraying their country and its Constitution. They say that there
is not time (this seems false), or specifically time to get it past
the courts – which is not how I think courts work. How courts work
is you do it, and then maybe courts yell at you. But given that
President Biden is obligated to do one of these things, it seems
pretty clear to me that doing it is better than not doing it.</p>
<p>The Biden administration argues that these things are unprecedented.
But: You know what else is unprecedented, but actually clearly illegal?
Default. It seems that President Biden would prefer if some deal is
reached, but if a deal is not, he should be willing to do the
legal unprecedented thing instead of the illegal unprecedented thing.</p>
<p>But let’s imagine he’s right. Imagine that these special debts for some
reason were off the table. Then we’re back to the contradiction.
The Treasury is obligated (by the budget and by the constitution) to
create more debt, and forbidden (by the debt ceiling) to create more
debt.</p>
<p>Certainly, the Constitution trumps the debt ceiling, at least to the
extent that government spending is to pay existing debt. Perhaps
it is only lawful to do so for that reason, and not to pay
normal budgetary outlays, and we should do a normal government
shut-down. That is what the <a href="https://www.wsj.com/articles/what-the-14th-amendment-really-says-debt-ceiling-biden-shut-down-default-constitution-78b24824">Wall Street Journal’s editorial board
argues</a>.</p>
<p>But given that the budget is not generally seen to be discretionary, and
the debt ceiling will be violated either way, this doesn’t hold water
for me. Once you’re past the debt ceiling, in my mind, you’re past it.
Once the debt ceiling gets in the way of the treasury paying existing
debt, I think it’s entirely blasted away as unconstitutional. But even as a
matter of statutory interpretation, the budget was passed more recently
than the debt ceiling, is more specific, and doesn’t contain an exception
accommodating it.</p>
<p>This is the argument the President and media refer to as “invoking the
14th amendment.” They all talk about it, again, as if it’s something
the President may do, rather than what the President <em>must</em> do, what
he is legally and morally obligated to do as part of the President’s
Constitutional duties to the American people, under his oath of office.</p>
<p>Instead, for whatever reason, the President and the media are all
talking as if the only way to avoid default is with a deal. They’re
talking as if the debt ceiling actually did what the Republicans claim
it does, that it causes a default when the ceiling is reached.</p>
<p>In my mind, this is already a violation of the President’s oath of
office. This is a concession to the powerful forces, like the former
President Donald Trump, that would like nothing more than to see
our constitutional order destroyed, who tried to destroy it only a
few short years ago. President Trump has said that he wants default,
an unconstitutional outcome, and by conceding that it is even legally
possible for that to happen, President Biden has failed to defend the
constitution against that manifest insurrectionist.</p>
<p>And the media who talk about the 14th amendment as an unprecedented option,
as if it is more tenuous than default, which is also unprecedented, are
also abetting enemies of the constitution, hopefully merely out of
confusion.</p>
<p>I respect Ezra Klein, but I disagree <a href="https://www.nytimes.com/2023/05/21/opinion/biden-mccarthy-debt-ceiling.html">with his
article</a>.
The Supreme Court may do what it wants, and they might be majority
Republican appointees, but they’re not majority Trumpers.</p>
<p>But more importantly, default is even less of a debt ceiling plan.
If no deal is reached, President Biden has no alternative. The Supreme
Court may later decide to ruin the country, but that doesn’t mean the
President can’t run it correctly in the meantime.</p>
<p>Just because it’s called “default” doesn’t mean we’re allowed to
do it if the other options are bad, by default. It doesn’t mean that
at all.</p>
<p>Of course, this might all be a negotiating stance. What all these people
might actually be saying is that a deal with Republicans is better than
a situation where one of these “untested” options is used. I hope – I
really truly hope – that they are more or less lying, that if it comes
to it, they’ll be in favor of minting the coin or ignoring the debt
ceiling rather than actually letting a default happen. But it upsets
me, disgusts me, that we’re making a deal with people who do not care
about fiscal policy on the basis of fear of an outcome that is legally
impermissible because of the threat of an unconstitutional law.</p>
<p>But if so, President Biden should not lie. If the Treasury is secretly
studying other options, it should instead do so openly. He should have
started all of this by asking for a law clarifying that the debt ceiling
is unconstitutional and repealing it, and then shrugged his shoulders if
he didn’t get one. The opposition in this case does not deserve
the respect they are currently being given, led by their pro-default
insurrectionist-in-chief, for whom the debt ceiling was raised <a href="https://www.snopes.com/fact-check/gop-debt-ceiling-trump-presidency/">three
times</a>.</p>
<p>And if you believe, genuinely, that the US needs to have a conversation
about fiscal policy, the time to have that conversation is when passing
the budget, not randomly according to an unconstitutional debt ceiling.</p>
There is No One True Best Programming Language (but some are still better than others)https://www.thecodedmessage.com/posts/best-programming-language/2023-05-24T00:00:00+00:00<p>I am no stranger to programming language controversy. I have a <a href="https://www.thecodedmessage.com/tags/rust-vs-c++/">whole
category</a> on my blog dedicated to explaining why Rust
is better than C++, and I’ve taken the extra step of organizing it into
an <a href="https://www.thecodedmessage.com/rust-c-book/">MDBook</a> for everyone’s convenience. Most of those posts have
been argued about on Reddit, and a few even on Hacker News. Every single
one of them has been subject to critique, and in the process, I’ve been
exposed to every corner, every trope and tone and approach of programming
language <del>debate</del> religious war, from the polite and well-considered
to the tiresome and repetitive all the way to the rude and nonsensical.</p>
<p>There are two tropes in particular that many times have been proffered to
me (or rather, levelled at me) about programming languages, two opposite
errors that I would like to critique. I would say that I’d like to nip
them in the bud, or respond to them once and for all, but I know the
power of my blog is limited, so instead I’d just like to give my opinion
on them, and explain why they are erroneous. Here are the errors:</p>
<ol>
<li>There is one best programming language.</li>
<li>Every programming language has its place.</li>
</ol>
<h1 id="error-1-there-is-one-best-programming-language">Error #1: There is one best programming language</h1>
<p>Some languages have fans in the original sense of <em>fanatic</em>. Some
languages inspire a level of devotion in programmers where they forswear
other programming languages with an almost religious loyalty. These
fanatics truly believe that the programming language is perfect, and that
no other language can so perfectly capture the structure of computing
and of algorithmic reasoning – or even be acceptable in light of the
existence of a perfect programming language.</p>
<p>Any threat to this programming monolatry is then attacked as intrinsically
irrational. After all, if everyone would just do the basic and obvious
step of rewriting everything in this ideal programming language, then
all bugs would be fixed. Then, “the wolf also shall dwell with the lamb,
and the leopard shall lie down with the kid,” everyone will be immortal,
and the messiah will come… And this, of course, is insufferable to
normal people, who realize that programming languages are tools, not gods.</p>
<p>Rust, admittedly, brings this out in people. So does Lisp, and so
does Haskell. And lest you think I’m exaggerating with the religious
references, someone even wrote a Haskell book entitled <a href="https://cosmius.bitbucket.io/tkhe/"><em>To Kata Haskellen
Evangelion</em></a>, Biblical Greek for the
blasphemous and hopefully tongue-in-cheek title <em>The Gospel according
to Haskell</em>.</p>
<p>I know what you’re thinking; I can hear it in my head. You’re thinking:
“You’re one to talk, Jimmy! <a href="https://www.thecodedmessage.com/"><em>The Coded Message</em></a> is a Rust blog, and
worse, a Rust evangelism blog! How dare you criticize when you’re one
of the worst offenders?”</p>
<p>Nevertheless, in spite of what you might think, I don’t think Rust is
the one true programming language. I think it’s <a href="https://www.thecodedmessage.com/posts/paradigm-shift/">ahead of other mainstream
programming languages</a> in terms of strong typing
and functional features (key word “mainstream”), and I personally enjoy
working on it full time, all true. But while I am a fan, I don’t think it’s
perfect, or even unique in most of the ways it’s good.</p>
<p>Instead, I bring up this error for the reason I promised: Because
it has been levelled against me. Early on, with <a href="https://www.thecodedmessage.com/posts/hello-rust/">my first Rust
post</a>, I wrote this statement (and see if you
can see why it was controversial):</p>
<blockquote>
<p>If you are a systems programmer, if you are used to C and C++ and to
trying to solve systems programming types of problems, Rust is magical,
just like when you learned your previous favorite programming language.</p>
<p>If you are not, Rust is overkill for your task at hand and you shouldn’t
be using it. I earnestly recommend Haskell.</p>
</blockquote>
<p>This got me quite a bit of anger on Reddit. One commenter was furious that
I recommended Haskell, because they had tried to learn it in the past
and had a bad time. Another tried to tell me I was being stubborn because
the collected testimony of the Rust Reddit hadn’t somehow managed to override
my 18 years of professional programming experience and convince me that
garbage collection was not a necessary thing to have in a programming language
sometimes.</p>
<p>And the key term there is Rust Reddit: There are some people there who
think everyone should be writing Rust, even people who have every reason
to benefit from a garbage collector and who have nothing to gain from
the strictness of a borrow checker, because they think Rust is just the
absolute best possible language. And the Rust sub-reddit does what any
good echo chamber does, and brings out that vibe in every Rustacean.</p>
<p>But the echo chamber did not get me. Although I’ve moderated my opinion
some – I’ve realized that there are some times where Rust beats out GC’d
languages for applications outside of my narrow definition of systems
programming, if only because it is both so mainstream and so successful
at bringing in modern FP features – I still stand by my fundamental
point:</p>
<blockquote>
<p>Sometimes, indeed probably for most programming projects, Rust
is the wrong choice. Just like I wouldn’t use Excel to do systems programming,
I wouldn’t use Rust to keep track of splitting expenses on a trip.</p>
</blockquote>
<p>Even for “serious” programming projects (whatever that means), sometimes,
you simply do need a garbage collector. Sometimes, the semantics of
Rust are too deep-cut or complicated to teach to the people you need
to do your programming.</p>
<p>Heck, sometimes even existing infrastructure or existing legacy codebases
or just existing skillsets are more important than what programming
language features you have. Sometimes, Rust would take a re-write.
And re-writing in Rust is not a panacea, or even always
a good idea.</p>
<h1 id="error-2-every-programming-language-has-its-place">Error #2: Every programming language has its place</h1>
<p>This one of course gets levelled against me far more often, especially in
my Rust vs C++ debates. Most people realize programming languages
are technical tools, and a skilled programmer can pick new ones up with
relative ease. But some people act and talk instead as if, say, C++ programmers
were an ethnic or religious group. If I call for the gradual deprecation
and obsolescence of C++ in favor of Rust – while understanding that
legacy code is a genuine concern that will be with us for decades –
these people act as if I’m calling for crimes against humanity, saying
<a href="https://en.wiktionary.org/wiki/sing_Kumbaya">Kumbaya</a>-reminiscent
statements like “All programming languages have their place.”</p>
<p>But of course, some tools are simply obsoleted by other tools. While Rust
won’t serve your needs if what you really need is garbage collection,
there are very few scenarios where C++ still beats Rust for new
development. Sure, C++ has improved over time, but Rust doesn’t have a
legacy to weigh it down, and so can actually do things <a href="https://www.thecodedmessage.com/posts/cpp-move/">right the first
time</a>.</p>
<p>Some people disagree with this in a way I respect, because of support for
optimizing compilers, or the vagueness and immaturity of the semantics
of unsafe Rust, or some other concrete reason where C++ has something to
offer as a tool. Others simply live in worlds where too much code is in
C++, and it would be impossible to migrate anytime soon, and that also
makes sense to me. But I simply cannot take seriously an assertion that
in some axiomatic way, reminiscent of the intrinsic value of all human
beings, every programming language has its value.</p>
<p>Why should this be true? It’s not like which programming language someone
uses is an intrinsic quality. I’ve changed from a C++ programmer to a
Rust programmer, and so can you. Perhaps some of the people saying this
are hobbyist programmers, asserting the right of people to enjoy C++
personally, and to program it as nerds. And that’s fair! But that’s
also not what I’m talking about. I’m talking about what the best
programming language is to use for projects that people will use in
anger, where it matters whether a language is likely to <a href="https://media.defense.gov/2022/Nov/10/2003112742/-1/-1/0/CSI_SOFTWARE_MEMORY_SAFETY.PDF">lead to security
vulnerabilities</a>
when used. If what programming language you use for such projects is
a key part of your identity, then that’s not an OK way to structure
your identity.</p>
<p>If it were true that all languages had their place and their value, does
that mean that there should be shops writing in the obsolete versions of
C++, like the original C with Classes? Does that mean that there should be
shops writing code in <a href="https://en.wikipedia.org/wiki/INTERCAL">INTERCAL</a>?
Does that mean that there are some situations in which it’s best to do
greenfield development in COBOL?</p>
<p>One example of this trope is the famous essay <a href="https://meyerweb.com/eric/comment/chech.html">“‘Considered Harmful’ Essays
Considered Harmful”</a>,
which has of course been cited to criticize my own
<a href="https://www.thecodedmessage.com/posts/cpp-move/">“Considered Harmful” post</a> (for
more on the “Considered Harmful” trope, see the <a href="https://en.wikipedia.org/wiki/Considered_harmful">Wikipedia
article</a>). Ironically
but unsurprisingly, “‘Considered Harmful’ Essays Considered Harmful” is
dogmatic in exactly the way it criticizes, in spite of giving itself a
(silly and ill-defended) out. In spite of recommending that “considered
harmful” essays be replaced by “benefits and weaknesses” lists,
or even “perceived benefits and weaknesses” lists, it does not follow
its own advice. It does not list benefits of the “Considered Harmful”
essays it considers harmful.</p>
<p>So I will fill in this deficit. “Considered Harmful” essays are good
when a feature of a tool does indeed cause harm, and a better option is
available – as is often actually the case. The title is a cliché, which
is a good thing in this case: it signals to the reader, in a light-hearted
way, what the thesis of the document is – as opposed to “benefits and
weaknesses” lists which tend to be biased in any case and can amount
to passive-aggressiveness. Weaknesses in one’s argument or benefits in
one’s opponent’s argument can and should be acknowledged and addressed,
but that doesn’t mean you have to pretend not to have a position.
Just because something has some benefit doesn’t mean that it can’t, overall,
be fairly considered harmful.</p>
<p>Indeed, my own post did do some “benefits and weaknesses,” in spite of
being titled as a “Considered Harmful” essay. It did spend some time
explaining why the designers of C++ made the decisions they did, and what
the benefits of those decisions were, even in the context of a post about
why these decisions were considered harmful. C++ had to implement
non-destructive moves for backwards-compatibility. Its designers had boxed
themselves into them, harmful as they are. That doesn’t make them any less
harmful, but it does make them understandable.</p>
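<p>For readers who haven’t seen the contrast, here is a minimal sketch of what a destructive move looks like on the Rust side. The <code>Resource</code> type, the <code>consume</code> function, and the drop counter are hypothetical names made up for this illustration, not code from the original post. The only point is that after a move the source binding is statically unusable, and the destructor runs exactly once, for the final owner.</p>

```rust
use std::cell::Cell;

/// Hypothetical resource type that records each destructor call
/// in a shared counter (illustration only).
struct Resource<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Resource<'_> {
    fn drop(&mut self) {
        // Count every time a destructor actually runs.
        self.drops.set(self.drops.get() + 1);
    }
}

/// Takes ownership; the resource is dropped when this function returns.
fn consume(_r: Resource<'_>) {}

/// Creates one resource, moves it twice, and reports how many
/// destructor calls happened in total.
fn drops_after_one_move() -> u32 {
    let drops = Cell::new(0);
    {
        let a = Resource { drops: &drops };
        let b = a; // destructive move: `a` is gone as far as the compiler is concerned
        // consume(a); // would not compile: use of moved value `a`
        consume(b); // ownership moves again; the destructor runs inside `consume`
    }
    drops.get()
}

fn main() {
    println!("destructor ran {} time(s)", drops_after_one_move());
}
```

<p>In C++, by contrast, a moved-from object still exists and its destructor still runs, so a type has to define some valid “empty” state for it. That is the backwards-compatible design described above.</p>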
<p>So I disagree with the people who have used that post to criticize me,
and ask them why they don’t also turn the arguments of that post against
itself. Perhaps I could write:</p>
<blockquote>
<p>‘“Considered Harmful” Essays Considered Harmful’ Considered Harmful</p>
</blockquote>
<p>The only problem with this would be how to punctuate it. That, and
I’m sure it would widely be considered… quite silly.</p>
<h1 id="conclusion-restatement-and-summary">Conclusion: Restatement and Summary</h1>
<p>Programming languages are tools. They are important tools, so it’s good
to make sure they are of high quality, and do the things we demand of
them, because they are often asked to do critical tasks for society. They
are also not to be conflated with the people using the tools, who can
retrain on new tools if they’re worth their salt.</p>
<p>Tools should not be idolized, and tools cannot be perfect. It is
impossible to make a tool that can serve any purpose equally well
– programming language design, in particular, will always have
trade-offs. However, it is possible to make a tool that loses to another
tool in all categories, and that is what C++ will soon be in comparison
to Rust, if it is not already there.</p>
<p>And C++ programmers have their place in the new Rust world – it’s
very easy to learn Rust from a C++ background. And C++ history has
its place there too – Rust builds on C++, and it wouldn’t have been
possible without the contributions of those who worked on making C++
what it is. Everything that is community about C++, everything that is
people, everything that has moral value, can be migrated to Rust.</p>
<p>But that doesn’t mean that C++, the tool, has a place in production
programming beyond legacy (i.e. pre-existing) projects. Again, there still
may be a few other valid reasons to favor C++ over Rust (though they’re
getting fewer and weaker with time), but a bald assertion that “every
language has its place” is not one of them.</p>
x86S: A Long Time Cominghttps://www.thecodedmessage.com/posts/x86s/2023-05-23T00:00:00+00:00<p>Intel has just released a new <a href="https://www.intel.com/content/www/us/en/developer/articles/technical/envisioning-future-simplified-architecture.html">white
paper</a>,
where they discuss removing a lot of the legacy cruft of the Intel/AMD
architecture they call Intel64. Only 64-bit operating systems – and
a narrow set of 32-bit legacy apps that don’t use segmentation (a
small subset in theory but basically all of them in practice) – will
be supported. I am surprised at how excited I am, although after all
this time perhaps the better word is “relieved.”</p>
<p>Finally, Intel computers will dispense with the illusion that the default
mode is the DOS-compatible, 16-bit “real mode.” They will drop the conceit
that modern memory protection, not to mention the ability to address more
than 1MB of memory (approximately, yes I know about A20), is opt-in –
which it currently, literally, is. All of the code to accommodate these
legacy modes can be phased out. All of the circuitry and/or microcode
to implement all of these legacy modes can be removed – though I’m sure
Intel has had ways to keep it from doing too much damage, it definitely
increased the complexity of their processors.</p>
<p>This is one of the biggest tech debt paydowns I’ve seen in a
long time. I have long felt about Intel architecture somewhat
analogously to how Richard P. Gabriel, author of <a href="https://www.dreamsongs.com/RiseOfWorseIsBetter.html">“The Rise of Worse
is Better”</a>, felt
about C++ and Unix decades ago:</p>
<blockquote>
<p>The good news is that in 1995 we will have a good operating system
and programming language; the bad news is that they will be Unix and C++.</p>
</blockquote>
<p>Similarly, I have always felt that Intel architecture would become
reasonable someday, that it would gradually convert itself to something
less absurd than its traditional state. I was excited when AMD (not
Intel, note) came out with what Intel now calls Intel64, getting rid
of segmentation in 64-bit mode and adding 8 sorely needed additional
general-purpose registers (for a total of 16).</p>
<p>Now, finally, they’re phasing out the legacy modes. No more DOS on a
modern PC (and it wouldn’t work anyway for other reasons). Good!</p>
<p>Like many tech debt paydowns of this magnitude and this level of
historical relevance, it’s about the cognitive burden as much as it’s
about the actual implementation or the actual code and circuitry to
work around the complexity. We can now, slowly but surely, forget
the arcane details of how things used to be.</p>
<p>It brings me a tinge of nostalgia, actually. 16-bit DOS programming
was where I first learned assembly, at least to read it. Segmentation
and the different processor modes were firmly in my awareness when I
used a DOS computer with Windows 3.1 as a child. I remember playing
with the edge cases, like “unreal mode” which was like real mode
but where each segment could be addressed with 32-bit registers.
Knowing the complexity of Intel architecture was relevant, and
part of how I learned computer architecture in general.</p>
<p>But more recently, all of this knowledge has seemed overpresent.
Too many times I’ve seen people assume Intel architecture and bring
these old irrelevancies of PCs into conversation and even formal
talks, assuming familiarity with not just operating system and systems
concepts but the Intel-specific details of them. They’ll be talking
about registers and you’ll see that instead of generic names like <code>r3</code>,
<code>r4</code>, they’re talking about specific Intel registers. Or they’ll mention
<code>cr3</code> instead of generically saying “page table base register,” or
“the <code>syscall</code> instruction” or even the obsolescent 32-bit <code>int 0x80</code>
instead of saying “issuing a syscall through a trap.”</p>
<p>The biggest example is how often I hear people talking about “ring 0”
and “ring 3” when they should be saying “kernel mode” and “user mode.”
The numbered rings are so jarringly and gratuitously Intel-specific. It makes
me wonder if they genuinely think all processor architectures number
protection rings or privilege levels like that (they do not), or if they
think the intermediate rings between 0 and 3 are still relevant to modern
OS design on Intel (they are not). Or perhaps they’re just okay with assuming
Intel, ignoring the mobile and embedded worlds, and also bringing in an
irrelevant, overengineered concept while they’re at it.</p>
<p>Maybe this will stop now that Intel is eliminating the unused rings 1
and 2. Maybe people will stop occasionally talking as if protected mode
was an exceptional mode, now that it won’t be a mode at all, but the only
way the processor runs.</p>
Voice is Hardhttps://www.thecodedmessage.com/posts/voice/2023-05-22T00:00:00+00:00<p>I was reading my <a href="https://www.thecodedmessage.com/posts/adhd-philosophy/">ADHD blog post</a> today,
considering whether to send it to a friend, and it was surprisingly hard
for me to bring myself to do so. I realized I was embarrassed at the voice,
the phrasing, the lack of beauty in the individual words, all of which
are things I paid relatively little attention to before – and which
my friend, who also writes, will definitely notice.</p>
<p>It’s something I’ve paid less attention to than I should. “Writing is
thinking” is my philosophy, and I have tons of thoughts that I know
other people are interested in. Shouldn’t the structure of the thoughts,
both the logical structure and the order in which they’re presented,
be more important than voice? And I still believe they are – and yet
voice does still matter.</p>
<p>I know this deficit has frustrated many of my writing friends. They
know that when it comes to it, I can produce English with good voice,
with solid and compelling rhetoric even. They know that because when I
talk, especially when I’m passionate about a topic, or excited about the
conversation, it comes out much smoother than when I write, much more
poetic. How can this be, when I have more time to plan writing? How
have I not leveraged my deeply cultivated conversational skills to be
a better writer? How do these fluent spoken conversations and stilted
written phrasings exist in the same person?</p>
<p>Perhaps it’s because when I’m writing, I’m using all of this extra
planning ability for something besides voice, perhaps even contradictory
to it. But am I even achieving … whatever this other thing is? Or am
I simply overbaking all my statements with my focus on clarity and good
structure, with no upside, letting my sentences sit for too long under
my skeptical eye until they’re purified not only of any confusion and
complexity but also any character?</p>
<p>Whatever the problem, my speaking remains unaffected. I could leverage
this. There are so many YouTubers, so many podcasters… I might be
better off doing one of those things, leaning on my natural (or at least
far better cultivated) conversational tone, or my natural instructional,
professorial voice, rather than my artificial, too in-the-head, downright
overbaked writing style.</p>
<p>Or maybe I should just bite the bullet and
do what multiple people have now recommended: the
<a href="https://www.adhdessentials.com/essentials/the-wall-of-awful/">awfully</a>
intimidating and high-executive-function task of figuring out a sound
recording workflow, where I speak what I want to write, and then listen
and type it up. This might just work, as long as I’m okay with more
organizational complexity, more phases of work in between outline and
draft, and even more steps before a project can finally be deemed finished
– if such a state is ever possible.</p>
<p>In any case, I’m committed to writing. I’m committed to continuing this
blog, writing fiction, and finishing other on-going (secret!) writing
projects. Ideally, I wouldn’t have to talk first and then write. Ideally,
I could just sit down and words would flow through my fingers as naturally
and as artfully as they come out of my mouth.</p>
<p>I don’t have an easy solution. I’ll continue to pay attention, both to
my voice and other people’s, and read advice – and I even plan on doing
the voice-recording thing at some point, if only just to try it. But
most importantly of all, I think I just need a ton more practice,
more low-stakes writing, and simply much more raw volume.</p>
<p>To this end, I’ve decided to commit to responding daily to at least one
prompt from <em>300 Writing Prompts</em>, a journaling prompt book given to me a while
ago by a dear friend. When I first got this book, I was confused, because
the prompts in there don’t seem designed to lead to stories or writing ideas.
But for raw writing practice, they’ll be perfect. And perhaps
by communicating in yet another medium – the hand-written word, instead of
the spoken or the typed – I will be able to develop more fluidity in
all of the media through which words can be delivered.</p>
A New Garden: Rust vs C++ mdbookhttps://www.thecodedmessage.com/posts/rust-v-c++-mdbook/2023-04-24T00:00:00+00:00<p>Here it is, <a href="https://www.thecodedmessage.com/rust-c-book/">the Rust vs C++ mdbook</a>.</p>
<p>I’ve wanted for a while to re-organize some of the
content on my blog into <a href="https://www.thecodedmessage.com/gardens/">gardens</a>. I got
the idea from the blog post <a href="https://hapgood.us/2015/10/17/the-garden-and-the-stream-a-technopastoral/">“The Garden and the Stream: A
Technopastoral”</a>.
Basically, some content is ill-suited to date-based, time-organized
systems like blogs. In fact, most of my content remains valid over a
long period of time, rather than participating in conversation (with
<a href="https://www.thecodedmessage.com/posts/stroustrup-response/">some exceptions</a>), but
rapidly becomes less discoverable after I’ve written it, as it is
buried by newer posts.</p>
<p>If I want to have content that is useful in a long-term fashion, the
blog is not the ideal structure. While you can always scroll down,
or look through <a href="https://www.thecodedmessage.com/tags/">tags</a>, a more refined system would be
to store information in gradually evolving, more comprehensive
documents, that are gradually augmented or refined over time,
that is to say, a <a href="https://www.thecodedmessage.com/gardens/">garden</a>.</p>
<p>The <a href="https://www.thecodedmessage.com/about/">About Me</a> page on a blog is one example of this, but
my blog series about <a href="https://www.thecodedmessage.com/tags/rust-vs-c++/">Rust vs. C++</a> seemed like
another one where I had a lot of material that could be better structured
and more coherently presented in a single, hierarchical document.</p>
<p>So I’ve posted it as an <code>mdbook</code>, <a href="https://www.thecodedmessage.com/rust-c-book/">here</a>. I don’t like
to think of this as a “book” in a form that would ever be published on
paper – it’s not long enough, interesting enough, or complete enough for
that. That would also depart from the garden aesthetic, where it is a
continuous work-in-progress that is always evolving. But I do think the
<code>mdbook</code> format is better suited to the material than my existing blog
series, for long-term access.</p>
<p>I haven’t incorporated all the material from my blog series yet, as some
of the older material I think could stand a re-write. It is
maintained in the open on <a href="https://github.com/jhartzell42/rust-c-book/">GitHub</a>,
so feel free to give feedback there in terms of issues and even merge
requests. It’s released under the CC license for non-commercial,
attributed, share-alike use, with <a href="https://github.com/jhartzell42/rust-c-book/blob/main/LICENSE">this license file</a>.</p>
<p>While I will continue to try and integrate existing material into this
garden, and expand on it when I am inspired to do so, I plan on not focusing
on Rust vs. C++ going forward. If there are any substantial additions,
however, I will update you on this blog.</p>
<p>Thank you for reading! More, different Rust content is coming soon!</p>
This Little Piggy Did Crimehttps://www.thecodedmessage.com/posts/little-piggy/2023-04-22T00:00:00+00:00<p>This little piggy went to market <br>
This little piggy stayed home <br>
This little piggy had roast beef… <br>
<em>This</em> little piggy had <strong>pork</strong> <br>
This little piggy arrested the last two little piggies <br>
…For their heinous crimes against barnyard solidarity <br>
…And put them in little piggy jail</p>
Rust: A New Attempt at C++'s Main Goalhttps://www.thecodedmessage.com/posts/rust-new-cpp/2023-04-06T00:00:00+00:00<p>I know I set the <a href="https://www.thecodedmessage.com/posts/review-2022">goal</a> for myself of doing less
polemics and more education, but here I return for another <a href="https://www.thecodedmessage.com/tags/rust-vs-c++/">Rust vs
C++</a> post. I did say I doubted I would be able to
get fully away from polemics, however, and I genuinely think this post
will help contextualize the general Rust vs. C++ debate and contribute
to the conversation. Besides, most of the outlining and thinking for
this post – which is the majority of the work of writing – was already
done when I set that goal. It also serves as a bit of conceptual glue,
structuring and contextualizing many of my existing posts. So please
bear with me as I say more on the topic of Rust and C++.</p>
<p>Rust is a polarizing programming language, because of how radical it
is. It has <a href="https://www.thecodedmessage.com/posts/paradigm-shift">gone the furthest</a> in introducing
features from functional programming languages into the mainstream world,
and ignoring long-held programming language design principles from the
realm of <a href="https://www.thecodedmessage.com/tags/beyond-oop/">object-oriented programming</a>. Its fans can be
very enthusiastic, sometimes off-puttingly so, stereotypically demanding
that all software be rewritten in Rust even when completely unfeasible –
a stereotype that is mostly untrue, but whose existence and occasional
true examples show the intensity of the debate. But a lot of the
criticism of Rust comes specifically from C++ programmers, and correspondingly
a lot of Rustaceans’ criticism of other programming languages is
directed specifically at C++, including <a href="https://www.thecodedmessage.com/tags/rust-vs-c++/">mine</a>. Even
the creator of C++, while not mentioning it by name, <a href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2739r0.pdf">entered the
fray</a>
(and along with other Rustaceans, <a href="https://www.thecodedmessage.com/posts/stroustrup-response/">I
responded</a>).</p>
<p>There’s a good reason for this particular rivalry. While usable in other
domains, Rust is strongest where C++ has hitherto been unopposed: as a
high-level systems programming language. Many of Rust’s greatest strengths
are <a href="https://www.thecodedmessage.com/posts/raii/">directly based off of ideas originated in C++</a>. And
Rust has, in many ways, the same goals that C++ has. It can be argued –
and in this post I shall argue – that Rust has the exact same overall
goal that C++ does, albeit with a different interpretation of
how that goal is best accomplished.</p>
<h1 id="zero-cost-abstractions">Zero-Cost Abstractions</h1>
<p>C++ has an explicit goal of providing zero-cost abstractions.</p>
<p>This is a bit of a confusing term of art and has the potential
to be misleading, but it comes attached with explanations that
clarify it some. It is also referred to as the “zero-overhead
principle,” which Dr. Bjarne Stroustrup, father of C++,
<a href="https://www.stroustrup.com/ETAPS-corrected-draft.pdf">describes</a> (see
pg. 4) as containing two components:</p>
<ul>
<li>What you don’t use, you don’t pay for (and Dr. Stroustrup means
“paying” in the sense of performance costs, e.g. in higher latency,
slower throughput, or higher memory usage)</li>
<li>What you do use, you couldn’t hand code any better</li>
</ul>
<p>There is also an <a href="https://en.cppreference.com/w/cpp/language/Zero-overhead_principle">executive summary of the
concept</a>
at <a href="https://en.cppreference.com/w/">CppReference.com</a>.</p>
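<p>To make the second bullet concrete, here is a minimal sketch, written in Rust since that is where this post is headed. The function names are invented for illustration. The first version is a high-level iterator pipeline and the second is the loop one might write by hand. With optimizations on, the compiler is generally expected to produce comparable machine code for both, though that is an expectation rather than a guarantee.</p>

```rust
/// Sum of the squares of the even numbers, written as an iterator pipeline.
fn sum_even_squares_iter(values: &[u64]) -> u64 {
    values
        .iter()
        .filter(|v| **v % 2 == 0)
        .map(|v| v * v)
        .sum()
}

/// The same computation, written as the loop one might code by hand.
fn sum_even_squares_loop(values: &[u64]) -> u64 {
    let mut total = 0;
    for i in 0..values.len() {
        let v = values[i];
        if v % 2 == 0 {
            total += v * v;
        }
    }
    total
}

fn main() {
    let values = [1, 2, 3, 4, 5, 6];
    // Both versions compute the same answer; only the notation differs.
    assert_eq!(sum_even_squares_iter(&values), sum_even_squares_loop(&values));
    println!("{}", sum_even_squares_iter(&values));
}
```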
<p>I, however, prefer the terminology of “zero-cost abstraction,”
confusing as it can be, because it embodies a hidden third principle,
that is unstated among those other two, and against which those other
two principles are balanced. The word “abstraction” is the key, and the
third principle is:</p>
<ul>
<li>You can still get the abstractive and expressive power you expect from
a modern programming language.</li>
</ul>
<p>This third principle is necessary to distinguish higher-level “zero cost”
languages like C++ and Rust from lower-level languages like C.</p>
<p>To fully explain why I include this third principle, and to delve into
the history of the concept in general, I want to talk more about C.</p>
<h1 id="c-the-portable-assembly">C: The Portable Assembly</h1>
<p>C has often been described as a “portable assembly language.” Unlike
other high level programming languages before it (“high level” at
the time meaning anything higher level than raw assembly language),
it exposed users directly to gnarly machine-language abstractions like
pointers, and to common assembly-language capabilities like shifting
and bitwise operators.</p>
<p>The goal was to give the programmer something minimally distinct from
assembly language, where the programmer had almost as much control over
the computer as an assembly language programmer without sacrificing
portability. Few higher-level features have been added, even now:
there is no built-in string type, and only a limited array type that
exposes the underlying concept of pointers the instant you poke at it.
Structures are little more than a way of calculating offsets, and memory
management is done by explicitly invoking memory management routines.</p>
<p>C’s preference, in general, was to only add onto assembly those features
absolutely necessary for portability, and not to impose any other
structure on the programmer – or, said another way, not to provide
any other structure to the programmer.</p>
<p>This was far from an iron-clad rule. And there are definitely exceptions:
C, built into the programming language, prefers null-terminated strings
(also known as “C strings”) to arrangements that use specific lengths,
a substantial constraint on the programmer beyond assembly language and
probably a mistake overall.</p>
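<p>The cost of that choice can be sketched in a few lines. This is an illustration only: the helper names are made up, and Rust’s <code>&amp;str</code> stands in for “an arrangement that uses a specific length.” Finding the length of a null-terminated string means scanning for the terminator, and a zero byte in the middle silently cuts the string short.</p>

```rust
/// Length of a C-style string: scan until the terminating 0 byte.
/// Returns `None` if there is no terminator at all.
fn c_style_len(bytes: &[u8]) -> Option<usize> {
    bytes.iter().position(|&b| b == 0)
}

/// Length of a length-carrying string: already stored next to the pointer.
fn counted_len(s: &str) -> usize {
    s.len()
}

fn main() {
    assert_eq!(c_style_len(b"hello\0"), Some(5));
    assert_eq!(counted_len("hello"), 5);

    // A 0 byte in the middle truncates the C-style view of the data...
    assert_eq!(c_style_len(b"he\0llo\0"), Some(2));
    // ...while the counted string keeps all 6 bytes.
    assert_eq!(counted_len("he\0llo"), 6);
}
```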
<p>More deeply, and probably less avoidably at the time, C assumes a
traditional call structure. Many techniques that can be used to
implement closures, co-routines, or other more radical alternatives
to a call stack are difficult to impossible to do with standard C –
while generally being possible in any assembly language.</p>
<p>But, with these exceptions, C generally does tend to only provide
one overarching abstraction, portability, and when it does, it
has the same zero-cost goals that C++ has, to only make the user
pay for the abstractions they actually use, and to provide abstractions
as efficiently as the equivalent hand-coded assembly.</p>
<p>Put another way, C++’s zero-overhead principle, as Dr. Stroustrup
defines it, is more or less inherited from C. Where C++ differs from
C is in the “abstraction” part of providing “zero-cost abstractions.”
Everything you can do in C++ you can do in (potentially tedious and
repetitive and error-prone) C, but C++ provides more abstractions,
beyond just what is necessary for portability.</p>
<h1 id="c-a-more-abstracted-c">C++: A More Abstracted C</h1>
<p>This gives us a framework for understanding the entire goal of C++,
and I would argue, of Rust. Once we understand that C++ is trying to
keep the zero-cost principle of C, where abstractions do not come
with a performance penalty (and where “zero” is a reference to the
difference between the performance cost and a manual assembly-language
implementation), but with the expressive and abstractive power of a
higher-level programming language, everything else about C++ makes sense.</p>
<p>C++ was originally christened “C with Classes,” and it tried to add
Object-Oriented Programming to C. All the mechanisms of OOP could be
portably added to C directly by an application or library developer
with judicious use of function pointers and structure nesting (and
<a href="https://en.wikipedia.org/wiki/GLib"><code>glib</code></a> is a famous example of a
library that does exactly that), but C++ built this abstraction
into the programming language itself.</p>
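<p>For readers who have never seen OOP built by hand, here is a rough sketch of the function-pointer technique. It is written in Rust rather than C to match the rest of this blog, and every name in it is hypothetical; it is not how <code>glib</code> itself is laid out. Each “object” carries a pointer to a table of functions, and a “virtual call” is just a lookup in that table.</p>

```rust
/// A hand-rolled "vtable": a table of function pointers.
struct ShapeVTable {
    area: fn(&Shape) -> f64,
    name: fn(&Shape) -> &'static str,
}

/// The "object": data plus a pointer to its class's function table.
struct Shape {
    vtable: &'static ShapeVTable,
    width: f64,
    height: f64,
}

fn rect_area(s: &Shape) -> f64 { s.width * s.height }
fn rect_name(_: &Shape) -> &'static str { "rectangle" }
fn tri_area(s: &Shape) -> f64 { 0.5 * s.width * s.height }
fn tri_name(_: &Shape) -> &'static str { "triangle" }

// One table per "class", shared by all of its instances.
static RECTANGLE: ShapeVTable = ShapeVTable { area: rect_area, name: rect_name };
static TRIANGLE: ShapeVTable = ShapeVTable { area: tri_area, name: tri_name };

/// A "virtual call": look up the function pointer, then call it.
fn describe(s: &Shape) -> String {
    format!("{} with area {}", (s.vtable.name)(s), (s.vtable.area)(s))
}

fn main() {
    let shapes = [
        Shape { vtable: &RECTANGLE, width: 4.0, height: 3.0 },
        Shape { vtable: &TRIANGLE, width: 4.0, height: 3.0 },
    ];
    for shape in &shapes {
        println!("{}", describe(shape));
    }
}
```

<p>A language with built-in OOP generates the table and the lookup for you, which is the convenience C++ added.</p>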
<p>Objective-C also did this (and according to Wikipedia it “first appeared”
one year earlier, in 1984), but Objective-C has always felt like two
programming languages glued together. In Objective-C, the object-oriented
features do not inherit the zero-overhead principle from C – nor do they
look like C at all. They look instead like a Smalltalk dialect, where
switching between C and this odd Smalltalk dialect was permitted on an
expression-by-expression basis using an odd mix of square brackets and
<code>@</code>-signs.</p>
<p>In C++, the added abstractions, including OOP, take on more of
a resemblance to C, and importantly, continue to try to retain
C’s advantages in systems programming by making the new features
zero-overhead.</p>
<p>During much of the history of C++, OOP was considered to be the most
important abstraction that a programming language could offer. But
once it was added, it expanded the scope of C++ abstractions. Nowadays,
C++ is considered multi-paradigm, and provides not just OOP, but a
wide array of abstractions.</p>
<p>Nowadays, C++ tries to keep up with other programming languages in
what features it offers, to the extent that it can while being limited
by the zero-cost principle. This is in sharp contrast to C, which
continues to try to define existing features better and make them
more rigorous within the existing feature scope. The only features
C++ rejects out of hand are those that do not jibe with zero-cost
abstraction, showing that in actuality C++’s defining trait is to
have the three-pronged concept of zero-cost abstraction that
I introduced above, two prongs about “zero cost” and one about
“abstraction”:</p>
<ul>
<li>What you don’t use, you don’t pay for</li>
<li>What you do use, you couldn’t hand code any better</li>
<li>We give you the power of abstraction expected for a programming language
of the day</li>
</ul>
<p>This is why garbage-collection is not offered in C++ (though it is still
possible to implement manually) – it cannot be offered in a zero-cost
way. However, C++’s alternative to garbage collection, namely
<a href="https://www.thecodedmessage.com/posts/raii/">RAII</a>, has become more effective as new features
like move semantics and <code>std::unique_ptr</code> were added, to the extent that
in modern C++, it would be unimaginable not to have those features,
and they have become essential to C++’s memory management model.</p>
<p>These three goals explain why C++ keeps accruing new features,
whereas C maintains the features it has. They explain why C++ had
to add templates – as a zero-cost alternative to OOP, or a zero-cost
way of implementing collections. They explain why C++ had to add move
semantics – because without them, RAII is a worse abstraction than GC.</p>
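<p>The “templates as a zero-cost alternative to OOP” point may be easier to see side by side. The sketch below uses Rust generics as a stand-in for C++ templates and trait objects as a stand-in for virtual methods; the trait and type names are invented for this example. The generic function is compiled once per concrete type, so the call can be resolved at compile time, while the trait-object version dispatches through a vtable at run time.</p>

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);
struct Circle(f64);

impl Shape for Square {
    fn area(&self) -> f64 { self.0 * self.0 }
}

impl Shape for Circle {
    fn area(&self) -> f64 { std::f64::consts::PI * self.0 * self.0 }
}

/// Template-style: compiled separately for each concrete `S`, so calls to
/// `area` are resolved statically and are candidates for inlining.
fn total_area_static<S: Shape>(shapes: &[S]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

/// OOP-style: a single compiled copy, with each `area` call going through
/// a vtable at run time. In exchange, the slice may mix different types.
fn total_area_dynamic(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    let squares = [Square(1.0), Square(2.0)];
    println!("{}", total_area_static(&squares));

    let mixed: Vec<Box<dyn Shape>> = vec![Box::new(Square(1.0)), Box::new(Circle(1.0))];
    println!("{}", total_area_dynamic(&mixed));
}
```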
<h1 id="rust-a-c-redo">Rust: A C++ Redo</h1>
<p>Rust simply does a better job at achieving these goals, because Rust
gets to start from scratch, with the modern concept of what’s expected in
a high-level programming language, rather than working forwards through
time. And, in doing so, it avoids a lot of the mistakes that C++ made, and
can design a language that includes all of the modern features together.</p>
<p>A full set of OOP features is no longer ideologically required, so Rust
<a href="https://www.thecodedmessage.com/tags/beyond-oop/">doesn’t offer them</a>. Instead, safety has become
a <em>sine qua non</em>, so Rust offers that (with an opt-out provision).
One might argue that safety violates the zero-cost abstraction because
of bounds checking, but that’s simply not true as defined. You only
pay for bounds checks if you’re actually using the feature of safety
– unchecked unsafe accesses <a href="https://www.thecodedmessage.com/posts/unsafe/">are in fact available just an <code>unsafe</code> keyword
away</a> – and the feature of safety is implemented as
efficiently as one would by hand (by inserting bounds checks into array
accesses).</p>
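<p>As a small sketch of what “an <code>unsafe</code> keyword away” looks like in practice (the function names are mine, purely for illustration): the first function pays for a bounds check as part of using the safe indexing abstraction, and the second opts out with <code>get_unchecked</code>, taking on the obligation to prove the index is in range.</p>

```rust
/// Safe indexing: each access is bounds-checked (the optimizer may be able
/// to remove the check when it can prove the index is in range).
fn sum_checked(values: &[i32]) -> i32 {
    let mut total = 0;
    for i in 0..values.len() {
        total += values[i];
    }
    total
}

/// Unchecked indexing: no bounds check, so correctness is on the programmer.
fn sum_unchecked(values: &[i32]) -> i32 {
    let mut total = 0;
    for i in 0..values.len() {
        // SAFETY: `i` is always less than `values.len()` inside this loop.
        total += unsafe { *values.get_unchecked(i) };
    }
    total
}

fn main() {
    let values = [10, 20, 30];
    assert_eq!(sum_checked(&values), sum_unchecked(&values));
    // Out-of-range access through the safe API is reported, not undefined.
    assert_eq!(values.get(7), None);
}
```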
<p>Similarly, C++ has learned that move semantics turn out to be essential
in an RAII/value-semantics model to avoid spurious copy-and-deletes and/or
indirections for e.g. storing <code>std::string</code>s in a <code>std::vector</code> that might
be resized. Before move semantics, C++ often forced violations of the
zero-cost abstraction principle by providing abstractions that would do
extraneous copies or required extra indirections to use effectively, which
is not what an assembly language programmer would ever write. However,
since C++ move semantics were bolted on after the fact, it does them in a
<a href="https://www.thecodedmessage.com/posts/cpp-move/">deeply confusing way</a>, where Rust gets to reset and
design itself for destructive moves from the get-go.</p>
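<p>Here is a minimal sketch of the strings-in-a-resizable-vector case on the Rust side. The function name is hypothetical, and the pointer comparison is only there to make the point observable: pushing a <code>String</code> moves its small header (pointer, length, capacity) into the vector, and growing the vector relocates those headers, but the heap buffer holding the text is never copied.</p>

```rust
/// Returns the vector plus a flag saying whether the first string still
/// uses the heap buffer it was originally created with.
fn push_moves_not_copies() -> (Vec<String>, bool) {
    let first = String::from("Ada");
    let second = String::from("Grace");
    // Remember where `first` keeps its bytes on the heap.
    let heap_before = first.as_ptr();

    let mut names = Vec::new();
    names.push(first); // moves `first`; no destructor runs for the old variable
    names.push(second);
    // println!("{first}"); // would not compile: `first` has been moved

    // Ask for much more capacity, which normally makes the vector reallocate
    // and relocate the String headers it holds.
    names.reserve(1024);

    let same_buffer = names[0].as_ptr() == heap_before;
    (names, same_buffer)
}

fn main() {
    let (names, same_buffer) = push_moves_not_copies();
    println!("{names:?}, text buffer reused: {same_buffer}");
}
```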
<h1 id="a-note-on-the-raii-model">A Note on “the RAII Model”</h1>
<p>In my <a href="https://www.thecodedmessage.com/posts/raii/">RAII post</a> I referred to C++’s alternative
to garbage collection, centered on RAII, as the “RAII model,” and wrote
that <code>std::unique_ptr</code> and move semantics were essential to this model.
A Reddit comment later explained that I must be confused, because RAII
pre-dates those features.</p>
<p>They had misunderstood me, and I stand by my statements, but I think it
is worth some clarification. By “RAII model,” I mean RAII and other
features which, when combined, provide an alternative to garbage collection.
And the RAII model before C++11 did indeed lack features essential to
competing with garbage collection. It was simply a worse model then, and
much harder to use correctly in a complicated codebase.</p>
<p>In a similar way, I would say that in Rust, borrow checking and
destructive moves are essential to the RAII model, because without them,
the model is a much worse competitor to garbage collection. And yes,
that does imply that C++’s concept of RAII is fundamentally deficient
by not being paired with borrow checking, just like pre-C++11 RAII was
fundamentally deficient by not being paired with move semantics and
<code>std::unique_ptr</code>.</p>
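<p>A small sketch of how those pieces fit together in Rust, with invented names throughout: <code>Resource</code> stands in for anything that must be released, and a shared log records the order of release. The destructor runs exactly once per value, either when the value is moved into <code>drop</code> or when it goes out of scope, and the commented-out line shows the kind of misuse the borrow checker rejects.</p>

```rust
use std::cell::RefCell;

/// Records its name in a shared log when dropped, standing in for a
/// resource that must be released (a file, a lock, a socket).
struct Resource<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Resource<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn release_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let a = Resource { name: "a", log: &log };
        let b = Resource { name: "b", log: &log };

        // Destructive move: `a` is released here, and cannot be named again.
        drop(a);

        let borrowed = &b;
        // drop(b); // rejected by the borrow checker: `b` is still borrowed below
        assert_eq!(borrowed.name, "b");

        // `b` is released automatically at the end of this scope.
    }
    log.into_inner()
}

fn main() {
    println!("{:?}", release_order());
}
```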
<p>The alternative to garbage collection that C++ and Rust have built has
been a work in progress through most of its history. Rust had to
be a new programming language rather than an evolution for a number of
reasons, but fixing C++’s lack of borrow checking and <a href="https://www.thecodedmessage.com/posts/cpp-move/">weird move
semantics</a> were some of the most important such
reasons.</p>
<h1 id="backwards-compatibility">Backwards-Compatibility</h1>
<p>Of course, C++ does have goals that Rust drops – and in doing so,
it can do better at this core goal. The biggest such goal is perhaps
also a trivial example: C++ has the goal of being source-compatible
with earlier versions of C++, and even to some extent with C. This
makes sense, as backwards-compatibility between versions is sort of a
fundamental expectation of any programming language, certainly one that
tries to provide a modern set of abstractions, but it does restrain
C++’s development.</p>
<p>While Rust tries to be backwards compatible with itself, dropping
compatibility with C++ has allowed it to get out of a lot of C++’s
<a href="https://www.thecodedmessage.com/posts/multiparadigm/">accumulated cruft of complexity</a>, much
of which is inherited from C times.</p>
<p>This accomplishes a lot on its own. C++’s syntax has gotten
so complex over the years that many in the C++ community are
doing their own resets of the syntax, including Herb Sutter’s
<a href="https://github.com/hsutter/cppfront"><code>cppfront</code></a> and Google’s
<a href="https://github.com/carbon-language/carbon-lang">Carbon</a>. Even if
starting from scratch to accomplish C++’s goals were the only thing Rust
did, it would still result in a much better programming language, more
ergonomic and with fewer pitfalls.</p>
<p>Some criticize Rust by saying that in another 30 or 50 years, Rust will
end up as convoluted as C++ is now. This criticism has confused me,
because it seems possible, even likely, that this is true, but that
doesn’t strike me as a reason to not (gradually and responsibly) switch
from C++ to Rust (especially for new projects or for when rewrites are
particularly called for). If this is true, that just means programming
languages are subject to entropy and obsolescence like everything
else. And in that case, C++ will just continue to get worse, Rust will
also continue to get worse, and Rust will be better than C++ the entire
time. If all programming languages accrue cruft as they age, in what
world is that a reason to use the cruftier programming language?</p>
<p>Most Rustaceans are not, despite the stereotype, treating Rust as some
apocalyptic, messianic programming language to end all programming
languages. I wouldn’t be surprised if 20 or 30 years from now, a new
programming language will emerge, accomplishing the same goals from a
fresh start. And when that happens, I will probably advocate in favor
of this new programming language just like I now advocate in favor of
Rust.</p>
<p>The goal isn’t to have an eternally good programming language; the
goal is to have tools now. What should new projects be written in now?
When a rewrite is called for (as it sometimes is), should it include a
new programming language now that there is a viable alternative?</p>
<p>I suspect that many making this argument are including an unstated
assumption – that C++’s cruft is actually a sign of its maturity, and
fitness for production use. Alternatively, and a little more charitably,
they might assume that Rust isn’t ready for production use yet, and by
the time it is, it will be just as crufty as C++, perhaps converging to
the same level of cruft. But while there are a few categories where Rust
lags C++, they are mistaken in the big picture. For the vast majority
of C++ projects, Rust is already a better option if the project had
to be rewritten from scratch (a big “if,” but irrelevant to the merits
of the programming languages).</p>
<h1 id="rust-deficits">Rust Deficits</h1>
<p>Rust has a few downsides compared to C++.</p>
<p>Interfacing with C is an important goal for reasons besides
backwards-compatibility. On many platforms, C serves as a
lowest-common-denominator programming language, and its ABI serves as an
<a href="https://faultlore.com/blah/c-isnt-a-language/">inter-language protocol</a>.
C++ does provide smoother interfacing with this protocol than Rust does.</p>
<p>Relatedly, C++ generally has a relatively stable ABI on a given platform
for a given compiler vendor. This allows dynamic libraries to be used
as plugins with minimal glue code, something that in Rust normally
requires awkwardly working through a C ABI interface. Personally, I think
machine-language plugins as dynamically loaded libraries are mostly
a relic of past software distribution models, and haven’t seen many
situations where they make sense, but I could think of a few edge cases.</p>
<p>In both of these cases, Rust is clumsier, but not completely incapable.
Rust still can speak the protocol that is the C ABI, just not as natively
and smoothly-integrated as C++.</p>
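<p>To give a feel for the glue involved, here is a minimal sketch with hypothetical names. <code>#[repr(C)]</code> requests a C-compatible struct layout, and <code>extern "C"</code> selects the C calling convention. Actually exporting the symbol to a C program, or importing one from a C library, needs further annotations and declarations that are left out here.</p>

```rust
/// `#[repr(C)]` asks for the field layout a C compiler would use for
/// `struct { double x; double y; }`.
#[repr(C)]
#[derive(Clone, Copy)]
struct Point {
    x: f64,
    y: f64,
}

/// `extern "C"` makes this function use the platform's C calling convention,
/// so its address could be handed to C code as a callback.
extern "C" fn point_norm_squared(p: Point) -> f64 {
    p.x * p.x + p.y * p.y
}

fn main() {
    // The ABI is part of the function pointer's type.
    let callback: extern "C" fn(Point) -> f64 = point_norm_squared;
    assert_eq!(callback(Point { x: 3.0, y: 4.0 }), 25.0);
}
```

<p>None of this is hard, but each annotation is something C++ mostly gets without asking.</p>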
<p>Other downsides of Rust have to do with network effects and Rust
adoption. There is only one Rust compiler, while there are multiple
C++ compilers that work together through a standards process. GCC
is currently in the process of getting Rust support, and we’ll see
how well that works out for Rust.</p>
<p>Similarly, there are a lot of libraries that exist in C++ that don’t
yet exist in Rust or have Rust bindings. Though that’s true of any
pair of programming languages, it is a specific reason some developers
might still want to write new projects in C++ instead of Rust.</p>
<p>Finally, while I still think Rust would be a better programming language
than C++ even if unsafe code were allowed everywhere, I think Rust
could do more to make its rules clearer in the unsafe realm. The fact
that the latest research on Rust’s memory models seems so deeply
difficult to square with how <code>async</code> code often works <a href="https://github.com/rust-lang/unsafe-code-guidelines/issues/148">as in this
bug report</a>
makes me nervous.</p>
<p>I’m sure there are other ways in which Rust is behind C++, and the
devil is as always in the details. I’m sure I’ll find out about some
of them as soon as I post this post.</p>
<h1 id="conclusion">Conclusion</h1>
<p>These were all topics I’ve discussed in other blog posts, but I hope this
brings some perspective on how I think about the programming languages
in general, and provides a conceptual framework for thinking about some
of my other posts. I was a fan of C++ because of its goals, and I’m now
a fan of Rust because I think Rust pulls them off better. When I was
skeptical of Rust, it was because I did not think Rust would pull them
off better, but that was due to a misunderstanding.</p>
<h1 id="next-steps">Next Steps</h1>
<p>I am considering using (a revised version of) this post
as an introduction, and then trying to bring all of my
Rust vs C++ content into an <code>mdbook</code> so it could be more of a
<a href="https://hapgood.us/2015/10/17/the-garden-and-the-stream-a-technopastoral/">garden</a>.
It would have a title like “Rust: A Better C++ Than C++” and be licensed
under some CC non-commercial license, and it would accept MRs from
other people as a community resource for consolidating resources on this
particular issue. Then, if I had further ideas I could put them in there.
What do people think of that idea?</p>
<p>I realize now that I write this that the repo where I
already have the bones of this idea is actually <a href="https://github.com/jhartzell42/rust-c-book">already
public</a>. I think I’m going
to restart from scratch with just a reorganization of existing blog posts,
and save the more ambitious ideas in those notes files for later. What
do people think?</p>
Strattera in 100 Wordshttps://www.thecodedmessage.com/posts/100-words/2023-04-05T00:00:00+00:00It is half past noon. I have done no work, not even from bed, where I somehow still am.
My laptop is on, to-do list open. I’ve checked messages, even replied to one or two, but when I go to do something, I find myself several minutes later not having done it.
I’d used the bathroom instead. Or something else? What was it? What was I going to do first, again?<p>It is half past noon. I have done no work, not even from bed, where I
somehow still am.</p>
<p>My laptop is on, to-do list open. I’ve checked messages, even replied to
one or two, but when I go to do something, I find myself several minutes
later not having done it.</p>
<p>I’d used the bathroom instead. Or something else? What was it? What was
I going to do first, again?</p>
<p>Then I look at my side table. Oh, that makes sense! I am relieved and
upset at the same time.</p>
<p>I had forgotten to take my meds last night.</p>
Guest Collaboration: Paradigm Shifthttps://www.thecodedmessage.com/posts/paradigm-shift/2023-03-28T00:00:00+00:00Does the choice of programming language matter?
For years, many programmers would answer “no”. There was an “OOP consensus” across languages as different as C++ and Python. Choice of programming language was just a matter of which syntax to use to express the same OOP patterns, or what libraries were needed for the application. Language features like type checking or closures were seen as incidental, mere curiosities or distractions.
To the extent there was a spectrum of opinions, it was between OOP denizens and those that didn’t really think software architecture mattered at all – a feeble attempt at corporatization against true programmers and their free-spirited ways.<p>Does the choice of programming language matter?</p>
<p>For years, many programmers would answer “no”. There was an “OOP
consensus” across languages as different as C++ and Python. Choice
of programming language was just a matter of which syntax to use to
express the same OOP patterns, or what libraries were needed for the
application. Language features like type checking or closures were seen
as incidental, mere curiosities or distractions.</p>
<p>To the extent there was a spectrum of opinions, it was between OOP
denizens and those that didn’t really think software architecture mattered
at all – a feeble attempt at corporatization against true programmers
and their free-spirited ways. The office park versus the squatters. That’s
how we got the wave of so-called “scripting languages”.</p>
<p>But OOP was the least of their concerns. They shrugged along with some
sort of class system, and saved their criticism for (static) types and
compilation (an implementation strategy, not a language property).</p>
<p>Now, times are changing. When in the last 30 years have we seen so many
concurrent pivots in major languages?</p>
<p>Perhaps it began with lambdas. Once, they were seen as curiosities from
the functional world, a special case of an OOP class overriding a single
method (which is exactly how you had to write them in C++ and Java). Now,
Java has lambdas. Even JavaScript thought its <code>function()</code> syntax was too
heavy, replacing it with a lighter-weight <code>=></code>. Hold up, even <em>Excel</em> has
lambdas. Functional programming has intruded against the mainstream consensus.</p>
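<p>For anyone who missed the pre-lambda era, the contrast looks roughly like this. The sketch is in Rust, and all the names are made up. The first half spells out a predicate as a type with a single method, the way one once had to; the second passes a closure, whose captured variable plays the role of the field.</p>

```rust
/// The pre-lambda style: a type whose only job is to carry one method.
trait IntPredicate {
    fn test(&self, value: i32) -> bool;
}

struct GreaterThan {
    threshold: i32,
}

impl IntPredicate for GreaterThan {
    fn test(&self, value: i32) -> bool {
        value > self.threshold
    }
}

fn count_matching_object(values: &[i32], predicate: &dyn IntPredicate) -> usize {
    let mut count = 0;
    for &v in values {
        if predicate.test(v) {
            count += 1;
        }
    }
    count
}

/// The lambda style: any callable taking an `i32` and returning `bool`.
fn count_matching_closure(values: &[i32], predicate: impl Fn(i32) -> bool) -> usize {
    let mut count = 0;
    for &v in values {
        if predicate(v) {
            count += 1;
        }
    }
    count
}

fn main() {
    let values = [1, 5, 10];
    let threshold = 4;
    let with_object = count_matching_object(&values, &GreaterThan { threshold });
    // The closure captures `threshold` instead of storing it in a field.
    let with_closure = count_matching_closure(&values, |v| v > threshold);
    assert_eq!(with_object, with_closure);
}
```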
<p>When this intrusion broke through, the old equilibrium cracked. Both the
OOP consensus and scripting language counterculture started to crumble.
Now, JavaScript, Python, and Ruby are getting type checking. Java is
getting a whole mish-mash of “functional” features. C++ is de-emphasizing
inheritance and doubling down instead on templates. Even Go is getting
generics.</p>
<p>So here we’ve reached a funny point. Before we had a bunch of languages
which roughly did the same thing. Now we have the same bunch of languages
all adopting the same features they never dreamed of having before. Within
that cohort there is <em>still</em> little reason to adopt one or another,
but over time there are clear reasons to choose the newer versions over
the older versions. You might not care about Java vs Go, but you sure
as hell want the version with generics over the versions without them.</p>
<p>So among 20+ year old languages, the choice of languages absolutely
matters for programmers with time machines (or contemplating Debian
stable), but what about for the rest of us?</p>
<p>Well, there are newer languages now mainstream (enough) too. And here we
find the front of the pack, the language bringing functional features
into the mainstream more completely and thoroughly than others (because
being born with them helps): Rust.</p>
<p>There are other languages zooming out in front of the pack, leading Rust
just as Rust leads the others. Being way out ahead is exciting. But it
can be lonely. It might be cold. And you might run out of steam. Being at
the front of the pack, the furthest along of the mainstream, is nice. You
still see where we’re going better. You go there early. But you’re not
alone; you’re shoulder to shoulder with others doing the same.</p>
<p>If that sounds nice, learn Rust. Don’t learn it as a mish-mash of exotic
cool features. And don’t let it lull you into thinking you must do some
sort of whiz-bang systems programming that almost no one does.</p>
<p>Learn Rust, idiomatic Rust, yes, for solving all the mundane problems
you face in your programming life, but also to get a head start on what
will be the next era of accepted programming practice. Learn type classes
(aka traits) in their full power (and not just the object-safe ones),
and learn how Rust’s move semantics can be used to simulate type-state.</p>
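<p>As a taste of that last point, here is a minimal type-state sketch. The connection types are hypothetical, invented for this example. Each state is its own type, and each transition consumes the old value, so calling a method in the wrong state, or on a value that has already moved on, is a compile error rather than a run-time check.</p>

```rust
/// A connection that has not been opened yet.
struct Closed;

/// A connection that is open; only this type has `send`.
struct Open {
    sent: Vec<String>,
}

impl Closed {
    /// Consumes the closed connection and returns an open one.
    fn open(self) -> Open {
        Open { sent: Vec::new() }
    }
}

impl Open {
    fn send(&mut self, message: &str) {
        self.sent.push(message.to_string());
    }

    /// Consumes the open connection; the old value cannot be used afterwards.
    fn close(self) -> (Closed, Vec<String>) {
        (Closed, self.sent)
    }
}

fn demo() -> Vec<String> {
    let conn = Closed;
    // conn.send("hi"); // would not compile: `Closed` has no `send` method
    let mut conn = conn.open();
    conn.send("hello");
    conn.send("world");
    let (_closed, sent) = conn.close();
    // conn.send("again"); // would not compile: `conn` was moved by `close`
    sent
}

fn main() {
    println!("{:?}", demo());
}
```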
<p>These features might seem niche now, but remember, so once did lambdas.</p>
Rust Tidbits #1https://www.thecodedmessage.com/posts/rust-tidbits-1/2023-03-24T00:00:00+00:00This is a collection of little Rust thoughts that weren’t complicated enough for a full post. I saved them up until I had a few, and now I’m posting the collection. I plan on continuing to do this again for such little thoughts, thus the #1 in the title.
serde flattening What if you want to read a JSON file, process some of the fields, and write it back out, without changing the other fields?<p>This is a collection of little Rust thoughts that weren’t complicated
enough for a full post. I saved them up until I had a few, and now I’m
posting the collection. I plan to keep doing this for
such little thoughts, thus the #1 in the title.</p>
<h1 id="serde-flattening"><code>serde</code> flattening</h1>
<p>What if you want to read a JSON file, process some of the fields, and
write it back out, without changing the other fields? Can you still use
<code>serde</code>? Won’t it only keep fields that you know about in your data
structure?</p>
<p>Turns out, you can parse the fields you want, while also just
preserving the fields you don’t!</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#75715e">#[derive(Serialize, Deserialize)]</span>
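</span></span><span style="display:flex;"><span><span style="color:#66d9ef">pub</span> <span style="color:#66d9ef">struct</span> <span style="color:#a6e22e">Record</span> { <span style="color:#75715e">// `Record` is a placeholder name</span>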
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">pub</span> known_field: <span style="color:#a6e22e">KnownField</span>,
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">pub</span> known_field2: <span style="color:#a6e22e">KnownField2</span>,
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#75715e">#[serde(flatten)]</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">pub</span> unknown_fields: <span style="color:#a6e22e">BTreeMap</span><span style="color:#f92672"><</span>String, serde_json::Value<span style="color:#f92672">></span>,
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>I found out about this <a href="https://serde.rs/attr-flatten.html#capture-additional-fields">in the <code>serde</code>
documentation</a>,
so it’s not an original insight, but it came in handy for me recently
and so I’m trying to raise awareness.</p>
<h1 id="let-surprises"><code>let</code> surprises!</h1>
<p>So, in Jon Gjengset’s popular Twitter thread <a href="https://www.thecodedmessage.com/posts/trivia-rust-types/">transcribed
here</a>, he wrote this:</p>
<blockquote>
<p>Did you know that whether or not <code>let _ = x</code> should move <code>x</code> is actually
fairly subtle?
<a href="https://github.com/rust-lang/rust/issues/10488">https://github.com/rust-lang/rust/issues/10488</a></p>
</blockquote>
<p>I didn’t think much of this, besides making a note to self not to use
<code>let _ = x</code> to ever drop anything, which hopefully I wouldn’t have done
anyway because <code>drop(x)</code> is much more self-evident in what it intends.
I remember also vaguely hoping that it did drop, because in my mind that
was the obvious, logical thing for it to do.</p>
<p>But then later, as I was writing a <code>match</code>, I realized why <code>_</code> couldn’t
mean drop, from the match context:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">match</span> foo.bar.baz {
</span></span><span style="display:flex;"><span> MyEnum::Option1(_) <span style="color:#f92672">=></span> {
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// This shouldn't move from `foo.bar.baz`, but just
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#75715e">// inspects whether it is `MyEnum::Option1`. Otherwise, there'd
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#75715e">// be no straight-forward way to perform that inspection!
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#75715e">//
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#75715e">// And indeed, it doesn't.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> None
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span> MyEnum::Option2(<span style="color:#66d9ef">ref</span> baz_inner) <span style="color:#f92672">=></span> {
</span></span><span style="display:flex;"><span> Some(foobar(baz_inner))
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>So, if <code>let _ = x</code> was to be consistent with this use case, well,
that meant that <code>_</code> has to not drop, as it’s important for <code>_</code> to
mean the same thing. And, after all, the left-hand side of a <code>let</code>
is just another pattern context!</p>
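<p>As a minimal sketch of the same point in a <code>let</code> context: the made-up <code>Noisy</code> type below (purely illustrative, not from any library) records a message when it is dropped. Under that assumption, the event list shows that <code>let _ = a;</code> leaves <code>a</code> alive and usable, while <code>drop(b)</code> ends <code>b</code> right away:</p>

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical helper type: pushes a message into a shared list when dropped.
struct Noisy {
    name: &'static str,
    events: Rc<RefCell<Vec<String>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.events.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Returns the sequence of events observed while exercising `let _ = ...`.
fn wildcard_demo() -> Vec<String> {
    let events = Rc::new(RefCell::new(Vec::new()));
    {
        let a = Noisy { name: "a", events: Rc::clone(&events) };
        // `_` binds nothing, so `a` is neither moved nor dropped here.
        let _ = a;
        events.borrow_mut().push("after let _ = a".to_string());
        // `a` is still owned by this scope and can still be used.
        events.borrow_mut().push(format!("a is still {}", a.name));

        let b = Noisy { name: "b", events: Rc::clone(&events) };
        // `drop` takes ownership, so `b` is destroyed immediately.
        drop(b);
        events.borrow_mut().push("after drop(b)".to_string());
    } // `a` goes out of scope and is dropped here.

    let result = events.borrow().clone();
    result
}
```

<p>Calling <code>wildcard_demo()</code> yields the events in order: the two messages about <code>a</code>, then <code>drop b</code>, then <code>after drop(b)</code>, and only at the end of the inner block <code>drop a</code>.</p>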
<p>But wait, I thought! Does this mean that you can write <code>let ref x = y;</code>? Yes, it does. It’s just another way of writing <code>let x = &y;</code>…
But just because you can write it that way, doesn’t mean you should.
Keeping to idiom is important.</p>
<p>Nevertheless, fun fact! The more you know!</p>
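<p>For the curious, here is a tiny sketch of that equivalence. The function name <code>ref_pattern_demo</code> is made up for illustration; it simply checks that both spellings produce a reference to the same value:</p>

```rust
/// Returns true if `let ref a = y;` and `let b = &y;` point at the same value.
fn ref_pattern_demo() -> bool {
    let y = String::from("hello");

    // `ref` in a pattern borrows the matched value instead of moving it.
    let ref a = y;
    // The idiomatic spelling of the same borrow.
    let b = &y;

    // Both bindings have type `&String`; compare the addresses they hold.
    let a: &String = a;
    std::ptr::eq(a, b)
}
```

<p>The idiomatic form is still <code>let b = &y;</code>; the <code>ref</code> keyword earns its keep mainly inside larger patterns, like the <code>match</code> arm above.</p>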
<h1 id="remember-serde-structs-can-be-function-local">Remember: <code>serde</code> <code>struct</code>s Can Be Function-Local</h1>
<p>Let’s say you need to extract three fields out of some JSON, like
<code>name</code>, <code>age</code>, and <code>phone_number</code> (which, ironically, is a string in
JSON terms, and not a number). One of the great things about Rust
and <code>serde</code> is that you can just write those fields in a <code>struct</code>
with the <code>Deserialize</code> trait (which is derivable)
and grab the values into such a struct, even if there are
other fields in the JSON:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#75715e">#[derive(Deserialize)]</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">struct</span> <span style="color:#a6e22e">Person</span> {
</span></span><span style="display:flex;"><span> name: String,
</span></span><span style="display:flex;"><span> phone_number: String,
</span></span><span style="display:flex;"><span> age: <span style="color:#66d9ef">f64</span>,
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">let</span> person: <span style="color:#a6e22e">Person</span> <span style="color:#f92672">=</span> serde_json::from_str(json_str)<span style="color:#f92672">?</span>;
</span></span></code></pre></div><p>The question then becomes, where should <code>Person</code> go? Well, if you
plan on passing around this <code>Person</code> value, and structuring the rest
of your code in terms of it, then it should be a prominent type.</p>
<p>But more often, especially in my own code, I immediately split such
a structure into its constituent parts, which I then will use for
other things:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">let</span> Person {
</span></span><span style="display:flex;"><span> name,
</span></span><span style="display:flex;"><span> phone_number,
</span></span><span style="display:flex;"><span> age,
</span></span><span style="display:flex;"><span>} <span style="color:#f92672">=</span> serde_json::from_str(json_str)<span style="color:#f92672">?</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">let</span> handle <span style="color:#f92672">=</span> person_database.lookup(<span style="color:#f92672">&</span>name)<span style="color:#f92672">?</span>;
</span></span><span style="display:flex;"><span>handle.set_phone_number(<span style="color:#f92672">&</span>phone_number);
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">let</span> demographic <span style="color:#f92672">=</span> demographic_for_age(age.trunc() <span style="color:#66d9ef">as</span> <span style="color:#66d9ef">u32</span>);
</span></span></code></pre></div><p>This is very reasonable. It makes sense that our internal data
structures would be designed for whatever logic we want to do on
them, rather than having them coincidentally match the wire format.
For most complicated applications, having the internal data
format match the wire format literally is actually sort of a code
smell.</p>
<p>So, we often will have types that we use to deserialize (and serialize)
JSON in exactly one function. In that situation, the type should in
fact be written locally to that function. So in the example above,
where <code>struct Person { ... }</code> is immediately followed by the
<code>serde_json::from_str</code>, I didn’t just write them next to each
other as convenience. I would literally put them together in
a function:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">do_thing</span>(json_str: <span style="color:#66d9ef">&</span><span style="color:#66d9ef">str</span>) -> Result<span style="color:#f92672"><</span>()<span style="color:#f92672">></span> {
</span></span><span style="display:flex;"><span> do_something_else()<span style="color:#f92672">?</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#75715e">#[derive(Deserialize)]</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">struct</span> <span style="color:#a6e22e">Person</span> {
</span></span><span style="display:flex;"><span> name: String,
</span></span><span style="display:flex;"><span> phone_number: String,
</span></span><span style="display:flex;"><span> age: <span style="color:#66d9ef">f64</span>,
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> Person {
</span></span><span style="display:flex;"><span> name,
</span></span><span style="display:flex;"><span> phone_number,
</span></span><span style="display:flex;"><span> age,
</span></span><span style="display:flex;"><span> } <span style="color:#f92672">=</span> serde_json::from_str(json_str)<span style="color:#f92672">?</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> handle <span style="color:#f92672">=</span> person_database.lookup(<span style="color:#f92672">&</span>name)<span style="color:#f92672">?</span>;
</span></span><span style="display:flex;"><span> handle.set_phone_number(<span style="color:#f92672">&</span>phone_number);
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> demographic <span style="color:#f92672">=</span> demographic_for_age(age.trunc() <span style="color:#66d9ef">as</span> <span style="color:#66d9ef">u32</span>);
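</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> Ok(())
</span></span><span style="display:flex;"><span>}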
</span></span></code></pre></div><p>I bring this up mostly because many programmers don’t seem to be
aware that you can do this, or don’t think to. I’ve seen people write
types like <code>Person</code> at the top level. I realize that many programming
languages either don’t let you do this sort of embedding, or else strongly
discourage it. But I’m a big believer in giving things the least scope
they need, and for many <code>serde</code>-related types, that’s function scope.</p>
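<p>The scoping idea itself does not depend on <code>serde</code>. Here is a std-only sketch, where a hand-written parse of a hypothetical <code>name,age</code> line stands in for <code>serde_json::from_str</code> so that no external crates are needed. The function name <code>describe_person</code> and the input format are assumptions made for illustration:</p>

```rust
/// Parses a hypothetical "name,age" line and describes the person.
/// Returns `None` if a field is missing or the age is not a number.
fn describe_person(line: &str) -> Option<String> {
    // Declared inside the function: no other code can name this type.
    struct Person<'a> {
        name: &'a str,
        age: u32,
    }

    // Stand-in for the deserialization step.
    let mut fields = line.split(',');
    let person = Person {
        name: fields.next()?.trim(),
        age: fields.next()?.trim().parse().ok()?,
    };

    // Immediately split the structure into its constituent parts.
    let Person { name, age } = person;
    Some(format!("{} is {} years old", name, age))
}
```

<p>Because <code>Person</code> lives inside <code>describe_person</code>, a reader knows at a glance that it is only a parsing aid, and a second function is free to define its own, differently shaped <code>Person</code>.</p>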
<h1 id="rust-shadowing">Rust Shadowing</h1>
<p>Speaking of minimal scope, I wanted to write in praise of Rust’s penchant
for shadowing that allows you to not have to come up with a bunch of names
for the same thing. Oftentimes, we just convert the same information
from type to type: wire format in bytes, to parsed wire format, to
application domain format (wrapped in an <code>Option</code> in a <code>Result</code>), to
application domain format with errors and absence handled (not wrapped
in those things). Fortunately, Rust lets us shadow and re-use names
for these different variables, and ultimately we get code that looks
something like this (although no type annotations are normally necessary):</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">let</span> foo: <span style="color:#a6e22e">FooTypeC</span> <span style="color:#f92672">=</span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> foo: <span style="color:#a6e22e">FooTypeA</span> <span style="color:#f92672">=</span> get_foo();
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">let</span> foo: <span style="color:#a6e22e">FooTypeB</span> <span style="color:#f92672">=</span> transform_foo(<span style="color:#f92672">&</span>foo)<span style="color:#f92672">?</span>;
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">match</span> foo {
</span></span><span style="display:flex;"><span> Some(foo) <span style="color:#f92672">=></span> transform_foo_again(foo)<span style="color:#f92672">?</span>,
</span></span><span style="display:flex;"><span> None <span style="color:#f92672">=></span> FooTypeC::default(),
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>};
</span></span></code></pre></div><p>This is really helpful, along with the fact that braces <code>{</code> … <code>}</code>
enclose expressions, in really minimizing how much scope each variable
has. But it’s also really helpful, because if shadowing wasn’t available,
what would we name all these different variables? <code>foo_a</code> and <code>foo_b</code>
and similar stupid names? This is an issue in certain other programming
languages where shadowing isn’t as straight-forward, and the <a href="https://www.thecodedmessage.com/posts/hungarian/">results
aren’t fun</a>.</p>
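<p>Here is a runnable, std-only sketch of the same shape, going from bytes to text to an optional value to a final number. The function <code>parse_port</code> and the fallback value <code>8080</code> are made up for illustration:</p>

```rust
/// Turns raw bytes into a port number, reusing the name `port` at each stage.
/// An empty (or all-whitespace) input falls back to a hypothetical default.
fn parse_port(raw: &[u8]) -> Result<u16, String> {
    let port: u16 = {
        // Stage 1: bytes to text.
        let port: &str = std::str::from_utf8(raw).map_err(|e| e.to_string())?;
        // Stage 2: text to "maybe present" text.
        let port: Option<&str> = Some(port.trim()).filter(|s| !s.is_empty());
        // Stage 3: handle absence and parse errors.
        match port {
            Some(port) => port.parse::<u16>().map_err(|e| e.to_string())?,
            None => 8080,
        }
    };
    Ok(port)
}
```

<p>Each intermediate <code>port</code> goes out of scope at the closing brace, so only the final <code>u16</code> is visible to the rest of the function.</p>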
Treat Tolkien's World Like Other Mythologies https://www.thecodedmessage.com/posts/tolkien/ 2023-03-23T00:00:00+00:00
<p>Tolkien was trying to make a new mythology, a new set of deeply resonant
stories, for modern (especially English) culture, and he succeeded.
He transformed fantasy, and founded the concept of <em>high fantasy</em>. His
detailed <em>legendarium</em> (as his mythology is called) is a masterpiece of
world-building, with deep symbolism and emotional complexity, a mythology
with arguably more depth and room to explore than many ancient ones.
Tolkien scholars work full-time to study it, and many more people draw
from it explicitly and implicitly for their own art, in D&D and other
more modern fantasy settings. Especially with his near-human species,
his concepts of hobbits (off-brand as halflings) and elves (distinct
from previous iterations) have deeply resonated with many people.</p>
<p>And yet, relatively few of the works from his legendarium are actually
that enjoyable (or even feasible) to read. Only the works published in
his lifetime, <em>The Hobbit</em> and <em>The Lord of the Rings</em>, are readable as
literature by modern audiences – and many modern readers even struggle
to get through the reams of poetry and milieu-building that make up
<em>The Lord of the Rings</em>, with many fans only familiar with the much more
digestible Peter Jackson movies.</p>
<p>As for the posthumously published works, even though they more fully
flesh out his beautiful and intricate imagined history – detailing,
for example, the character of the elves, the creation of that world –
they are extraordinarily dense and heavy reading. <em>The Silmarillion</em>
has famously been compared by many readers to the <em>Old Testament</em>, and
this was definitely meant as an insult – although I personally am a
huge fan of the <em>Hebrew Bible</em> as literature, this is a large part of
why my friends consider me such an eccentric.</p>
<p>It’s easy to understand why. <em>The Silmarillion</em> was composed in a
remarkably similar process to how historical-critical scholars say
the Hebrew Bible was composed. That is to say, both were composed
after the fact, by a redactor. For both, this redactor stitched together
somewhat-contradictory stories in their rudimentary form into a consistent
order, with minimal editing and no attempt at expansion. Many of the
original sources read as synopses, and the only consistency of voice is
self-conscious archaism and reverence.</p>
<p>And the more recent publications are worse, and even more like
modern editions of ancient texts. <em>Beren and Luthien</em> is stitched
together between poetry and prose, and as full of footnotes as many
“study Bibles” I’ve seen. This is beautiful work – or at least the
source material is. But rather than presented in a form that can be
enjoyed, it is only presented in a form that can be studied. It is
treated like we treat the literature of an ancient civilization (as
C.S. Lewis complained about in his <a href="https://www.uniontheology.org/resources/doctrine/jesus/introduction-to-on-the-incarnation">Introduction to Athanasius’ <em>On the
Incarnation</em></a>),
rather than the writing of someone who died within living memory and
whose works are now a multi-billion dollar media franchise.</p>
<p>And that’s a damn shame, because the world that Tolkien created is
beautiful, and the stories that he grew within it are beautiful, and
they deserve a presentation, a literary realization, as beautiful as the
underlying concepts. You shouldn’t have to be the type of nerd who is
intrinsically driven to read through tedious notes to see the underlying
beauty that is the Tolkien <em>legendarium</em> – the First and Second Ages
of Middle-earth deserve to be portrayed through engaging, well-written
literature, like the end of the Third Age is in <em>The Lord of the Rings</em>,
rather than a study Bible that can only be read by those whose devotion
to Tolkien borders on religious – the hyper-nerds (and I number myself
among them) whose very existence testifies to how great the ideas are.</p>
<p>Unfortunately, too many Tolkien hyper-nerds feel that their ability
to access this is a compliment towards them – that it’s the reader’s
fault for not liking or being able to get through <em>The Silmarillion</em>,
for not being dedicated enough. But this attitude – in addition to
being arrogant, ableist, patronizing, and damaging to the reputation of
Tolkien fans as a whole – is simply not worthy of the beautiful world
that Tolkien built. Tolkien was trying to craft this world into publishable
books.</p>
<p>That world should be accessible to as many people as possible, not only
some “elites” who are willing to go footnote-diving. If that world
is so beautiful that it inspires some people to do incredible feats
of research to try to understand it, it is beautiful enough that
it should also be shown to those who (quite reasonably) don’t
have the time or energy for such activities.</p>
<p>This is a solvable problem – in fact, many other literary franchises have
solved it handily: The franchise should be opened up to collaborators.
There would be no shortage of talent: Tolkien’s work is well-loved within
the genre, even foundational. Many fantasy authors – top-tier ones –
would consider it a great honor to be able to write within Tolkien’s
<em>legendarium</em> in an authorized fashion.</p>
<p>But for it to work, the Tolkien estate would have to allow those
collaborators to do their job. It would have to allow them to question and
add complexity to Tolkien’s themes, to explore some of the awkward
components (like the moral status of the orcs, or questions of races of
men and apparent races of elves) at their own discretion. They have
to be allowed to make adjustments to the canon – something Tolkien
would’ve done freely himself.</p>
<p>Perhaps it would be made easier if there was no attempt to keep the
extended canon strictly consistent, if they rejected that as a possible
goal from the outset. Some rules and negotiation will doubtless be
necessary, but complete alignment to canon and literary excellence are
fundamentally incompatible goals – and literary excellence the more
important one.</p>
<p>Because of course, these works are literally
not scripture, or ancient texts. They are <em>modern
fiction</em>, and like many works of fiction they deserve to
be taken seriously – but not religiously. <a href="https://www.nytimes.com/2022/09/21/world/europe/giorgia-meloni-lord-of-the-rings.html">Those who do take it
religiously</a>
are not the best company to keep.</p>
<p>But even if they were ancient texts, allowing an open, logically
fluid canon would be appropriate. After all, Tolkien was trying to
build a modern mythology – a <em>legendarium</em> – and ancient mythologies
are contradictions and fanfic the whole way down. Remember Achilles'
Heel? It seems a core part of the mythos. But not only are there multiple
versions of that story that disagree on how the rest of him was made
invulnerable, it also doesn’t even appear in the <em>Iliad</em>, which considers
him as vulnerable as any mortal. There are even versions of Achilles'
story where he dies a normal death being shot in the back.</p>
<p>Throughout antiquity, every time a new poet or playwright would set a
Greek myth to writing, they’d put their own spin on it. When modern
writers do the same, they’re not ignoring or changing or misrepresenting
Greek mythology, but just continuing the same pattern.</p>
<p>This is nothing against Tolkien scholarship, and trying to study his mind
and the original intent behind the <em>legendarium</em> – that is also a
good thing. But perhaps that scholarship would be more useful if
it had an outlet in the creation of new works.</p>
<h1 id="the-tv-show">The TV Show</h1>
<p>Of course, so far I’ve avoided the elephant in the room – the new
<em>Rings of Power</em> TV show. So I will address that now.</p>
<p>I acknowledge many of the problems with it. I was particularly
disappointed by the “elves taking our <del>jobs</del> trades” concept.
Rather than express the original, interesting reasons that men had
become bigoted against elves in Númenor – jealousy of Elvish immortality
and closeness to the gods – they fell back on a cheap political
reference. I’m OK with changing the canon, but it doesn’t work. Elves
are the colonial power, the more privileged species, and “taking our
jobs” is generally a line that is used by the more privileged against the
less privileged. It makes no thematic sense, and I have to simply pretend
they said something else in order to keep watching the show.</p>
<p>(Diverse casting is, to be clear, not a problem with them. It’s a fantasy
world, and they’re actors. Literally all of the elves are also – gasp! –
depicted by non-elvish actors. It’s unfortunate that that conversation
took up much-needed space for better conversations about the show.)</p>
<p>But that’s the risk of opening the canon up, and I accept it. I did
enjoy <em>Rings of Power</em> a lot, just for depicting on screen many places
and events that were emotionally resonant for me. I am happy it was made,
flaws and all – while still not really counting it as “canon” in my mind.
I hope they recover from many of their flaws, and I hope more work
like it is done.</p>
<p>Because more important than “getting everything right,” it presented this
world in a way that many of my friends could enjoy. Basically none of
my friends would be willing to read <em>The Silmarillion</em> just because they
would enjoy discussing it with me (or for any reason at all). But many
of my friends watched the show, and enjoyed it, and those discussions
have been great.</p>
<p>Now, imagine if more such works were made, and by established fantasy greats!</p>
The Importance of Logging https://www.thecodedmessage.com/posts/logging/ 2023-03-21T00:00:00+00:00
<p>Intro programming classes will nag you to do all sorts of
programming chores: make sure your code actually compiles,
write unit tests, write comments, split the code into functions
(though sometimes the commenting and factoring advice is
<a href="https://www.hillelwayne.com/post/what-comments/">bad</a>). Today, however,
I want to talk about one little chore, one particular little habit, that
is just as essential as all of those things, but rarely covered in the
CS100 lectures or grading rubrics: logging.</p>
<p>And why am I choosing this particular topic for a blog post today?
Simple: It’s to punish an earlier version of myself for not logging
enough, for not caring about logging enough. It turns out it’s important.
But I’ll get back to the OOP blog series soon enough, don’t worry!</p>
<p>Logging – writing text describing what’s been happening in your program
to a file or other storage system – is essential for any software
system. Luckily, Rust has a (nearly) standard logging framework,
technically outside the standard library but maintained by many of
the same people and solidly endorsed by the community: <a href="https://docs.rs/log/latest/log/">the <code>log</code>
crate</a>. But note: Even though this post
is written specifically for Rustaceans, much of the advice and commentary
in here will apply to logging systems in all programming languages.</p>
<p>Logging is essential for debugging and troubleshooting. When you
find a bug, you need to find out which specific part of the program is
actually broken out of the many parts, because it’s often not the part
that’s visibly acting weird. This is often the first step in addressing a
new bug after <a href="https://www.thecodedmessage.com/posts/reproducibility/">reproducing it</a>, or even part of
figuring out how to reproduce it – or the step before that, so
obvious it goes without saying, of noticing that a bug exists.</p>
<p>In fact, logs can be helpful at every stage of the debugging process.
You have to confirm your assumptions on what parts are known to
work. After all, the whole program is supposed to work, and oftentimes,
the thing that’s broken is something that you would’ve assumed definitely
worked, until absolutely everything else was ruled out.</p>
<p>Every programmer understands this intuitively, even as a student or a
beginning self-taught programmer: When you are developing a project, and
it’s not working, the easiest <em>ad hoc</em> debugging technique is “debug print
statements,” a go-to technique of CS100 students worldwide. Ironically,
CS100 professors often advocate against this in favor of debuggers, in
spite of the fact that logging, the grown-up version of debug prints,
is more generally useful, as code often exhibits bugs in environments
where it didn’t happen to be running in a debugger, like production.</p>
<p>Debug prints work by accomplishing two goals:</p>
<ol>
<li>Verifying that the program got to the point of that debug print line.</li>
<li>Verifying that the data it has at that point is correct.</li>
</ol>
<p>Logging is fundamentally debug print statements, but phrased and
annotated correctly, so that it looks professional both in the code
and in the log, and uses actual logging mechanisms with
timestamps and log levels and stuff.</p>
<p>So instead of:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span>initialize_rainbows();
</span></span><span style="display:flex;"><span>println!(<span style="color:#e6db74">"Got here 2"</span>);
</span></span><span style="display:flex;"><span>initialize_sunshine();
</span></span><span style="display:flex;"><span>println!(<span style="color:#e6db74">"Got here!!!!"</span>);
</span></span></code></pre></div><p>You write the much nicer-looking:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span>initialize_rainbows();
</span></span><span style="display:flex;"><span>info!(<span style="color:#e6db74">"Rainbows fully initialized"</span>);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>initialize_sunshine();
</span></span><span style="display:flex;"><span>info!(<span style="color:#e6db74">"Sunshine fully initialized"</span>);
</span></span></code></pre></div><h1 id="when-to-log">When To Log</h1>
<p>You should log as much as possible.</p>
<p>Every time you make a decision, you should log it. Every time you query
a URL or build a string of some kind, you should log it. Every time
you load a config parameter, you should definitely log it. This might
seem silly, because you’re duplicating the configuration file, but
a bug processing configuration (or prioritizing different sources of
configuration) can be especially hard to find.</p>
<p>Logging can be used instead of comments to organize functions into
parts. If you feel the need to tell the reader of your code
what each part of a function does, perhaps you should tell your
poor ops person which parts you’ve reached in the same breath.
So instead of:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">close_out_section</span>(<span style="color:#66d9ef">mut</span> self) -> Result<span style="color:#f92672"><</span>()<span style="color:#f92672">></span> {
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// Flush dirty data
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#66d9ef">for</span> datum <span style="color:#66d9ef">in</span> <span style="color:#f92672">&</span><span style="color:#66d9ef">mut</span> self.data {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> datum.is_dirty() {
</span></span><span style="display:flex;"><span> datum.flush()<span style="color:#f92672">?</span>;
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// Close files
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#66d9ef">for</span> file <span style="color:#66d9ef">in</span> <span style="color:#f92672">&</span><span style="color:#66d9ef">mut</span> self.files {
</span></span><span style="display:flex;"><span> file.flush()<span style="color:#f92672">?</span>;
</span></span><span style="display:flex;"><span> file.close();
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> decrease_global_section_count()<span style="color:#f92672">?</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> Ok(())
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>You could write:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">close_out_section</span>(<span style="color:#66d9ef">mut</span> self) -> Result<span style="color:#f92672"><</span>()<span style="color:#f92672">></span> {
</span></span><span style="display:flex;"><span> info!(<span style="color:#e6db74">"Closing out section: {}"</span>, self.name);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> debug!(<span style="color:#e6db74">"Flushing dirty data"</span>);
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">for</span> datum <span style="color:#66d9ef">in</span> <span style="color:#f92672">&</span><span style="color:#66d9ef">mut</span> self.data {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> datum.is_dirty() {
</span></span><span style="display:flex;"><span> trace!(<span style="color:#e6db74">"{} is dirty, flushing..."</span>, datum.name);
</span></span><span style="display:flex;"><span> datum.flush()<span style="color:#f92672">?</span>;
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> debug!(<span style="color:#e6db74">"Closing files"</span>);
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">for</span> file <span style="color:#66d9ef">in</span> <span style="color:#f92672">&</span><span style="color:#66d9ef">mut</span> self.files {
</span></span><span style="display:flex;"><span> trace!(<span style="color:#e6db74">"Closing {}"</span>, file.name);
</span></span><span style="display:flex;"><span> file.flush()<span style="color:#f92672">?</span>;
</span></span><span style="display:flex;"><span> file.close();
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> debug!(<span style="color:#e6db74">"Decreasing global section count"</span>);
</span></span><span style="display:flex;"><span> decrease_global_section_count()<span style="color:#f92672">?</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> debug!(<span style="color:#e6db74">"Section successfully closed!"</span>);
</span></span><span style="display:flex;"><span> Ok(())
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>These log statements serve both as comments to your reader and information
to your administrator at the same time! And, since you are writing to
someone who is perhaps not looking at the source code, you don’t feel
silly adding even more information that’d be obvious to a reader –
which is useful also to readers of the source code, who might not share
your definition of what is obvious. In spite of what you may have heard,
it’s still a good idea to err on the side of explaining things more
<a href="https://www.hillelwayne.com/post/what-comments/">in comments</a>. (Yes,
I linked that post twice. It’s that good.)</p>
<p>You may object that all this logging might slow down your process a
little, and I can see wanting to avoid it in the middle of a computational
loop. But oftentimes, people avoid logging when there is no possible
performance excuse, when much slower I/O is happening all around it,
in comparison to which the logging would be a rounding error. Remember
that famous Donald Knuth quote: “[P]remature optimization is the root
of all evil….”</p>
<h1 id="log-levels">Log Levels</h1>
<p>In addition to performance, you might claim that the amount of logging
that I show above is spammy, and that the resulting log files would
cause an information overload.
But our programming foreparents were wise, and created an additional
tool to address both this, and the potential performance problems: log
levels.</p>
<p>An error message is different from a warning is different from information
is different from debug printing. We want to distinguish these, so we can
avoid seeing insufficiently important logs. There are many systems of log
levels, and Rust’s <code>log</code> crate endorses a pretty typical list, enumerated
in its <a href="https://docs.rs/log/latest/log/enum.Level.html"><code>Level</code> enum</a>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">pub</span> <span style="color:#66d9ef">enum</span> <span style="color:#a6e22e">Level</span> {
</span></span><span style="display:flex;"><span> Error,
</span></span><span style="display:flex;"><span> Warn,
</span></span><span style="display:flex;"><span> Info,
</span></span><span style="display:flex;"><span> Debug,
</span></span><span style="display:flex;"><span> Trace,
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>They form an ordered, descending scale of severity, so that <code>Trace</code> is
the least severe. You probably always want to enable <code>Error</code>-level logs
(though even they can be turned off) but you probably only want to enable
<code>Trace</code>-level logs if you’re doing some serious debugging.</p>
<p>In recognition of how the levels are ordered, log filtering is typically
done by setting a level, and then logs of that level or more severe
are let through. So if the level is <code>Debug</code>, <code>Warn</code> logs are also outputted,
but if it is <code>Error</code>, <code>Warn</code> logs are suppressed. See the <a href="https://docs.rs/log/latest/log/enum.LevelFilter.html"><code>LevelFilter</code>
enum</a>.</p>
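<p>Here’s a minimal sketch of that filtering rule. To keep it compilable on its own, it uses a stand-in <code>Level</code> enum that copies the declaration order shown above rather than importing the real crate, and <code>is_enabled</code> is a name made up for this illustration, not part of the <code>log</code> API.</p>

```rust
// Stand-in for the `log` crate's `Level`, so the sketch has no dependencies.
// Deriving `PartialOrd`/`Ord` orders the variants by declaration:
// `Error` sorts lowest (most severe), `Trace` sorts highest (least severe).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Level {
    Error,
    Warn,
    Info,
    Debug,
    Trace,
}

/// Hypothetical helper: a message is let through when it is at least as
/// severe as the configured level, i.e. when it sorts at or before `max`.
pub fn is_enabled(message: Level, max: Level) -> bool {
    message <= max
}
```

<p>With the level set to <code>Level::Debug</code>, <code>is_enabled(Level::Warn, Level::Debug)</code> is true, and with it set to <code>Level::Error</code>, the same warning is filtered out.</p>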
<p>Errors are for problems that stop the process, or at least the specific
thing the process was doing (e.g. API or RPC request being serviced).
Warnings are for where something seems wrong but we’re going to do it
anyway.</p>
<p>Info, debug, and trace are honestly kind of just labels with decreasingly
urgent-sounding names, levels for the sake of levels. You should
use them according to importance, so that most of the absolute nonsense
can get filtered out as mere <code>trace</code>, like implementation details
or extra information. You also want the occasional interesting
high-level stuff to be captured with <code>info</code>, like what high-level task
the process is currently working on. Medium-level tasks can get <code>debug</code>.</p>
<p>In general, the more performance-critical the code, the lower the log
level you want to use, to increase the likelihood that you’ll just
have a (very predictable) branch to indicate that you don’t need to
print that line. Then, if there’s an actual problem, an operator can
raise the log level (which they can sometimes do on a per-module basis)
when those lines are worth seeing.</p>
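<p>To make the “just a branch” point concrete, here is a small sketch under stated assumptions: <code>Level</code> is again a local stand-in, and <code>log_lazy</code> is a hypothetical helper, not the real macro machinery. The idea it illustrates is that the level check happens first, and the message is only built if the check passes. As far as I understand, the <code>log</code> macros behave similarly.</p>

```rust
// Local stand-in for the `log` crate's `Level`; see the enum above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Level {
    Error,
    Warn,
    Info,
    Debug,
    Trace,
}

/// Hypothetical logging helper: one cheap comparison decides whether the
/// (possibly expensive) message is ever built. Returns whether it logged.
pub fn log_lazy(max: Level, level: Level, build_message: impl FnOnce() -> String) -> bool {
    if level > max {
        // Filtered out: the closure is never called, so nothing is formatted.
        return false;
    }
    eprintln!("[{:?}] {}", level, build_message());
    true
}
```

<p>In a hot loop, a <code>Trace</code>-level call with the level set to <code>Info</code> costs one comparison and never runs the closure.</p>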
<p>As a corollary, configuration should use <code>info</code> and <code>warn</code> heavily, and
generally log at higher log levels. Configuration only happens once,
and in one section, so it’s allowed to be spammy. Furthermore, raising
the log level at run-time won’t help reveal more configuration logs:
unless the configuration is re-processed, you’ve just already missed
those messages. Finally, configuration is never too latency sensitive
for logging – configuration is the least performance sensitive part
of your program.</p>
<p>So there is no excuse. Loading different configuration than you thought
you had is a shockingly common cause of bugs and confusing system
behavior. Log obsessively in your configuration code, at high log levels.</p>
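<p>As one illustration of what obsessive configuration logging can look like, here is a sketch that records not just each setting’s final value but also where it came from. The function name <code>resolve</code>, the setting name, and the message wording are all invented for this example. In a real program you would pass the returned line to <code>info!</code> rather than hand it back.</p>

```rust
/// Hypothetical setting lookup: prefer an override (say, from the
/// environment), fall back to the built-in default, and return the chosen
/// value together with a log line saying which one won.
pub fn resolve(name: &str, override_value: Option<&str>, default: &str) -> (String, String) {
    match override_value {
        Some(value) => (
            value.to_string(),
            format!("config: {name} = {value:?} (from environment)"),
        ),
        None => (
            default.to_string(),
            format!("config: {name} = {default:?} (built-in default, no override set)"),
        ),
    }
}
```

<p>Someone reading the log can then tell at a glance whether the process picked up the override they thought they had set.</p>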
<h1 id="using-the-log-crate-in-your-rust-projects">Using the Log Crate in Your Rust Projects</h1>
<p>So how do we log in Rust?</p>
<p><code>log</code> is a framework – in the words of its well-written
<a href="https://docs.rs/log/latest/log/">documentation</a>, it is a “lightweight
logging facade.” The front-end is shared: You output logs through the
<code>log</code> crate itself. The backend is pluggable, meaning that different
backends exist with different features.</p>
<p>As a result, as the documentation says, libraries should
just use the <code>log</code> crate, so that when they output logs,
it will work with any backend. Applications choose the
backends, and import an appropriate crate, like for example
<a href="https://docs.rs/env_logger/latest/env_logger/index.html"><code>env_logger</code></a>.
The <code>log</code> documentation has a <a href="https://docs.rs/log/latest/log/#available-logging-implementations">list of available backend
crates</a>.</p>
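<p>The shape of that split can be sketched without any crates at all. This is not how <code>log</code> is implemented (the real crate uses a globally installed logger rather than a parameter), and <code>Backend</code>, <code>StderrBackend</code>, <code>MemoryBackend</code>, and <code>library_function</code> are all hypothetical names. It only shows the idea: library code talks to a shared interface, and the application decides what sits behind it.</p>

```rust
/// Hypothetical stand-in for the shared front-end interface.
pub trait Backend {
    fn write(&mut self, level: &str, message: &str);
}

/// One possible backend: print each line to standard error.
pub struct StderrBackend;

impl Backend for StderrBackend {
    fn write(&mut self, level: &str, message: &str) {
        eprintln!("[{level}] {message}");
    }
}

/// Another possible backend: collect lines in memory, e.g. for tests.
pub struct MemoryBackend {
    pub output: String,
}

impl Backend for MemoryBackend {
    fn write(&mut self, level: &str, message: &str) {
        self.output.push_str(&format!("[{level}] {message}\n"));
    }
}

/// "Library" code: it only knows the interface, never a concrete backend.
pub fn library_function(log: &mut dyn Backend) -> u32 {
    log.write("debug", "computing the answer");
    42
}
```

<p>The library function works unchanged whichever backend the application hands it. With the real crates, the application-side choice is typically a single call early in <code>main</code>, such as <code>env_logger::init()</code>.</p>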
<p>This split between what crates should be used by libraries as opposed
to applications is not uncommon in Rust. For example, it also comes up
with error handling, where libraries should generally use <code>thiserror</code> to
preserve error information in a way that applications can programmatically
investigate, but applications generally want to use <code>anyhow</code> or <code>eyre</code>
to ergonomically convey any errors they cannot handle to the user.</p>
Write Everything Down (Part 4): My Desktop Environment
https://www.thecodedmessage.com/posts/org4-desktop/
2023-02-28T00:00:00+00:00
<p>I’d like to share with you how I use my computer, in a way that is
(for me) <a href="https://www.thecodedmessage.com/tags/adhd/">ADHD</a> friendly and well-suited for implementing
my <a href="https://www.thecodedmessage.com/posts/my-organization-system/">organization system</a>. Tools are
important to any organizational and productivity system, and optimizing
your tools for your brain and your workflow is important. My computer
is my most important productivity tool, where my work happens, and where
my life/chore/errand/calendar organization happens, so it should be an
interesting example of an optimized key tool.</p>
<blockquote>
<p><strong>Note:</strong> I consider this a non-technical post, as it is intended for
a general audience. Even though it is about a computer set-up that
I’m not recommending to a non-technical audience, this <em>description</em>
and <em>explanation</em> of my computer set-up should be accessible enough for
everybody. However, it is also literally about computers, so it’s going
in the “Computers/Programming Posts” bucket as well – and therefore it
will show up under both feeds.</p>
</blockquote>
<p>It’s been some time since I’ve written about
<a href="https://www.thecodedmessage.com/tags/organization">organization</a> – I had basically paused the series
until further inspiration struck. I had even outlined this very post,
and considered writing in more detail about my personal computer usage,
how my desktop actually looks, and the actual techniques I use to get
this machine to work for me for programming, blogging, and planning. The
reason I didn’t was basically because I didn’t think it would be
interesting enough.</p>
<p>But inspiration did finally strike, in the form of two things that
changed my mind and convinced me that there was an audience for this
post, two things that happened very close together in time:</p>
<ol>
<li>
<p>I learned that huge numbers of people were excited to hear about how
<a href="https://www.cgpgrey.com/">somebody</a> had optimized their arrangement
of iPhone app icons on the <a href="https://www.relay.fm/cortex">Cortex</a>
podcast. This was a completely standard iPhone, running unmodified,
not-even-jailbroken iOS – perhaps the least customizable, least
interesting consumer operating system out there. If huge numbers of
people were interested in how icons are arranged on iOS, and how that
can be optimized for productivity and to match someone’s brain, people
will definitely be interested in how I use my computers, which do not
even use a normal user interface for Linux and are extremely customized
to how I think.</p>
</li>
<li>
<p>Several friends of mine in rapid succession thought that my
computer interface was worthy of comment to me or to others as
a way of characterizing me. One friend even said, when I showed her
how a few <code>vim</code> commands worked, that she understood why I used this
for my organization files.</p>
</li>
</ol>
<p>So I’ll start by taking a screenshot of how my desktop looks right now,
literally as I write this, to use as a conversational starting point:</p>
<p><img src="https://www.thecodedmessage.com/desktop-screenshot.png" alt="Desktop Screenshot"></p>
<p>I know I’ve shown some screenshots in my <a href="https://www.thecodedmessage.com/posts/my-organization-system/">last
post</a>, but this time we’re going
to discuss it in some more detail.</p>
<p>It looks very … computer-y. Very low-level. Very much as if I’m
doing programming, even though I’m actually doing blogging.</p>
<p>It’s not just the presence of the terminal, either, though just using
a command line is considered to be advanced or even programmer-level
computer usage these days. It’s the whole aesthetic. There are no <a href="https://en.wikipedia.org/wiki/Window_%28computing%29#Window_decoration">window
decorations</a>
on either the left side of the screen where I’m editing my post, or
the right side of the screen where I’m having a command line session –
that is to say, no title bar, no minimize-maximize-close icons, no menu
bars. Clearly, if I want to save the file I’m working on, I can’t go up
to the menu and click <code>File -> Save</code>. And, actually, there don’t seem
to be any places designed for clicking at all.</p>
<p>Along the top, instead of a start menu or a system menu or a dock of
application launchers, I have a bunch of status information, formatted in
such a way that you have to know what you’re reading to understand it:
one number highlighted out of several; the word <code>Tall</code>; <code>jim@palatinate: ~/Writing/TheCodedMessage/conte...</code>, which is the same text as my prompt
in the terminal, and indicates who I am, what computer I’m logged into,
and what directory I’m currently in (in the currently highlighted
terminal). Then, what WiFi I’m connected to, my CPU percentage, memory
usage, date and time, and battery status.</p>
<p>There’s not a single icon among these status indicators – it’s just a
long line of text. Text, that goes well with the text of the blog post
I’m editing and the text of the command line. I can see why sometimes
friends refer to my computer interface as “not logged into a graphical
environment” or “in text mode” or “in command line only mode” – even
though that is actually a thing, and is literally not the situation my
computer is in.</p>
<p>It’s a modern graphical login session! Here’s me using a web browser if
you don’t believe me (Chromium is off-brand Chrome, made from the
same source code):</p>
<p><img src="https://www.thecodedmessage.com/web.png" alt="Web Browser Screenshot"></p>
<p>And of course, I can also view videos with <a href="https://www.videolan.org/">VLC</a>
or look at pictures with <a href="https://help.gnome.org/users/eog/stable/">Eye of
GNOME</a> (yes, I can use GNOME
components even though I don’t use the GNOME desktop environment), and
in literal text mode, that wouldn’t be possible.</p>
<p>But I understand why people call my set-up text mode, and now that I’m
paying attention, I see that in a very literal sense, there aren’t any
images or icons at all on my screen right now, just text in various
colors. That is an intentional choice, and how I like it, and it
does have to do with me being a programmer (in at least being aware
of my options and capable of configuring it), so fair.</p>
<p>So what is going on? Why does my computer look so text-y, even if
it’s not technically text-mode?</p>
<h1 id="xmonad">xmonad</h1>
<p>To be clear, my set-up is not typical of how Linux computers
normally look. On Linux, you get your choice of desktop
interface, of what software draws things like window borders
and docks and start menus. Usually, people use ones like
<a href="https://www.gnome.org/">GNOME</a> or <a href="https://kde.org/">KDE</a> (or
dozens of others), which look much more like macOS or Windows,
with a normal amount of icons, and sometimes even futuristic,
overly dynamic graphics. Here’s a screenshot of KDE from <a href="https://commons.wikimedia.org/wiki/File:Screenshot_of_KDE_4.3.png">Wikimedia
Commons</a>
to demonstrate:</p>
<p><img src="https://www.thecodedmessage.com/kde.png" alt="KDE"></p>
<p>But I instead chose <a href="https://xmonad.org/">xmonad</a>, which is designed
for things like minimalism, deep configurability, and keyboard control
– and in general designed almost exactly for my priorities. My XMonad
set-up is not that weird, for an XMonad set-up. Like any XMonad set-up,
however, it is deeply customized to my particular workflow.</p>
<p>But before we get into my customizations and use of it, I’d like to talk
a bit about why I prefer XMonad to other, more traditional desktop
environments. It’s not to be weird or to show off my technical skills or
even to communicate that I’m a programmer and a nerd – I actually don’t
very much like that people think I’m programming when I’m actually working
on a writing project, nor do I like that other people find borrowing my
laptop intimidating. Instead, it’s about adapting to what I feel comfortable
with, and what works well with how my brain works.</p>
<p>So the lack of distractions, the lack of icons, is actually very important
to helping me focus, as is the simplicity of the interface. My ADHD
doesn’t manifest by having my eyes be regularly pulled away to where
the icons are because they’re pretty – or at least, if it does I’m not
aware of it. But if there is a dock of icons on the screen, my awareness
that the dock is there can be a distraction to me, taking up precious
space in my brain of very limited short-term memory that could be better
served juggling the other things going on in the computer. This distraction
even happens on macOS, even when the dock is hidden – I have to be aware
of it so I know <em>not</em> to move my mouse to the bottom of the screen, or that
if I do, I will suddenly see icons.</p>
<p>The title bars that typically line the top of windows are such a
distraction, as are the menu bars (with <code>File</code>, <code>Edit</code>, etc.) that
give you a list of things to do. If I were designing an operating
system UI from scratch – which I have often fantasized about –
the menu would show up as an overlay on top of the window when you
pressed the <code>[ALT]</code> key, and a list of available keyboard shortcuts
would show up when you pressed and released <code>[CTRL]</code>, reminding
you that paste, for example, is <code>Ctrl-V</code>.</p>
<p>Back in real life, I also don’t have menu bars on my machine for my most
commonly used apps. But the replacement, unfortunately, isn’t an overlay,
but simply knowing the relevant commands for both <code>gvim</code> and the terminal,
both literal commands, and keyboard and mouse gestures, like
<code>Ctrl-D</code> to log out or middle-click to paste the last thing you
highlighted – because I find <code>Ctrl-C</code>/<code>Ctrl-V</code> too tedious and
prefer copy-and-paste through the “secondary clipboard” Linux
supports: highlight and middle mouse click, or three fingers on
my laptop trackpad.</p>
<p>The streamlined simplicity allows me to just see the text of the actual
app I’m using. It reminds me of math textbooks. I prefer math textbooks
that are just about math. I saw a math textbook for the high school level
once that was full of pictures of youths doing math, very visually
busy, lots of stuff going on. I thought to myself, I don’t know how
long I could read this book, not because I would jump from thing to
thing, but because I would try to extract the actual math out of it,
and filtering out the rest would be well-nigh impossible, and quite
fatiguing.</p>
<p>Thus, xmonad lets me choose exactly what goes on the screen. Even
<code>xmobar</code>, the system status bar across the top with all that status
information is optional – you can make it so that it appears and
disappears based on a keyboard shortcut, or leave it out altogether. And
certainly, no panel of icons – if I want to start a program, I have a
keyboard combination to start the terminal, another to start the browser,
and another to type in the name of a program I want to run (which I
could also do, of course, from the terminal). The iOS equivalent would
be to have one icon for Safari, and besides that to literally always use
search to find your app, with no icons visible and an empty home screen.</p>
<p>One thing that I like about iOS, however, is also true of xmonad:
when you start a program it takes up the entire screen. For the life
of me, I don’t understand what I ever saw in having different windows
that could overlap on your desktop. What were you doing with the empty
space? Why was it so essential to be able to arrange the screen any way
with enough work? Isn’t it more important to be able to have the screen
in the configuration you want consistently?</p>
<p>In macOS, if I want a window to be full screen, that’s easy enough – but
it’s still not the default, even if it’s the only window. However, if I
want multiple windows to be tiled, then I have to do so many steps. The
cost of the flexibility of freely moving window arrangements around is
that the one I do want is harder.</p>
<p>In xmonad, when I open a window, it takes up the whole screen. If I open
a second window, they split the screen. I can use key combinations to
adjust which one is on which side, or to switch from left-right tiling to
top-bottom tiling, or to move the dividing bar left or right, but most
of the time I can just immediately use it.</p>
<p>I can also use a key combination (⌘-TAB – it would be <code>ALT</code>, but I
have ⌘ generally configured to replace <code>ALT</code>) to switch which window is
focused, but I usually use the mouse for that. I have focus-follows-mouse
enabled, so I don’t actually have to click the mouse before I can start
typing in the newly-focused window.</p>
<p>If I open a third window, then, it works perfectly how I like it:
arranged so I can see all three:</p>
<p><img src="https://www.thecodedmessage.com/three-windows.png" alt="Three Windows"></p>
<p>More than three windows is similar to three – but I don’t let that
happen normally. I stick to three windows per screen, or specifically,
per virtual desktop.</p>
<h1 id="virtual-desktops">Virtual Desktops</h1>
<p>Virtual desktops are a key component of how I use my computer. macOS
has the feature as well, described by the less techie-sounding name of
<a href="https://support.apple.com/guide/mac-help/work-in-multiple-spaces-mh14112/mac">spaces</a>
(and it appears that in that context, it’s also pretty easy to set up
split-screening, which is good news). Virtual desktops are like
having multiple full-screen windows that you switch between, except
that each virtual desktop can have multiple windows on it. In my
context, it means I never have more than three windows on a screen
at a time, but I have multiple sets of three windows that go together
that I can switch between, in my case indexed by number.</p>
<p>If I want to go to virtual desktop 1, I press ⌘-1 (where ⌘ is the
command or logo key, a Windows™ logo on my keyboard even though I
bought this computer from Dell™ with Linux™ pre-installed). To go
to virtual desktop 3, I press ⌘-3. The currently available virtual
desktops are shown on my status bar, with the currently showing one
highlighted in yellow – if they weren’t, I would probably have forgotten
about windows left in other virtual desktops when I first started using
them. In the screenshot above of three windows, you can see that I am
working in desktop 4. There are also windows on desktops 1, 2, and 5,
but none on desktop 3, which is why there is no 3 shown. They go up to
9, or at least 9 that are accessible by that keyboard short-cut in my
current configuration.</p>
<p>If I want to move a window from one virtual desktop to another, I just
need to type <code>⌘-Shift-N</code> while hovering my mouse over the window, where <code>N</code>
is the desktop I want to move it to. Sometimes, the windows come out
in the wrong arrangement on the new desktop, but I can use <code>⌘-Enter</code>
to switch them.</p>
<p>Virtual desktops are key to my workflow and my focus, because each one
corresponds to a mode of using my computer, a type of action. I can
switch between them, but while I’m within one, the only indication that
others are available is up in the status bar.</p>
<p>I use specific virtual desktops for specific tasks on a permanent
basis. When I have not recently been doing the task, there might
be no windows in them, but when I want to do that task, I switch to
that virtual desktop and start windows there. This keeps information
about what is where in my long-term memory, as a fact about how my
system works, rather than in my prospective memory, which <a href="https://www.thecodedmessage.com/posts/write-everything-down">as I’ve
discussed</a> is far more problematic.</p>
<p>To be specific, this is what I use each virtual desktop for:</p>
<h2 id="desktop-1-browsing">Desktop 1: Browsing</h2>
<p>Desktop 1 is a full-screen browser session. If you look at my <a href="https://www.thecodedmessage.com/web.png">web
screenshot</a> (also displayed above), you see that I am on desktop 1.</p>
<p>This is the only place I put a web browser window; you don’t need more
than one because of tabs. I will occasionally also move a terminal or
editor window to this desktop, if I need to type something into the
terminal directly from a web browser, or manually retype text based
off of what I’m reading there, but this is rare. Similarly, I will
occasionally split-screen two web browser windows for the same reason –
but only for as long as I need to see both pages at once.</p>
<p>I don’t use tabs as heavily as some people. I don’t relate to the ADHD
person with hundreds of tabs open. I generally have Slack, e-mail,
and then whatever exact thing I’m using the web browser for. If this
is programming, and I’m reading documentation or troubleshooting an
issue, that might be multiple tabs deep (e.g. of different but related
documentation, or of documentation and source). And occasionally I’ll
absent-mindedly find myself going on a tangent. But besides Slack,
and sometimes e-mail, I close the relevant tabs as soon as I’m done
doing the task – tabs are transient.</p>
<p>When I do read documentation from the web to write code, I do fully switch
virtual desktops between writing the code and reading the documentation.</p>
<p>I don’t like that this is how I access my e-mail. I would prefer to
have it set up with a TUI-based system, while still syncing with the
GMail app on my phone. I know I can do that – I’ve done it before –
but I simply haven’t gotten around to it.</p>
<p>One final note: To help me maintain focus, I do have a blacklist of
websites I don’t let myself go to, implemented through <code>/etc/hosts</code>.
This doesn’t actually restrict me, because I can always go to those
websites on my “unproductive” computer (mostly for Netflix), or on my
phone. It does, however, prevent me from going off the rails and drifting
into a Reddit rabbit-hole when I’m supposed to be working. I can always
unblock a website if I (temporarily or permanently) do need to access
it from one of my primary computers.</p>
<p>Here’s the blacklist: all the domain names that my computer resolves
to <code>localhost</code>, my local computer, rather than to the actual
IP address of the website’s server. These are all the websites the browser will
therefore fail to connect to:</p>
<pre tabindex="0"><code>127.0.0.1 facebook.com
127.0.0.1 www.facebook.com
127.0.0.1 quora.com
127.0.0.1 www.quora.com
127.0.0.1 twitter.com
127.0.0.1 www.twitter.com
127.0.0.1 news.google.com
127.0.0.1 etrade.com
127.0.0.1 us.etrade.com
127.0.0.1 www.etrade.com
127.0.0.1 reddit.com
127.0.0.1 www.reddit.com
127.0.0.1 news.ycombinator.com
</code></pre><h2 id="desktop-2-coding-primary">Desktop 2: Coding (Primary)</h2>
<p>This is where I look at and edit files in the repo and project that I’m
currently working on. I have a terminal open to the project directory,
and normally two <code>gvim</code> windows – <code>gvim</code> is my preferred text editor –
open to files within that project. The large full-height space is for the
file I’m editing, the smaller space above the terminal for a file I’m
referring to, but within the same project. If I want to edit the other
file instead, I switch them so that <code>gvim</code> window is the new tall one –
there’s a keyboard shortcut for that. The terminal stays on the right,
and in the case of multiple windows on the right, the terminal stays as
the lowest.</p>
<p>I continuously open and close new <code>gvim</code> windows, which is part of why
I use <code>gvim</code> – it loads fast enough for this to be a viable strategy.</p>
<h2 id="desktop-3-coding-secondary">Desktop 3: Coding (Secondary)</h2>
<p>Sometimes, when you’re working on a project, you need to know how
something’s done in a different project. Perhaps you need to know
an implementation detail of a function you’re calling, or maybe just
the interface. Perhaps you know the other project did the thing
you’re trying to do, and you need to see how they did it. Perhaps
you suddenly realized you can’t have X dependency, and now you need
to know if Y depends on X.</p>
<p>Sometimes this is a different internal project, sometimes it’s an
open source project you need to download off GitHub. But it’s
a different repo, with a working copy in a different directory,
and that means that I have a different virtual desktop for it,
with a different terminal in that directory.</p>
<p>There, I mostly do reading, but I can also do editing in a pinch.
For example, if I need to make a change that straddles two repos, the
application (for example) will often be in desktop 2 and the library in
desktop 3. If it straddles 3 repos, I either switch which repo desktop 3
is used for (it is only used for one at a time) or I spill over to
desktop 4 as a tertiary coding repo, as a non-standard use of that
desktop. I usually feel vaguely uncomfortable when I do that, though.</p>
<h2 id="desktop-4-blogging">Desktop 4: Blogging</h2>
<p>I’m on desktop 4 right now as I’m writing this, because that is the
desktop I use for blogging – and most other forms of prose writing
(though specifically not documentation for work, which counts normally
as part of a coding project).</p>
<p>I blog just as I program. I use <code>gvim</code> to edit text files. I use a
terminal to open the right text files, list which text files are present,
keep <code>git</code> up-to-date with what I’m working on, and build and deploy my
blog. In this case, by “build,” I mean translate it from a directory
full of <a href="https://en.wikipedia.org/wiki/Markdown">Markdown</a> files
into a website, which I then upload to my server.</p>
<p>Here’s a screenshot of me editing the markdown for this post in the
left window, and trying and failing to run my build-and-upload script
in the other folder (which refused to upload as I hadn’t synchronized
my files with GitHub yet):</p>
<p><img src="https://www.thecodedmessage.com/markdown.png" alt="Markdown and Upload"></p>
<p>I prefer editing my blog as a bunch of plain text files on my computer.
It gives me a sense of control that I would not get if I installed
WordPress on my server – or used the official WordPress. It allows
me to use <code>gvim</code> to edit them as plain text, which I prefer to
<a href="https://en.wikipedia.org/wiki/WYSIWYG">WYSIWYG</a> editing.</p>
<p>Generally, I’m only working on one file and so I have a terminal window
and single solitary <code>gvim</code> window, rather than two or three <code>gvim</code>
windows. It only makes sense to work on one file at a time in writing
normally, unlike in programming where there are intricate mutual references.
Occasionally, for a blog series like this, I will open a previous
part of the blog series to see how much I’m repeating myself.</p>
<h2 id="desktop-5-organization">Desktop 5: Organization</h2>
<p>You might notice, however, that in none of the other desktops do I
describe having any of my organizational files open. I have detailed
organizational files, which I edit in <code>gvim</code>, and discussed in detail
in the <a href="https://www.thecodedmessage.com/posts/my-organizational-system/">previous post</a>. And as
you can see in the sample screenshot from that post (reposted here),
this organizational system lives entirely on desktop 5:</p>
<p><img src="https://www.thecodedmessage.com/org-screenshot.png" alt="Org System"></p>
<p>I do not have the complete list of things I have to do hanging over me
while I’m doing each thing, only when I’m planning. Instead, when I
reach the end of whatever I’m working on – or, as often happens, when
I find I’ve generated a new TODO item that I want to write down but not
yet fully switch my focus to – then I switch to virtual desktop 5 to
interact with my TODO system. When I switch back, with a new task or
with the idea safely written down, I can then (more) fully focus on my
task without worrying about other ones.</p>
<h2 id="desktop-6-signal">Desktop 6: Signal</h2>
<p>When I run the desktop version of Signal, which I do sometimes, it runs
on its own virtual desktop, namely desktop 6.</p>
<h2 id="desktop-7-long-running-processes">Desktop 7: Long-running processes</h2>
<p>This is where I put VPN sessions, if they’re tied to a terminal window.
It’s also where I put some very long-running builds or locally hosted
servers.</p>
<h1 id="editor-and-terminal">Editor and Terminal</h1>
<p>I’ve already discussed in a previous section how I use the web browser.
Occasionally, I use a variety of other random graphical programs: an image
viewer, a PDF viewer, or a <a href="https://www.videolan.org/">video player</a>.
But most often, the two types of windows I have open besides the web
browser are <code>gvim</code>, a text editor, and <code>alacritty</code>, a terminal emulator.</p>
<p>Both of these tools are primarily used by computer professionals of
some stripe, so it’s a little unfair of me to bristle when people
see them – also without any icons on the screen – and assume I
am a programmer. I do have specific reasons for using them for
non-programming tasks, that match my habits well, so I’d like
to discuss them further.</p>
<p>Both of them are tools that require substantial investment
in skill. Obviously, to use a terminal, you have to know commands.
You can’t discover the interface like you can with a series of menus,
or settings pages, or icons. Similarly but less obviously, <code>gvim</code>, like
any version of <code>vim</code>, is close to useless to anyone who doesn’t know it.
Both of them require reading documentation in the form of a book (or
website) to explain to you what to do and at least get you started.</p>
<p>But I did all of that investment years ago, as a youth, and it’s
been paying off ever since – to the point where if I try to
edit text, or navigate file systems, without these tools, I feel
substantially hindered.</p>
<h2 id="vim">Vim</h2>
<p>I start with <code>gvim</code> because it’s the more relevant to my organizational
particularities. It’s a text editor, which means that unlike something
like Google Docs or Microsoft Word, it edits plain text files, files
that just have sequences of characters organized into a sequence of
lines. Characters can include Unicode – including accented letters,
Chinese characters and emojis – but not styling like <strong>bold</strong> and
<em>italics</em>.</p>
<p>Text editors are important to programmers because programming is done
via collections of plain text files, and so text editors are universally
useful tools for handling all of them. Rather than each programming
language having its own special file format requiring its own special
editor, text files allow programmers to bring their preferred text editors
with them to a variety of projects, thus allowing a deeper investment
in the skill of using the text editor.</p>
<p>Even this blog, which is not a programming project but a writing
project, is maintained using text files, using Markdown, a format which
interprets <code>*italics*</code> as <em>italics</em> and <code>**bold**</code> as <strong>bold</strong>, and Hugo,
a software package that converts a hierarchy of Markdown-formatted plain
text files appropriately into a website. And for Markdown, just as for
any programming language, I can choose any text editor I want to, and
it will be compatible.</p>
<p>This choice, the choice of text editor, can be greatly personal to a
programmer. The rivalry between two major text editors from earlier
eras of Unix, <code>vi</code> and <code>emacs</code>, was often referred to as a <a href="https://en.wikipedia.org/wiki/Editor_war">holy
war</a> for how intense the fights
about it would get on <a href="https://en.wikipedia.org/wiki/Usenet">Usenet</a>
(an old discussion forum that ran on an <a href="https://en.wikipedia.org/wiki/UUCP">old pre-Internet
network</a>). <code>gvim</code>, which is the
text editor I use, is a form of <code>vim</code>, which is a form of <code>vi</code>, so
I have a definite position in that holy war. And I’m sure I’m going
to hear from people who disagree with my position in response to this
blog post!</p>
<p>While my <code>gvim</code> window looks like a terminal window – and <code>vim</code> can indeed run
inside of a terminal – it’s actually a separate graphical application.
That is what the initial <code>g</code> stands for, “graphical.” When I edit a
file, I want a new window to be opened, and I also want to be able to use
the mouse to click on a location on the screen and move my cursor there.</p>
<p><code>vim</code>, like many of the tools I use, is optimized for expert use, rather
than discoverability by beginners. It’s designed to be a skill to be
invested in: I put in the effort to learn how to use it a long time ago,
and it pays off over a lifetime. The commands I can make from my keyboard
are more powerful than most computer text editing facilities can support,
allowing me to perform complex manipulations of the text with a few
keystrokes.</p>
<p>This is essential, in my mind, for efficient programming, which is
why I put the effort in to learn it. However, it is also particularly
well-suited to my organizational files, which, if you remember from my
<a href="https://www.thecodedmessage.com/posts/my-organizational-system/">previous post</a>, consist of plain
text files with lots of highly-nested bulleted lists, like this
outline for this section of the post:</p>
<pre tabindex="0"><code>* gvim
    * Text editor
        * Plain text and website generation
    * vim
        * But not terminal vim
            * Still has separate "window"
            * And can use mouse if necessary
    * Line-based editing good for organization
        * Commands work on lines
            * Delete
            * Paste last delete
            * Select multiple
            * Shift indentation level
        * Org-mode style use of hierarchical bullet points
            * Perfect match for those commands
            * No notes longer than a line
                * Make it more hierarchical instead
</code></pre><p>When I edit plain text files in this format – a custom habit inspired by
<a href="https://orgmode.org/">Org mode</a> but still compatible with Markdown –
it’s important for me to be able to operate on the scale of entire lines.
And operating on entire lines is one of <code>vim</code>’s strongest points!
<code>dd</code> to remove a line, <code>p</code> to insert the line back in, and relevantly
for hierarchical bullet points, <code><<</code> and <code>>></code> to change indentation!
Using <code>V</code>, I can select multiple lines, and then use <code><</code>, <code>></code>, or <code>d</code>
to change indentation or move them! Meanwhile, <code>j</code> and <code>k</code>, right on the
home row, move down and up through the file, line by line, respectively.</p>
<p>This equates to removing tasks (when they’re done or no longer wanted),
moving tasks between different places in the hierarchy (which I do shockingly
often), removing or adding levels of hierarchy, and other such common
operations on a hierarchical list.</p>
<p>Now, you may wonder how, if typing <code>dd</code> deletes a line, I type a literal
<code>dd</code>. Well, <code>dd</code> deletes a line in <em>normal</em> mode, but if you type <code>o</code>, it
<strong>o</strong>pens up a new line in <em>insert</em> mode, so that your letters are interpreted
as letters again – until you are done inserting what you had to insert,
and hit <code>[ESC]</code> to return to normal mode.</p>
<p>One of the ways you can tell you’re a proficient <code>vim</code> user is if you keep
the system in normal mode any time you are not literally typing. Typing
tends to be bursty anyway, and evenly interspersed with editing and
navigating – at least in programming, and in my use case, also with
writing.</p>
<p>But it is hard for a newbie. Every once in a while, even I find myself
inserting an editing command as text by accident, or running random
commands trying to type text while I’m actually in normal mode. When
you’re new to <code>vim</code> this happens all the time. It’s decidedly not
beginner-friendly.</p>
<p>But most of your time at a text editor – especially if you’re a programmer
– you won’t be a beginner. And for me, I’m extremely used to it – and
frustrated when I have to write text into a non-vim interface like
Google Docs, or an especially long Slack message. Besides, I revise existing
text at least as often as I type new text –
I need those commands, and the ones I listed are only a brief sample.</p>
<h2 id="terminalcommand-line">Terminal/Command Line</h2>
<p>This is probably the most interesting thing to many of my readers.
Many readers my age or older remember DOS and the DOS prompt, and having
to use the computer from the command line. For some of them, the only
commands they knew were those to launch their games, or to launch other
tools from which they would do their real work – the command line was
fundamentally just a launcher, a menu, albeit one that didn’t list the
options. Others may have simply used it to launch Microsoft Windows,
by typing the <code>win</code> command, a usage pattern so common that Microsoft
made it the premise of Windows 95, and skipped the whole “DOS” step,
even though it was still present as a weird operating system layer and
as a boot stage until Windows XP finally rolled out a modern Windows.</p>
<p>So I have some misconceptions to address about the command line that
come from that perspective.</p>
<p>First, a modern command line is not DOS in a window. It’s certainly not on
Linux or macOS, where it’s more visibly different, but it isn’t even
DOS in modern Windows. The Windows command line might look like the DOS
command line, with its famous prompt <code>C:\></code>, but it is a modern Windows
application that is used to launch modern Windows applications. No DOS
involved, just a different interface mode.</p>
<p>On a related note, the command line, even on Windows but especially
on macOS or Linux, is a modern user interface. It can do things that
involve the Internet. It can make web requests, download and send e-mail,
synchronize files, and do things that DOS couldn’t do.</p>
<p>However, on the flip side, it is not true that the command line
can do everything a graphical user interface can do. It’s comparable,
but it’s simply not identical, as should be obvious if you realize that
it’s impossible to watch a video from the command line. You can use
the command line to <em>launch</em> a video player, but the video player
remains graphical.</p>
<p>And while it is true that the command line allows you more control over
the operating system settings and file system, this is more an accident of
graphical user interfaces trying to be “user friendly” or having limited
room for options, rather than anything intrinsic. You may have heard
of graphical user interfaces described as a layer or façade on top of
the “underlying” command line, but that is a misconception. Graphical
programs and command line programs have the same access to operating
system facilities, except for user interface.</p>
<p>The command line does, however, have a more power user-friendly
aesthetic. Like <code>vim</code>, it requires investment to use effectively –
to use at all. And it is closer to the operating system in that by
convention, it exposes as much control of it as possible, and its
conventions were established in the 70s, before the modern concept of
user-friendliness was really invented. This has been written about at
length in many places, and one of my favorite (book-length) essays
about it is Neal Stephenson’s <a href="http://project.cyberpunk.ru/lib/in_the_beginning_was_the_command_line/">“In the Beginning was the Command
Line”</a>.</p>
<p>Enough about what the command line <em>is</em> (and isn’t)! What do I actually
use the command line <em>for</em>, then?</p>
<p>Well, the command line is an entire interface into the computer, used by
many programs and utilities as the way to interact with them. And I do
use it for basically all of the things I do on the computer that aren’t
web browsing, text editing, or viewing various graphical-only files
(like PDFs, images, or videos), and there’s some variety there.</p>
<p>Primarily, I use the command line for file management. I use the classic Unix
tools for listing files (<code>ls</code> for <strong>l</strong>i<strong>s</strong>t) and navigating directory
hierarchies (<code>cd</code> for <strong>c</strong>hange <strong>d</strong>irectory). I use <code>git</code> to sync code
and writing across computers and make sure it’s backed up somewhere.
I use <code>wc</code> (for <strong>w</strong>ord <strong>c</strong>ount) to see how many lines of code or
words of writing I’ve written. I use <code>bc</code> (<strong>b</strong>asic <strong>c</strong>alculator)
to do back-of-the-envelope math.</p>
<p>I prefer this to graphical file managers. Not only do I not trust them –
I’ve seen Finder crash relatively recently – they change all the time.
And the changes are not good, and usually serve to hide the actual
directory hierarchy and instead impose an organizational system on
you. Instead of seeing directories inside your home directory, you see
stuff like “Music” and “Downloads,” “Documents” and “Movies.”</p>
<p>Usually, when I use a graphical file manager, I know where the directory
is in the file system, but then I have to translate it to their list of
commonly used directories, which assumes I keep loads of movies and photos
on my computer, but can have all my “documents,” whether legal documents
or writing projects, in one directory? Where is my home directory? What
if I want to organize my files in a different hierarchy? Can I just navigate
to it from my home directory, please? If you want to put fancy icons on
subdirectories of my home directory based on their names, that’s fine, but
please list <em>all</em> the directories within my home directory, thank you
very much! Not just the pre-defined things you think I ought to have, like
“Music” – this is my work computer, I listen to music on my phone.</p>
<p>So you can see why I prefer the straightforwardness of the command line
to Finder or Windows Explorer.</p>
<p>I also use the command line to actually do writing and programming work,
not just launching <code>gvim</code> – once I’ve navigated to the file in my complicated
directory system – but also running compilers and build scripts to
turn program source code into programs, and then running those programs,
almost all of which can be controlled entirely from the command line. I
log into other computers I maintain, both embedded devices and servers,
and do work on them. I run scripts that run <code>hugo</code> to turn my Markdown
files into a website and post it on a server.</p>
<p>I also use it for system administration: <code>apt</code> for installing packages (I
use Ubuntu – I’m not trying to be a hero of sysadmin) and <code>systemctl</code>
and all of those gnarly commands for other sysadmin stuff. But of course,
the most powerful system administration command is just the text editor –
by editing configuration files, you can accomplish a lot.</p>
<p>All of this is easier and more focused than if I were using the graphical
equivalent. I write my command and I run it, without having to go through
all the tedious boring steps of a GUI wizard. It’s faster with fewer steps,
with the penalty of accumulated life expertise – which is to say it’s
easy on my <a href="https://www.thecodedmessage.com/posts/write-everything-down/">prospective memory at the expense of my retrospective
memory</a>, which is to say, aligned to how
my brain works.</p>
<p>And yes, I do occasionally have to look up how to do things – though
that’s more in programming than in writing. But having a graphical
user interface doesn’t save you from that, and if you think it does,
you’re fooling yourself. At least when I look up how to do things, I get
suggestions for commands I can directly type in, rather than having to
go through 10 screens and dialog boxes and search them for whatever it
is the poster’s talking about, only to find out I’m using a different
version of the GUI, and that the directions became obsolete in the 2022
edition of Windows 10, or some other such thing.</p>
<p>To reiterate, it turns out that a deep enough hierarchy of dialog
boxes and settings pages is just as complicated as the command line –
but usually less powerful, harder to document, and more subject to
arbitrary change. Just give me the command line!</p>
<h1 id="conclusion">Conclusion</h1>
<p>If I were to summarize some themes of my user interface decisions,
it would be in these three inter-related points:</p>
<ol>
<li>Don’t use condescending, corporatist concepts of “easy to use,” because
they’re more focused on the <em>appearance</em> of ease of use, or most charitably
stated, not intimidating the user, rather than actually making it usable
for an expert user for a wide variety of actual tasks.</li>
<li>Use systems that emphasize the long term power user over the short term
newbie. They will often have a learning curve, but it will pay off.</li>
<li>Use systems that are customizable, so that I can use them my way.</li>
</ol>
<p>But this is <em>all</em> for my work computer, where work is both writing/blogging
and programming. For goofing off, I have a MacBook Air M1, which I use
in macOS as a glorified tablet, and that is perfectly fine for watching
<a href="https://www.thecodedmessage.com/posts/netflix-tech/">Netflix</a> and YouTube.</p>
Rust Is Beyond Object-Oriented, Part 2: Polymorphism https://www.thecodedmessage.com/posts/oop-2-polymorphism/ 2023-02-07T00:00:00+00:00
<p>In this post, I continue <a href="https://www.thecodedmessage.com/tags/beyond-oop/">my series</a> on how Rust
differs from the traditional object-oriented programming paradigm by
discussing the second of the three traditional pillars of OOP:
polymorphism.</p>
<p>Polymorphism is an especially big topic in object-oriented programming,
perhaps the most important of its three pillars. Several books could be
(and have been) written on what polymorphism is, how various programming
languages have implemented it (both within the OOP world and outside of it
– yes, polymorphism exists outside of OOP), how to use it effectively,
and when not to use it. Books could be written on how to use the Rust
version of it alone.</p>
<p>Unfortunately this is just a blog post, so I cannot cover polymorphism in
as much detail or variety as I want to. I shall instead focus specifically
on how Rust differs from the OOP conceptualization. I will start by
describing how it works in OOP, and then discuss how to accomplish the
same goals in Rust.</p>
<hr>
<p>In OOP, polymorphism is everything. It tries to take all decision-making
(or as much decision-making as possible) and unite it in a common
narrow mechanism: run-time polymorphism. But unfortunately, it’s not
just any run-time polymorphism, but a specific, narrow form of run-time
polymorphism, constrained by OOP philosophy and by details of how the
implementations typically work:</p>
<ul>
<li><strong>It requires indirection:</strong> Every object must typically be stored on
the heap for run-time polymorphism to work, as the different “run-time
types” have different sizes. This encourages the aliasing of mutable
objects. Not only that, but to actually call a method, it must go
through three layers of indirection: dereferencing the object reference,
then dereferencing the class pointer or “vtable” pointer, and then
doing an indirect function call.</li>
<li><strong>It precludes optimization:</strong> Beyond the intrinsic cost of an
indirect function call, the fact that the call is indirect means
that inlining is impossible. Often, the polymorphic methods are
small or even trivial, such as returning a constant, setting a field,
or re-arranging the parameters and calling another method, so inlining
would be useful. Inlining is also important to allow optimizations to
cross the inlining boundary.</li>
<li><strong>It is polymorphic in one parameter only:</strong> The special receiver
parameter, called <code>self</code> or <code>this</code>, is the only parameter through which
run-time polymorphism is typically possible. Polymorphism on other
parameters can be simulated with helper methods in those types, which
is awkward, and return-type polymorphism is impossible.</li>
<li><strong>Each value is independently polymorphic:</strong> In run-time polymorphism,
there is often no way to say that all the elements of a collection are of
some type <code>T</code> that all implement the same interface, or to say that
two parameters to a function are the same type but what that type is
should be determined at run-time.</li>
<li><strong>It is entangled with other OOP features:</strong> In C++, runtime
polymorphism is tightly coupled with inheritance. In many OOP
programming languages, it is only available for class types, which
as I discussed in my <a href="https://www.thecodedmessage.com/posts/oop-1-encapsulation/">previous post</a>
are a constrained form of modules.</li>
</ul>
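<p>As a point of contrast for the third and fourth constraints, here is a
minimal sketch of what compile-time generics in Rust allow. The function names
<code>parse_or_default</code> and <code>larger</code> are hypothetical, made up
for illustration; only <code>FromStr</code>, <code>Default</code>, and
<code>PartialOrd</code> come from the standard library.</p>

```rust
use std::str::FromStr;

/// Return-type polymorphism: the type the caller expects selects which
/// `FromStr` implementation runs, even though the argument is always `&str`.
/// (Hypothetical helper, for illustration only.)
fn parse_or_default<T: FromStr + Default>(text: &str) -> T {
    text.trim().parse().unwrap_or_default()
}

/// Both parameters must have the *same* type `T`. The compiler rejects a
/// call that mixes, say, an `i32` with a `String`.
/// (Hypothetical helper, for illustration only.)
fn larger<T: PartialOrd>(a: T, b: T) -> T {
    if a >= b { a } else { b }
}
```

<p>Writing <code>let n: i32 = parse_or_default("42");</code> and
<code>let x: f64 = parse_or_default("2.5");</code> calls the same function
with the same kind of argument, and only the requested return type differs.
Note that this dispatch is resolved at compile time, not at run time.</p>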
<p>I could write an entire blog post about each of these constraints –
perhaps I will someday.</p>
<p>But in spite of all these constraints, it is seen as the preferred way of
doing decision-making in OOP languages, and as especially intuitive
and accessible. Programmers are trained to reach for this tool whenever
feasible, whether or not it is the best tool for the decision at hand,
even if there is no current need for it to be a run-time decision.
Some programming languages, such as Smalltalk, even collapsed “if-then”
logic and loops into this one oddly specific decision-making structure,
implementing them via polymorphic methods like <code>ifTrue:ifFalse:</code> that
would be implemented differently in the <code>True</code> and <code>False</code> classes
(and therefore on the <code>true</code> and <code>false</code> objects).</p>
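<p>To make the Smalltalk approach concrete, here is a rough transliteration
into Rust. This is only a sketch of the idea, not how Smalltalk is actually
implemented: the <code>Boolean</code> trait, the <code>True</code> and
<code>False</code> structs, and <code>describe_branch</code> are all
hypothetical names.</p>

```rust
/// A Smalltalk-flavored boolean: the conditional is a method on the object.
trait Boolean {
    fn if_true_if_false(
        &self,
        on_true: &dyn Fn() -> String,
        on_false: &dyn Fn() -> String,
    ) -> String;
}

struct True;
struct False;

impl Boolean for True {
    // The `True` object always runs the first block.
    fn if_true_if_false(
        &self,
        on_true: &dyn Fn() -> String,
        _on_false: &dyn Fn() -> String,
    ) -> String {
        on_true()
    }
}

impl Boolean for False {
    // The `False` object always runs the second block.
    fn if_true_if_false(
        &self,
        _on_true: &dyn Fn() -> String,
        on_false: &dyn Fn() -> String,
    ) -> String {
        on_false()
    }
}

/// Picks a branch purely by dynamic dispatch on `condition`; no `if` appears.
fn describe_branch(condition: &dyn Boolean) -> String {
    condition.if_true_if_false(
        &|| "took the true branch".to_string(),
        &|| "took the false branch".to_string(),
    )
}
```

<p>Every "conditional" here costs an indirect call through a vtable, which is
the kind of overhead the list of constraints describes.</p>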
<p>To be clear, having a mechanism of vtable-based runtime polymorphism
isn’t a bad thing <em>per se</em> – Rust even has one (similar, but not quite
identical, to the OOP version described above). But the Rust version
is used in the relatively rare situations where that mechanism is the
best fit, among a whole palette of mechanisms. In OOP, the elevation of
this tightly constrained and unperformant form of decision making above
all others, and the philosophical assertion that using it is the best
way and most intuitive way to express program flow and business logic,
is a problem.</p>
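<p>For readers who have not seen Rust’s version, a minimal sketch of
vtable-based dispatch through <code>dyn</code> trait objects might look like
this. The <code>Shape</code> trait and the two structs are hypothetical
examples, not taken from any real library.</p>

```rust
/// Hypothetical trait used only to illustrate `dyn` dispatch.
trait Shape {
    fn area(&self) -> f64;
}

struct Square {
    side: f64,
}

struct Rectangle {
    width: f64,
    height: f64,
}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

impl Shape for Rectangle {
    fn area(&self) -> f64 {
        self.width * self.height
    }
}

/// Each element may be a different concrete type; `area` is looked up
/// through the trait object's vtable at run time.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|shape| shape.area()).sum()
}
```

<p>The <code>dyn</code> keyword and the explicit <code>Box</code> make the
indirection visible at the use site, so it is something you opt into rather
than the default.</p>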
<p>It turns out that programming is much more ergonomic when you choose
the tool most appropriate for the situation at hand – and OOP run-time
polymorphism is only occasionally the right tool for the jobs it
is often asked to do.</p>
<p>So let’s look at 4 alternatives in Rust that can be used when OOP
uses run-time polymorphism.</p>
<h1 id="alternative-0-enum">Alternative #0: <code>enum</code></h1>
<p>Not only are there other forms of polymorphism that have strictly
fewer constraints (such as Haskell’s typeclasses) or a different set of
trade-offs (such as Rust’s traits, heavily based on Haskell typeclasses),
there is another decision-making system in Rust and Haskell, namely
algebraic data types (ADTs), or sum types, that also takes over many of
the applications of OOP-style polymorphism.</p>
<p>In Rust, these are known as <code>enum</code>s. <code>enum</code>s in many programming
languages are lists of constants to be stored in integer-sized types,
sometimes implemented in a typesafe fashion (like in Java), sometimes
not (like in C), sometimes with either option available (like in C++
with the distinction between <code>enum</code> and <code>enum class</code>).</p>
<p>Rust <code>enum</code>s support this familiar use case, with type-safety:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">pub</span> <span style="color:#66d9ef">enum</span> <span style="color:#a6e22e">Visibility</span> {
</span></span><span style="display:flex;"><span> Visible,
</span></span><span style="display:flex;"><span> Invisible,
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>But they also support additional fields associated with each option,
creating what in type theory is known as a “sum type,” but it is
better known among C or C++ programmers as a “tagged union” –
the difference being that in Rust, the compiler is aware of and
enforces the tag. Here are some examples of <code>enum</code> declarations:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">pub</span> <span style="color:#66d9ef">enum</span> <span style="color:#a6e22e">UserId</span> {
</span></span><span style="display:flex;"><span> Username(String),
</span></span><span style="display:flex;"><span> Anonymous(IpAddress),
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// ^^ This isn't supposed to be a real network type,
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#75715e">// just an example.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">let</span> user1 <span style="color:#f92672">=</span> UserId::Username(<span style="color:#e6db74">"foo"</span>.to_string());
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">let</span> user2 <span style="color:#f92672">=</span> UserId::Anonymous(parse_ip(<span style="color:#e6db74">"127.0.0.1"</span>)<span style="color:#f92672">?</span>);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">pub</span> <span style="color:#66d9ef">enum</span> <span style="color:#a6e22e">HostIdentifier</span> {
</span></span><span style="display:flex;"><span> Dns(DomainName),
</span></span><span style="display:flex;"><span> Ipv4Addr(Ipv4Addr),
</span></span><span style="display:flex;"><span> Ipv6Addr(Ipv6Addr),
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">pub</span> <span style="color:#66d9ef">enum</span> <span style="color:#a6e22e">Location</span> {
</span></span><span style="display:flex;"><span> Nowhere,
</span></span><span style="display:flex;"><span> Address(Address),
</span></span><span style="display:flex;"><span> Coordinates {
</span></span><span style="display:flex;"><span> lat: <span style="color:#66d9ef">f64</span>,
</span></span><span style="display:flex;"><span> long: <span style="color:#66d9ef">f64</span>,
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">let</span> loc1 <span style="color:#f92672">=</span> Location::Nowhere;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">let</span> loc2 <span style="color:#f92672">=</span> Location::Coordinates {
</span></span><span style="display:flex;"><span> lat: <span style="color:#ae81ff">80.0</span>,
</span></span><span style="display:flex;"><span> long: <span style="color:#ae81ff">40.0</span>,
</span></span><span style="display:flex;"><span>};
</span></span></code></pre></div><p>What do these tagged unions have to do with polymorphism, you may ask?
Well, most OOP languages don’t have good syntax for these sum types,
but they do have powerful mechanisms for run-time polymorphism, and so
you’ll see run-time polymorphism used for situations where Rust <code>enum</code>s
would actually be just as well-suited (and I will argue, better suited):
when there are a few options for how to store a value, but those options
contain different details.</p>
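<p>As a sketch of how such a value gets used, a <code>match</code> expression
can branch on the variant and pull out whatever details that variant carries.
The <code>Address</code> struct below is a hypothetical stand-in for a real
address type, and <code>describe_location</code> is a made-up helper.</p>

```rust
/// Hypothetical stand-in for a real postal address type.
pub struct Address(pub String);

pub enum Location {
    Nowhere,
    Address(Address),
    Coordinates { lat: f64, long: f64 },
}

/// Each arm sees only the fields that exist for its variant, and the
/// compiler reports an error if a variant is left unhandled.
pub fn describe_location(location: &Location) -> String {
    match location {
        Location::Nowhere => "nowhere".to_string(),
        Location::Address(Address(text)) => format!("at {text}"),
        Location::Coordinates { lat, long } => format!("at ({lat}, {long})"),
    }
}
```

<p>The decision about which code runs is made by the tag stored in the value
itself, with no heap allocation or vtable involved.</p>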
<p>For example, here’s one way to represent the <code>UserId</code> type in Java using
inheritance and run-time polymorphism – how I would’ve done it when I
was a student (putting each class in a different file):</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-java" data-lang="java"><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">UserId</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Username</span> <span style="color:#66d9ef">extends</span> UserId <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">private</span> String username<span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">public</span> <span style="color:#a6e22e">Username</span><span style="color:#f92672">(</span>String username<span style="color:#f92672">)</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">this</span><span style="color:#f92672">.</span><span style="color:#a6e22e">username</span> <span style="color:#f92672">=</span> username<span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// ... getters, setters, etc.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">AnonymousUser</span> <span style="color:#66d9ef">extends</span> UserId <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">private</span> Ipv4Address ipAddress<span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// ... constructor, getters, setters, etc.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>UserId user1 <span style="color:#f92672">=</span> <span style="color:#66d9ef">new</span> Username<span style="color:#f92672">(</span><span style="color:#e6db74">"foo"</span><span style="color:#f92672">);</span>
</span></span><span style="display:flex;"><span>UserId user2 <span style="color:#f92672">=</span> <span style="color:#66d9ef">new</span> AnonymousUser<span style="color:#f92672">(</span><span style="color:#66d9ef">new</span> Ipv4Address<span style="color:#f92672">(</span><span style="color:#e6db74">"127.0.0.1"</span><span style="color:#f92672">));</span>
</span></span></code></pre></div><p>Importantly, just as in the <code>enum</code> example, we can put <code>user1</code>
and <code>user2</code> in variables of the same type, and can pass them to
the same kinds of functions, and in general do the same operations
on them.</p>
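<p>On the Rust side, the same property can be sketched as follows. Here
<code>std::net::IpAddr</code> stands in for the placeholder
<code>IpAddress</code> type from the earlier example, and
<code>display_name</code> and <code>display_names</code> are hypothetical
helpers.</p>

```rust
use std::net::IpAddr;

pub enum UserId {
    Username(String),
    // `IpAddr` is an assumption; the earlier example used a placeholder type.
    Anonymous(IpAddr),
}

/// Accepts any `UserId`, whichever variant it holds.
pub fn display_name(user: &UserId) -> String {
    match user {
        UserId::Username(name) => name.clone(),
        UserId::Anonymous(addr) => format!("anonymous@{addr}"),
    }
}

/// Both variants sit side by side in one slice of `UserId`, stored inline
/// rather than behind a pointer per element.
pub fn display_names(users: &[UserId]) -> Vec<String> {
    users.iter().map(display_name).collect()
}
```

<p>A <code>Vec&lt;UserId&gt;</code> can hold a mix of named and anonymous
users, just as a Java <code>List&lt;UserId&gt;</code> could hold instances of
both subclasses.</p>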
<p>Now, these OOP-style classes look super-light to the point of being
silly, but that’s mostly because we haven’t added any real operational
code to this situation – just data and structure and a bit of variable
definitions and boilerplate. Let’s consider what happens if we actually
do anything with user IDs.</p>
<p>For example, we might want to determine whether they’re an administrator.
In our hypothetical, let’s say anonymous users are never administrators,
and users with usernames are only administrators if the username begins
with the string <code>admin_</code>.</p>
<p>The doctrinally approved object-oriented way of doing that is to
add a method, e.g. <code>isAdministrator</code>. In order for this method to
work, we have to add it to all three classes, the base class and
the two child classes:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-java" data-lang="java"><span style="display:flex;"><span><span style="color:#66d9ef">abstract</span> <span style="color:#66d9ef">class</span> <span style="color:#a6e22e">UserId</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// ...
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#66d9ef">public</span> <span style="color:#66d9ef">abstract</span> boolean <span style="color:#a6e22e">isAdministrator</span><span style="color:#f92672">();</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Username</span> <span style="color:#66d9ef">extends</span> UserId <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// ...
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#66d9ef">public</span> bool <span style="color:#a6e22e">isAdministrator</span><span style="color:#f92672">()</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> username<span style="color:#f92672">.</span><span style="color:#a6e22e">startsWith</span><span style="color:#f92672">(</span><span style="color:#e6db74">"admin_"</span><span style="color:#f92672">);</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">AnonymousUser</span> <span style="color:#66d9ef">extends</span> UserId <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// ...
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#66d9ef">public</span> bool <span style="color:#a6e22e">isAdminstrator</span><span style="color:#f92672">()</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> <span style="color:#66d9ef">false</span><span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">}</span>
</span></span></code></pre></div><p>So, in order to add this simple operation, this simple capability
to this type in Java, we have to go to three classes, which will
typically be stored in three files. Each of them contains a method that
does something simple, but nowhere can you see, in one place, the entire
logic of who is and isn’t an administrator – something that someone
might naturally ask.</p>
<p>Rust would use <code>match</code> for such an operation, putting all the
information about it in one place:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">is_administrator</span>(user: <span style="color:#66d9ef">&</span><span style="color:#a6e22e">UserId</span>) -> <span style="color:#66d9ef">bool</span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">match</span> user {
</span></span><span style="display:flex;"><span> UserId::Username(name) <span style="color:#f92672">=></span> name.starts_with(<span style="color:#e6db74">"admin_"</span>),
</span></span><span style="display:flex;"><span> UserId::AnonymousUser(_) <span style="color:#f92672">=></span> <span style="color:#66d9ef">false</span>,
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>This yields a more complicated individual function, but it has all the
logic explicitly right there. Having the logic be explicit, instead of
implicit in an inheritance hierarchy, cuts against an OOP precept where
methods should be simple and polymorphism used to express the logic
implicitly. But that precept doesn’t guarantee anything; it just sweeps
the complexity under the rug. It turns out that hiding the complexity
makes it harder to grapple with, not easier.</p>
<p>Let’s go through another example. We’ve had this <code>UserId</code> code for a while,
and you’re tasked with writing a new web front-end for this system. You
need some way of displaying the user information in HTML, either a link
to a user profile (in the case of a named user) or a stringification of
the IP address in red (in the case of an anonymous user). So you decide
to add a new operation for this small family of types, <code>toHTML</code>, which
outputs your new front-end’s specialized DOM type. (Maybe the Java’s
compiled to WebAssembly, I’m not sure. The details don’t matter.)</p>
<p>You submit a pull request to the maintainer of the <code>UserId</code> class
hierarchy, deep in a core library of the backend. And then they reject
it.</p>
<p>They have pretty good reasons, actually, you grudgingly admit. They’re
saying it’s an absurd separation of concerns. Besides, the company can’t
have this core library handling types from your front-end.</p>
<p>So, you sigh, and write the equivalent of a Rust <code>match</code> expression,
but in Java (please pardon my absurd hypothetical HTML
library):</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-java" data-lang="java"><span style="display:flex;"><span>Html <span style="color:#a6e22e">userIdToHtml</span><span style="color:#f92672">(</span>UserId userId<span style="color:#f92672">)</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> <span style="color:#f92672">(</span>userId <span style="color:#66d9ef">instanceof</span> Username<span style="color:#f92672">)</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> Username username <span style="color:#f92672">=</span> <span style="color:#f92672">(</span>Username<span style="color:#f92672">)</span>userId<span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span> String usernameString <span style="color:#f92672">=</span> username<span style="color:#f92672">.</span><span style="color:#a6e22e">getUsername</span><span style="color:#f92672">();</span>
</span></span><span style="display:flex;"><span> Url url <span style="color:#f92672">=</span> ProfileHandler<span style="color:#f92672">.</span><span style="color:#a6e22e">getProfileForUsername</span><span style="color:#f92672">(</span>usernameString<span style="color:#f92672">);</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> Link<span style="color:#f92672">.</span><span style="color:#a6e22e">createTextLink</span><span style="color:#f92672">(</span>url<span style="color:#f92672">,</span> username<span style="color:#f92672">.</span><span style="color:#a6e22e">getUsername</span><span style="color:#f92672">());</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span> <span style="color:#66d9ef">else</span> <span style="color:#66d9ef">if</span> <span style="color:#f92672">(</span>userId <span style="color:#66d9ef">instanceof</span> AnonymousUser<span style="color:#f92672">)</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> AnonymousUser anonymousUser <span style="color:#f92672">=</span> <span style="color:#f92672">(</span>AnonymousUser<span style="color:#f92672">)</span>userId<span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> Span<span style="color:#f92672">.</span><span style="color:#a6e22e">createColoredText</span><span style="color:#f92672">(</span>anonymousUser<span style="color:#f92672">.</span><span style="color:#a6e22e">getIp</span><span style="color:#f92672">().</span><span style="color:#a6e22e">formatString</span><span style="color:#f92672">(),</span> <span style="color:#e6db74">"red"</span><span style="color:#f92672">);</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span> <span style="color:#66d9ef">else</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">throw</span> <span style="color:#66d9ef">new</span> RuntimeException<span style="color:#f92672">(</span><span style="color:#e6db74">"IDK, man"</span><span style="color:#f92672">);</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">}</span>
</span></span></code></pre></div><p>And this code your boss rejects upon code review, saying you used
the <code>instanceof</code> anti-pattern, but then later they grudgingly
accept it after you make them argue with the maintainer of the core
library that wouldn’t accept your other patch.</p>
<p>But look at how ugly that <code>instanceof</code> code is! No wonder Java programmers
consider it an anti-pattern! But in this situation, it’s the most
reasonable thing, really the only possible thing besides implementing
the observer pattern or the visitor pattern or something else that
just amounts to infrastructure to fake an <code>instanceof</code> with inversion
of control.</p>
<p>Having operations implemented by adding a method to every subclass makes
sense when the set of operations is bounded (or close to it) and the
number of subclasses of the class might grow in unanticipated ways. But
just as often, the number of operations will grow in unanticipated ways,
while the number of subclasses is bounded (or close to it).</p>
<p>For the latter situation, which is more common than OOP advocates would
imagine, Rust <code>enum</code>s – and sum types in general – are perfect. Once
you’ve gotten used to them, you find yourself using them all the time.</p>
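<p>As a rough sketch of that situation, here is how the front-end operation
from the story might look in Rust. It assumes the <code>UserId</code> enum has
roughly the shape shown (the <code>Ipv4Addr</code> payload is my guess at what an
anonymous user carries), and the function name <code>user_id_to_html</code>, the
markup, and the URL scheme are all hypothetical. The point is only that the
new operation lives entirely outside the enum’s definition:</p>

```rust
use std::net::Ipv4Addr;

// Assumed shape of the `UserId` enum; the payload types are illustrative.
enum UserId {
    Username(String),
    AnonymousUser(Ipv4Addr),
}

// The existing operation, left untouched by the new feature.
fn is_administrator(user: &UserId) -> bool {
    match user {
        UserId::Username(name) => name.starts_with("admin_"),
        UserId::AnonymousUser(_) => false,
    }
}

// A new operation, written in front-end code without editing the enum.
// Hypothetical markup; a real version would also escape the username.
fn user_id_to_html(user: &UserId) -> String {
    match user {
        UserId::Username(name) => format!("<a href=\"/profile/{name}\">{name}</a>"),
        UserId::AnonymousUser(ip) => format!("<span style=\"color:red\">{ip}</span>"),
    }
}
```

<p>If a third variant were ever added to the enum, both <code>match</code>
expressions would stop compiling until they handled it, which replaces the
run-time “IDK, man” branch of the Java version with a compile-time check.</p>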
<p>I will say for the record that it isn’t this bad in all object-oriented
programming languages. In some, you can write arbitrary class-method
combinations in any order, and so you could write all three
implementations in one place if you so chose. Smalltalk traditionally
lets you navigate the codebase in a special browser, where you can see
either a list of methods implemented by a class, or a list of classes
that accept a given “message,” as Smalltalk calls it, so you can have
your cake and eat it too.</p>
<h1 id="alternative-1-closures">Alternative #1: Closures</h1>
<p>Sometimes, an OOP interface or polymorphic decision only involves one
actual operation. In such a situation, a closure can just be used instead.</p>
<p>I don’t want to spend too much time on this, because most OOP programmers
are already aware of this, and have been since their OOP languages have
caught up with functional languages and gotten syntax for lambdas –
Java in Java 8, C++ in C++11. Silly one-method interfaces like Java’s
<a href="https://docs.oracle.com/javase/8/docs/api/java/util/Comparator.html"><code>Comparator</code></a>
are therefore – fortunately – mostly a thing of the past.</p>
<p>Also, closures in Rust technically involve traits, and so are implemented
using the same mechanism as the next two alternatives, so one could also
argue that this isn’t really a separate option in Rust. In my mind,
however, lambdas, closures, and the <code>FnMut</code>/<code>FnOnce</code>/<code>Fn</code> traits are
special enough aesthetically and situationally that they deserve a little
bit of time.</p>
<p>And so I’ll take the little bit of time to just say this: If you find
yourself writing a trait (or a Java interface or a C++ class) with
exactly one method, please consider whether you should instead be using
some sort of closure or lambda type. Only you can prevent overengineering.</p>
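<p>To make the comparison concrete, here is a small sketch of the same
filtering helper written both ways. The <code>Predicate</code> trait, the
<code>IsEven</code> struct, and both function names are invented for this
illustration:</p>

```rust
// Hypothetical one-method trait, the kind of interface worth questioning.
trait Predicate<T> {
    fn test(&self, item: &T) -> bool;
}

// A named type that exists only to carry one function.
struct IsEven;

impl Predicate<i32> for IsEven {
    fn test(&self, item: &i32) -> bool {
        item % 2 == 0
    }
}

// Trait-based version: callers must define a type and an impl.
fn keep_if_trait<T, P: Predicate<T>>(items: Vec<T>, pred: &P) -> Vec<T> {
    items.into_iter().filter(|item| pred.test(item)).collect()
}

// Closure-based version: callers just pass the logic inline.
fn keep_if<T>(items: Vec<T>, pred: impl Fn(&T) -> bool) -> Vec<T> {
    items.into_iter().filter(|item| pred(item)).collect()
}
```

<p>A call like <code>keep_if(numbers, |n: &i32| *n % 2 == 0)</code> does the
same job as <code>keep_if_trait(numbers, &IsEven)</code> with no supporting
type at all.</p>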
<h1 id="alternative-2-polymorphism-with-traits">Alternative #2: Polymorphism with Traits</h1>
<p>Just like Rust has a version of encapsulation more flexible and more
powerful than the OOP notion of classes, as I discuss in the <a href="https://www.thecodedmessage.com/posts/oop-1-encapsulation">previous
post</a>, Rust has a more powerful version of
polymorphism than OOP posits: traits.</p>
<p>Traits are like interfaces from Java (or an all-abstract superclass
in C++), but without most of the constraints that I discuss at the
beginning of the blog post. They have neither the semantic constraints
nor the performance constraints. Traits are heavily inspired in semantics
and principle by Haskell’s typeclasses, and in syntax and implementation
by C++’s templates. C++ programmers can think of them as templates with
concepts (except done right, baked into the programming language from the
get-go, and without having to deal with all the code that doesn’t use it).</p>
<p>Let’s start with the semantics: What can you do with traits that
you can’t do with pure OOP, even if you throw all the indirection
in the world at it? Well, in pure OOP terms, there’s no way
you can write an interface like Rust’s <code>Eq</code> and <code>Ord</code>, shown
here with greatly oversimplified definitions (the real definitions
of <a href="https://doc.rust-lang.org/std/cmp/trait.Eq.html"><code>Eq</code></a> and
<a href="https://doc.rust-lang.org/std/cmp/trait.Ord.html"><code>Ord</code></a> extend other
traits that allow partial equivalence and orderings between different
types, but like these simplified definitions, the Rust standard library
versions of non-partial <code>Eq</code> and <code>Ord</code> do cover equivalence and ordering
between values of the same type):</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">trait</span> Eq {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">eq</span>(<span style="color:#f92672">&</span>self, other: <span style="color:#66d9ef">&</span><span style="color:#a6e22e">Self</span>) -> <span style="color:#66d9ef">bool</span>;
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">pub</span> <span style="color:#66d9ef">enum</span> <span style="color:#a6e22e">Ordering</span> {
</span></span><span style="display:flex;"><span> Less,
</span></span><span style="display:flex;"><span> Equal,
</span></span><span style="display:flex;"><span> Greater,
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">trait</span> Ord: Eq {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">cmp</span>(<span style="color:#f92672">&</span>self, other: <span style="color:#66d9ef">&</span><span style="color:#a6e22e">Self</span>) -> <span style="color:#a6e22e">Ordering</span>;
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>See what’s happening? Like in an OOP-style interface, the methods take a
“receiver” type, a <code>self</code> parameter, of the <code>Self</code> type – that is, of
whatever concrete type implements the trait (technically here a reference
to <code>Self</code> or <code>&Self</code>). But unlike in an OOP-style interface, they also
take another argument of <code>&Self</code> type. In order to implement <code>Eq</code> and
<code>Ord</code>, a type <code>T</code> provides a function that takes two references to <code>T</code>.
That’s meant literally: two references to <code>T</code>, not one reference to <code>T</code>
and one reference to <code>T</code> or any subclass (such a thing doesn’t exist
in Rust), not one reference to <code>T</code> and one reference to any other value
that implements <code>Eq</code>, but two bona-fide non-heterogeneous references to
the same concrete type, that the function can then compare for equality
(or ordering).</p>
<p>This is important, because we want to use this to implement methods
like <code>sort</code>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">impl</span><span style="color:#f92672"><</span>T<span style="color:#f92672">></span> Vec<span style="color:#f92672"><</span>T<span style="color:#f92672">></span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">pub</span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">sort</span>(<span style="color:#f92672">&</span><span style="color:#66d9ef">mut</span> self) <span style="color:#66d9ef">where</span> T: Ord {
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// ...
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>OOP-style polymorphism is ideal for heterogeneous containers, where
each element has its own runtime type and its own implementation of
the interfaces. But sort doesn’t work like that. You can’t sort a
collection like <code>[3, "Hello", true]</code>; there’s no reasonable ordering
across all types.</p>
<p>Instead, <code>sort</code> operates on homogeneous containers. All the elements
have to match in type, so that they can be mutually compared. They
don’t each need to have different implementations of the operations.</p>
<p>Nevertheless, <code>sort</code> is still polymorphic. A sorting algorithm is
the same for integers or strings, but comparing integers is a completely
different operation than comparing strings. The sorting algorithm needs
a way of invoking an operation on its items – the comparison operation –
differently for different types, while still having the same overall
structure of code.</p>
<p>This can be done by injecting a comparison function, but many types
have an intrinsic, default ordering, and <code>sort</code> should default to it.
Thus, polymorphism – but not an OOP-friendly variety.</p>
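<p>Here is a sketch of both options using the real standard-library traits.
The <code>Version</code> type and the two helper functions are made up for the
example. Note that the real <code>Ord</code> also requires
<code>PartialOrd</code>, which the simplified definitions above leave out:</p>

```rust
use std::cmp::Ordering;

// Hypothetical type with an intrinsic ordering: by major, then by minor.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Version {
    major: u32,
    minor: u32,
}

impl Ord for Version {
    // Both `self` and `other` are the same concrete type, `Version`.
    fn cmp(&self, other: &Self) -> Ordering {
        self.major
            .cmp(&other.major)
            .then(self.minor.cmp(&other.minor))
    }
}

// The standard library's `Ord` requires `PartialOrd`; it can just delegate.
impl PartialOrd for Version {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

// Default: `sort` picks up the intrinsic ordering from `Ord`.
fn sorted_ascending(mut versions: Vec<Version>) -> Vec<Version> {
    versions.sort();
    versions
}

// Injection: `sort_by` takes a comparison function instead.
fn sorted_descending(mut versions: Vec<Version>) -> Vec<Version> {
    versions.sort_by(|a, b| b.cmp(a));
    versions
}
```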
<p>See the contrivance Java goes through to define <code>sort</code>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-java" data-lang="java"><span style="display:flex;"><span><span style="color:#66d9ef">static</span> <span style="color:#f92672"><</span>T <span style="color:#66d9ef">extends</span> Comparable<span style="color:#f92672"><?</span> <span style="color:#66d9ef">super</span> T<span style="color:#f92672">>></span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">void</span> <span style="color:#a6e22e">sort</span><span style="color:#f92672">(</span>List<span style="color:#f92672"><</span>T<span style="color:#f92672">></span> list<span style="color:#f92672">)</span>
</span></span></code></pre></div><p>There is no simple trait that can require <code>T</code> to be comparable to other
<code>T</code>s, for <code>T</code> to be ordered. Instead, as far as the programming language
is concerned, the idea that <code>T</code> is comparable to itself, rather than
to any other random type, is only articulated incidentally, in the
signature of this one method. Nothing is stopping someone from
implementing the <code>Comparable</code> interface in an inconsistent way, like
having <code>Integer</code> implement <code>Comparable<String></code>.</p>
<p>Additionally, when it actually looks up the implementation of
<code>Comparable</code>, it decides what implementation to use based on the
run-time class of the first argument of each comparison, not based on
the static type <code>T</code>. Normally, the elements will all be the same type,
but theoretically, this list could be heterogeneous, as long as all the
objects “extend” <code>T</code>, and they could implement <code>Comparable</code> differently.
The computer has to do extra work to indulge this possibility, even
though it would certainly be a mistake.</p>
<p>As we’re now drifting outside of the realm of semantics, and into
the realm of performance, let’s discuss the performance implications
of this fully.</p>
<p>The Java <code>sort</code> method, as we mentioned, requires every item in the
collection to be a full object type, which means that instead of storing
the values directly in the array, the values are stored in the heap, and
references are stored in the array. This is unnecessary with a traits-based
approach – the values can live directly in the array.</p>
<p>This means that different arrays will have different element sizes, so
this has to be handled by a trait as well. And it is: The size of the
values is also parameterized via the <code>Sized</code> trait. The size does have to
be consistent among all the items of the array, but this is enforceable
because we can express that all the elements are actually the exact same
type – unlike Java’s <code>List<T></code> which only expresses that they’re of type
<code>T</code> or some subtype of <code>T</code>.</p>
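<p>A tiny sketch of what that parameterization looks like from inside
generic code. The helper name <code>element_stride</code> is invented here,
and the <code>T: Sized</code> bound is spelled out only for emphasis, since
Rust adds it implicitly to every type parameter:</p>

```rust
use std::mem::size_of;

// Number of bytes each element occupies inside a slice of `T`.
// The elements sit directly in the slice, so this is also the distance
// from one element to the next.
fn element_stride<T: Sized>(_items: &[T]) -> usize {
    size_of::<T>()
}
```

<p>Calling it on a slice of <code>u8</code> gives 1 and on a slice of
<code>u64</code> gives 8. A slice of <code>Box<u64></code>, which is closer to
what Java does, gives the size of a pointer, because there the array only
holds references to heap values.</p>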
<p>Rust’s <code>sort</code> method could have been implemented by passing the size
information (from the <code>Sized</code> trait) and the ordering function (from
the <code>Ord</code> trait) at runtime as an integer value and a function pointer.
This is how typeclasses work in Haskell, which was the inspiration
for Rust traits. This would still be more efficient than the Java,
as there would be a single ordering function, rather than a different
indirect lookup for every left side of the comparison, allowing indirect
branch prediction to work in the processor.</p>
<p>But Rust goes even further than that, and implements its traits instead
via monomorphization. This is similar to C++ template instantiation,
but semantically better constrained. The premise is that while <code>sort</code>
is only one method semantically, in the outputted, compiled code, a
different version of <code>sort</code> is outputted for every type <code>T</code> that it
is called with.</p>
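<p>As a small illustration, assuming a hypothetical generic helper called
<code>largest</code>, there is one definition in the source, but each element
type it is used with gets its own compiled copy:</p>

```rust
// One generic definition in the source code.
fn largest<T: Ord>(items: &[T]) -> Option<&T> {
    // An empty slice has no largest element.
    let mut best = items.first()?;
    for item in items {
        // Resolved at compile time to the `Ord` impl for the concrete `T`.
        if item > best {
            best = item;
        }
    }
    Some(best)
}

// Calling `largest(&[3, 7, 2])` and `largest(&["pear", "apple"])` makes the
// compiler emit two separate functions, one for `i32` and one for `&str`.
// Each contains a direct (often inlined) call to that type's comparison.
```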
<p>C++ templates create infamously bad error messages and are difficult to
reason about, because they are essentially macros, and awkward ones.
Even Rust cannot create great error messages with its macro system.
But also, writing them requires expertise, and means that the programmer
is forgoing many of the benefits of the type system – templates are often
called, in my opinion rightly so, a form of compile time duck-typing. For
these reasons, template programming in C++ is often considered more
advanced (read as harder and less convenient rather than more powerful)
than OOP-style polymorphism.</p>
<p>In Rust, however, traits provide an organized and more coherent
way of accessing similar technology, getting the performance benefits
of templates while still giving the structure of a solid type system.</p>
<h1 id="alternative-3-dynamic-trait-objects">Alternative #3: Dynamic Trait Objects</h1>
<p>Sometimes, however, you do need full run-time polymorphism. You have
the opposite of the scenario with the <code>enum</code>: You have a closed set of
operations that can be performed on a value, but what those operations
actually do will change dynamically in a way that cannot be bounded
ahead of time.</p>
<p>In such situations, Rust has you covered with the <code>dyn</code> keyword.
Please don’t overuse it, though. In almost all situations where
I’ve thought it might be appropriate, static polymorphism combined
with other design elements has worked out better.</p>
<p>Legitimate use cases for <code>dyn</code> tend to come up in situations involving
inversion of control, where a framework library takes on a main loop, and
the client code says how to handle various events. In network programming,
the framework library says how to juggle all the sockets and register
them with the operating system, but the application needs to say what to
actually do with the data. In GUI programming, the framework code can
say which widget was clicked on, but very different things happen
if that widget is a button versus a text box versus a custom widget you
invented for this particular app.</p>
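<p>A minimal sketch of the GUI case, with everything invented for the
example: a <code>Widget</code> trait, two widget types, and a
<code>dispatch_clicks</code> function standing in for the framework’s main
loop. No real GUI library looks exactly like this:</p>

```rust
// Hypothetical interface; this is all the "framework" knows about widgets.
trait Widget {
    fn on_click(&mut self) -> String;
}

struct Button {
    label: String,
}

struct Counter {
    clicks: u32,
}

impl Widget for Button {
    fn on_click(&mut self) -> String {
        format!("button '{}' pressed", self.label)
    }
}

impl Widget for Counter {
    fn on_click(&mut self) -> String {
        self.clicks += 1;
        format!("clicked {} times", self.clicks)
    }
}

// Stand-in for the framework loop: it decides which widget was clicked,
// and the widget decides what a click means. Unknown indices are ignored.
fn dispatch_clicks(widgets: &mut [Box<dyn Widget>], clicked: &[usize]) -> Vec<String> {
    let mut log = Vec::new();
    for &index in clicked {
        if let Some(widget) = widgets.get_mut(index) {
            log.push(widget.on_click());
        }
    }
    log
}
```

<p>The list is heterogeneous, and application code can add new widget types
without <code>dispatch_clicks</code> ever being recompiled against them, which
is the situation where <code>dyn</code> earns its keep.</p>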
<p>Now, you don’t strictly need run-time polymorphism for this. You could
use closures (or even raw function pointers) instead, creating a <code>struct</code> of
closures (or function pointers) if multiple operations are called for –
which amounts to basically doing what <code>dyn</code> does the hard way by hand. For
example, I fully expected <code>tokio</code> to use Rust’s run-time polymorphism
feature internally to handle this inversion of control in task
scheduling. Instead, for what I imagine are performance reasons, <code>tokio</code>
implements <code>dyn</code> by hand, even calling its <code>struct</code> of function pointers
<a href="https://github.com/tokio-rs/tokio/blob/a7945b469d634cf205094d8a1661720358622cc0/tokio/src/runtime/task/raw.rs#L13-L43"><code>Vtable</code></a>.</p>
<p>But <code>dyn</code> does all of this work for you, for your trait. The
only requirement is that your trait be object-safe, and the <a href="https://doc.rust-lang.org/reference/items/traits.html#object-safety">list of
requirements</a>
may seem familiar, especially when it comes to the requirements for
an associated function (e.g. a method) to be “dispatchable”:</p>
<blockquote>
<ul>
<li>Not have any type parameters (although lifetime parameters are allowed),</li>
<li>Be a method that does not use <code>Self</code> except in the type of the receiver.</li>
<li>Have a receiver with one of the following types:
<ul>
<li><code>&Self</code> (i.e. <code>&self</code>)</li>
<li><code>&mut Self</code> (i.e. <code>&mut self</code>)</li>
<li><code>Box<Self></code></li>
<li><code>Rc<Self></code></li>
<li><code>Arc<Self></code></li>
<li><code>Pin<P></code> where <code>P</code> is one of the types above</li>
</ul>
</li>
<li>Does not have a <code>where Self: Sized</code> bound (receiver type of <code>Self</code> (i.e. <code>self</code>) implies this).</li>
</ul>
</blockquote>
<p>That is to say, it can be polymorphic in exactly one parameter, and that
parameter must be by reference – more or less the exact requirements for
methods to support run-time polymorphism in OOP.</p>
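<p>Here is a sketch of how those rules play out on a made-up
<code>Shape</code> trait. One method is dispatchable. The other mentions
<code>Self</code> outside the receiver, so it is opted out of dynamic dispatch
with <code>where Self: Sized</code>, which leaves the trait usable behind
<code>dyn</code>:</p>

```rust
// Hypothetical trait mixing a dispatchable method with one that is not.
trait Shape {
    // Dispatchable: `&self` receiver, no type parameters,
    // and no other mention of `Self`.
    fn area(&self) -> f64;

    // Takes a second `&Self`, so it cannot be called through a vtable.
    // The `where Self: Sized` bound excludes it from `dyn Shape`
    // while keeping it available on concrete types.
    fn larger_than(&self, other: &Self) -> bool
    where
        Self: Sized,
    {
        self.area() > other.area()
    }
}

struct Square(f64);
struct Rect(f64, f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

impl Shape for Rect {
    fn area(&self) -> f64 {
        self.0 * self.1
    }
}

// Only the dispatchable method is used through `dyn Shape`.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|shape| shape.area()).sum()
}
```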
<p>This is of course because <code>dyn</code> uses almost exactly the same mechanism
as OOP to implement run-time polymorphism: the “vtable.” <code>Box<dyn Foo></code>
really contains two pointers rather than one: one to the object in
question, and one to the “vtable,” the automatically-generated
structure of function pointers for that type. The one-parameter
requirement is because that is the parameter whose vtable is used to
look up which concrete implementation of a method to call, and the
indirection requirement is because different concrete types might have
different sizes, with the size only known at run-time.</p>
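<p>The two-pointer layout can be observed directly, at least on current
compilers. In this sketch <code>Foo</code> is an empty placeholder trait and
the two function names are invented:</p>

```rust
use std::mem::size_of;

// Placeholder trait; it needs no methods for this measurement.
trait Foo {}

// A box of a concrete type is a thin pointer: just the address.
fn thin_pointer_size() -> usize {
    size_of::<Box<u64>>()
}

// A box of a trait object is a fat pointer: address plus vtable pointer.
fn fat_pointer_size() -> usize {
    size_of::<Box<dyn Foo>>()
}
```

<p>On a 64-bit target these come out as 8 and 16 bytes respectively.</p>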
<p>To be clear, these are limitations on one particular implementation
strategy for run-time polymorphism. Alternative strategies exist that
fully decouple the vtable from individual values of the type, as in
Haskell.</p>
<p>There are still a few advantages of Rust’s version of run-time polymorphism
with traits as opposed to OOP-style interfaces.</p>
<p>Performance-wise, it’s something done alongside a type, rather than
intrinsic to the type. Normal values don’t store a vtable, spreading the
cost of this throughout the program, but rather, the vtables are only
referenced when a <code>dyn</code> pointer is created. If you never create a <code>dyn</code>
pointer to a value of a given type, that type’s vtable doesn’t even have
to be created. Certainly, you don’t have 8 bytes of extra gunk in every
allocation for all the vtable pointers! This also means there’s one
fewer level of indirection.</p>
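<p>A quick way to see that the values themselves carry no vtable, using an
invented <code>Point</code> type and <code>Describe</code> trait:</p>

```rust
use std::mem::size_of;

trait Describe {
    fn describe(&self) -> String;
}

// Two `u32` fields and nothing else.
struct Point {
    x: u32,
    y: u32,
}

impl Describe for Point {
    fn describe(&self) -> String {
        format!("({}, {})", self.x, self.y)
    }
}

// The vtable pointer is attached here, when the reference is coerced
// to a trait object, not stored inside `Point`.
fn as_dynamic(point: &Point) -> &dyn Describe {
    point
}
```

<p>Implementing the trait leaves <code>size_of::<Point>()</code> at 8 bytes,
exactly the two fields.</p>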
<p>Semantically, it’s also a good thing that it’s just one option among
many, and that it’s not the strongly preferred option that the entire
programming language is trying to push you towards. Often, even usually,
static polymorphism, enums, or even just good old-fashioned closures
more accurately represent the problem at hand, and should be used
instead.</p>
<p>Finally, the fact that run-time and static polymorphism in Rust both
use traits makes it easier to transition from one system to another.
If you find yourself using <code>dyn</code> for a trait, you don’t have to use
it everywhere that trait is used. You can use the mechanisms of
static polymorphism (like type parameters and <code>impl Trait</code>) instead,
freely mixing and matching with the same traits.</p>
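<p>For instance, with a hypothetical <code>Greeter</code> trait, the same
trait and the same impls serve both styles. All the names here are
invented for illustration:</p>

```rust
trait Greeter {
    fn greet(&self) -> String;
}

struct English;
struct French;

impl Greeter for English {
    fn greet(&self) -> String {
        "Hello".to_string()
    }
}

impl Greeter for French {
    fn greet(&self) -> String {
        "Bonjour".to_string()
    }
}

// Static polymorphism: compiled separately for each concrete type.
fn greet_static(greeter: &impl Greeter) -> String {
    greeter.greet()
}

// Dynamic polymorphism: compiled once, dispatching through the vtable.
fn greet_dynamic(greeter: &dyn Greeter) -> String {
    greeter.greet()
}

// A heterogeneous list is one place where `dyn` is actually needed.
fn greet_all(greeters: &[&dyn Greeter]) -> Vec<String> {
    greeters.iter().map(|greeter| greet_dynamic(*greeter)).collect()
}
```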
<p>Unlike in C++, you don’t have to learn two completely different sets
of syntax for concepts vs parent classes, and vastly different semantics.
Really, in Rust, dynamic polymorphism is just a special case of static
polymorphism, and the only differences are the things that actually
are different.</p>
<p><a href="https://www.thecodedmessage.com/posts/debt-ceiling/">The Debt Ceiling Is Unconstitutional, and Biden Should Just Say So</a> (2023-02-02)</p>
<blockquote>
<p>The validity of the public debt of the United States, authorized by law,
including debts incurred for payment of pensions and bounties for services
in suppressing insurrection or rebellion, shall not be questioned.</p>
<ul>
<li>US Constitution, 14th Amendment, Section 4</li>
</ul>
</blockquote>
<p>The debt ceiling is unconstitutional. We’ve let the Republicans play their
games for long enough, in the interest of “stability of the economy” and a
general fear of rocking the boat, but that time is over now. President
Biden should simply announce that his administration will not follow
this brazenly unconstitutional law, because unconstitutional is literally
what it is, and every Congressperson who wants to use it as leverage is
in flagrant violation of their oath of office.</p>
<p>Often, when people say something is unconstitutional, they mean they
don’t like it, or that a Supreme Court decision they agree with has ruled
it that way, or one that is established deeply in precedent. In this
situation, however, it’s literally in the text of the US Constitution,
a document that we in the US are raised to treat as sacred.</p>
<p>Let me explain.</p>
<p>Congress has given the President and his administration three sets of
instructions, three policies, three laws:</p>
<ol>
<li><strong>The budget:</strong> To spend X amount.</li>
<li><strong>The tax code:</strong> To tax Y amount.</li>
<li><strong>The debt ceiling:</strong> To not go over Z amount of debt.</li>
</ol>
<p>None of these are optional. President Biden may not unilaterally
cut spending. He certainly may not unilaterally raise taxes. And so,
in our current situation, where we have reached the ceiling, the
only way to follow these instructions from Congress, to spend the
money he is obligated to spend with the tax money he is allowed
to collect, is to default on payments on debt.</p>
<p>This is how the debt ceiling is generally interpreted, as a legal
requirement to default, coming from Congress. That is the only
interpretation that makes sense from the perspective of the House
Republicans who are trying to use this as leverage in a negotiation.</p>
<p>But Congress isn’t allowed to require that. It’s literally
unconstitutional.</p>
<p>I’m not the only person who thinks so. Here’s a sampling of other
articles making the same point, which come from sources I just
happen to have been reading. The first most matches my perspective:</p>
<ul>
<li><a href="https://statuskuo.substack.com/p/the-deadbeat-limit">The Deadbeat Limit</a></li>
<li><a href="https://www.nytimes.com/2023/01/20/opinion/debt-limit-congress-biden-mccarthy.html">You Can Let Republicans Destroy the Economy, or You Can Call Their Bluff</a></li>
<li><a href="https://www.nytimes.com/2023/01/23/opinion/fourteenth-amendment-debt-ceiling.html">The Constitution Has a 155-Year-Old Answer to the Debt Ceiling</a></li>
</ul>
<p>Of course, the Constitution is only as good as people actually paying
attention to it. Republicans have recently demonstrated repeatedly
that their respect for it is only lip service, that they actually
despise the document.</p>
<p>So if you don’t care about the Constitution, care about the economy –
the US economy and the world economy. Care about the stability of the
US dollar. A default would destroy faith in this country, just as the
authors of the 14th amendment feared. It would result in lawsuits that
would likely invalidate the debt ceiling anyway in the courts – after
the massive economic damage has already been done. The damage would be
unimaginable: We literally have never done something so stupid before,
and have no idea what would happen to the US’s position in the world,
or to the US dollar.</p>
<p>If you’re genuinely worried about the Federal government overspending,
this is not an appropriate forum to express your worries. There
already is a process for that, and it’s called the budget.</p>
My Reaction to Dr. Stroustrup's Recent Memory Safety Comments
https://www.thecodedmessage.com/posts/stroustrup-response/
2023-01-30T00:00:00+00:00
<p>The NSA recently published a <a href="https://media.defense.gov/2022/Nov/10/2003112742/-1/-1/0/CSI_SOFTWARE_MEMORY_SAFETY.PDF">Cybersecurity Information
Sheet</a>
about the importance of memory safety, where they recommended
moving from memory-unsafe programming languages (like C and
C++) to memory-safe ones (like Rust). Dr. Bjarne Stroustrup, the
original creator of C++, has made some
<a href="https://developers.slashdot.org/story/23/01/21/0526236/rust-safety-is-not-superior-to-c-bjarne-stroustrup-says">waves</a>
with his
<a href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2739r0.pdf">response</a>.</p>
<p>To be honest, I was disappointed. As a current die-hard Rustacean
and former die-hard C++ programmer, I have thought (and
<a href="https://www.thecodedmessage.com/tags/rust-vs-c++/">blogged</a>) quite a bit about the topic of Rust vs
C++. Unfortunately, I feel that in spite of the exhortation in his title
to “think seriously about safety,” Dr. Stroustrup was not in fact thinking
seriously himself. Instead of engaging conceptually with the article,
he seems to have reflexively thrown together some talking points –
some of them very stale – not realizing that they mostly are not even
relevant to the NSA’s Cybersecurity Information Sheet, let alone a
thoughtful rebuttal of it.</p>
<p>Fortunately, he does eventually discuss his own ideas of how to make C++
memory safe – in the future. If these ideas are implemented well, it
will make C++ a safe programming language as the NSA’s Cybersecurity
Information Sheet has defined it. But given that they are currently
just proposals in an early stage, it’s unfair of him to expect the NSA
to mention them when advising people on what programming language to
use. C++ has been an unsafe language for a long time. Maybe someday that
will change, but we’ll believe it when we actually see it.</p>
<p>But before I discuss that, I’d like to rebut and discuss my disappointment
at the talking points he uses earlier in his response, because I think
they unfairly frame the debate, shield C++ from legitimate and important
criticism, and slander memory-safe programming languages and downplay
memory safety as a concept, even though it’s very important.</p>
<h1 id="multiple-types-of-safety">Multiple Types of Safety?</h1>
<p>One of the most interesting and conceptually relevant points that
Dr. Stroustrup harps on is that memory safety is not the only
type of safety:</p>
<blockquote>
<p>Also, as described, “safe” is limited to memory safety, leaving
out on the order of a dozen other ways that a language could (and will)
be used to violate some form of safety and security.</p>
</blockquote>
<p>This might technically be true – it’s not entirely clear what other
forms of “safety” he’s talking about – but it’s misleading. Memory
unsafety is not just one of a dozen equally important
forms of “unsafety.” Rather, memory unsafety is by far the
biggest source of security vulnerabilities and instability
in memory-unsafe programming languages, with estimates as high as <a href="https://www.zdnet.com/article/microsoft-70-percent-of-all-security-bugs-are-memory-safety-issues/">70
percent</a> in some contexts.</p>
<p>A 70% decrease in security vulnerabilities is worth committing
significant resources towards. Memory safety on its own is worth writing
a Cybersecurity Information Sheet about, and it is the area where C++ has
the most serious deficits. Given that, this feels like a car manufacturer
whose cars do not provide air bags responding to a government advisory
not to buy the C++ cars by saying “What about other types of safety?
By talking just about air bags, the government is clearly not thinking
seriously about safety.” Sure, there are other types of safety features
besides air bags (or memory safety), but air bags are still important!</p>
<p>So, Dr. Stroustrup, what about memory safety in C++? Shouldn’t C++
have memory safety? Are you saying it’s not important, especially when
all of these other programming languages have it?</p>
<p>He doesn’t go into detail about these other types of safety,
which is telling: C++ doesn’t really have the
advantage in any of them. For example, Rust also has a lot of mechanisms
for thread safety and type safety, intimately connected with its memory
safety mechanisms, and baked into the design of Rust in a way that would
be next to impossible to retrofit into another programming language.</p>
<p>And, when you read later on about the “safety profiles” in the C++ Core
Guidelines that he makes such a big deal about, most of the focus there
is also about memory safety.</p>
<h1 id="petty-irrelevancies">Petty Irrelevancies</h1>
<p>Let’s look at some of the other points he makes.</p>
<blockquote>
<p>That specifically and explicitly excludes C and C++ as unsafe.</p>
</blockquote>
<p>C++ does not enforce memory safety as a feature of the programming
language. This may change in the future (as Dr. Stroustrup discusses),
but it is the current state of things. Dr. Stroustrup tries to downplay this,
but he is not convincing.</p>
<blockquote>
<p>As is far too common, it lumps C and C++ into the single category C/C++,
ignoring 30+ years of progress.</p>
</blockquote>
<p>Writing “C/C++” to mean “C and C++” is considered a <em>faux pas</em> among C++
programmers, and among C programmers as well, because it is seen
as asserting that these two programming languages are near-identical
when there are in fact major differences between them. By pointing out
that the NSA does this, Dr. Stroustrup is trying to make them look
like they don’t know what they’re talking about, just because they used
a “/” character instead of the word “and.”</p>
<p>He’s reading too much into the orthography and the NSA’s failure to use
insider <em>shibboleths</em> of the programming languages they’re trying to
criticize. Outside of the “C” and “C++” communities, “C/C++” is a fairly
common way to refer to the two related programming languages.</p>
<p>And that’s the most relevant thing here: C and C++ are indeed related
programming languages, and they have a lot in common: They are both
compiled programming languages with a focus on performance, and they are
(very relevantly) both not particularly focused on guaranteeing memory
safety. C and C++ have a substantial common subset, with many memory
unsafe features that are popular with programmers, perhaps even more
popular because they work similarly in both programming languages. For
the purposes of this document, it’s often the features that C and C++
have in common that are the problematic ones, so it makes sense for the
NSA to lump them together.</p>
<p>While there might be 30+ years of divergence between C and C++, none
of C++’s so-called “progress” involved removing memory-unsafe C features
from C++, many of which are still in common use, and many of which still
make memory safety in C++ near intractable. Sure, new features in C++
have been added that (in some but by no means all cases) do not make it
as easy to corrupt memory, but the bad old features are not in any real
way being phased out: They are not guarded by any special opt-in syntax,
nor in many cases do they result in warnings. Given that, the combined
set of features is as strong as its weakest link.</p>
<blockquote>
<p>Unfortunately, much C++ use is also stuck in the distant past, ignoring
improvements, including ways of dramatically improving safety.</p>
</blockquote>
<p>This is a common C++ talking point, but it doesn’t help Dr. Stroustrup’s
position as much as he thinks it does.</p>
<p>He’s trying to talk up how much C++ has improved, especially in the last
11 years – and it has indeed improved. New ways of writing C++, emphasizing
relatively new features, can indeed result in more reliable C++ code with
less memory corruption.</p>
<p>But unfortunately, this talking point just serves to remind us that these
old memory-unsafe features are still in common use. When someone says
their project is written in Rust, we can guess that it likely uses only
the safe features (including using standard library functions that use
<code>unsafe</code> internally – that truly doesn’t count as unsafe), or maybe
uses the unsafe features when absolutely necessary. But when someone
says their project is written in C++, by Dr. Stroustrup’s own admission,
there’s a high likelihood that it uses old features “stuck in the distant
past, ignoring … ways of dramatically improving safety.” This is also
a reason to avoid C++.</p>
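<p>To make the Rust side of that comparison concrete, here is a minimal, hypothetical sketch (the function names <code>middle</code> and <code>median_of</code> are invented for illustration and do not come from the NSA document or from Dr. Stroustrup’s paper). It shows a safe function that uses <code>unsafe</code> internally, and ordinary calling code that contains no <code>unsafe</code> at all:</p>

```rust
/// Returns the element in the middle of the slice, or `None` if it is empty.
/// The `unsafe` block is an implementation detail: no argument a caller can
/// pass will make it read out of bounds.
fn middle(items: &[i32]) -> Option<i32> {
    if items.is_empty() {
        return None;
    }
    let index = items.len() / 2;
    // SAFETY: the slice is non-empty, so `len / 2` is always less than `len`.
    Some(unsafe { *items.get_unchecked(index) })
}

/// Ordinary application code. There is no `unsafe` here, even though `Vec`,
/// `sort`, and `middle` all rely on unsafe code somewhere underneath.
fn median_of(mut values: Vec<i32>) -> Option<i32> {
    values.sort();
    middle(&values)
}
```

<p>Calling <code>median_of(vec![5, 1, 3])</code> goes through only safe interfaces; the one place a reviewer has to check by hand is the single marked block inside <code>middle</code>.</p>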
<p>However, I would also contest his claim about these new features.
Memory safety isn’t just an absence of memory corruption, but a reliable
method for ensuring the absence of memory corruption. “Using new features”
isn’t good enough. Even if using the new features in preference to the
old ones were a guarantee of memory safety – which it isn’t, they’re
less memory corrupting but not truly memory safe – the presence of the
old ones would still cause problems. You would need some mechanism to
ensure that the new features were only used safely, and that the old
features were not used, and no such mechanism exists, at least not in
the programming language itself. Someone who remembers the old features
can always still slip up and use one by accident.</p>
<h1 id="static-analysis-not-good-enough">Static Analysis: Not Good Enough</h1>
<p>Dr. Stroustrup points out that he’s been working very hard on improving
memory safety in C++, for a very long time:</p>
<blockquote>
<p>After all, I have worked for decades to make it possible to write better,
safer, and more efficient C++. In particular, the work on the C++ Core
Guidelines specifically aims at delivering statically guaranteed type-safe
and resource-safe C++ for people who need that without disrupting code
bases that can manage without such strong guarantees or introducing
additional tool chains.</p>
</blockquote>
<p>Unfortunately, it’s not done. The key word here is, of course, “aims.” The
next sentences admit that this feature is not in fact available:</p>
<blockquote>
<p>For example, the Microsoft Visual Studio analyzer and its memory-safety
profile deliver much of the CG support today and any good static analyzer
(e.g., Clang tidy, that has some CG support) could be made to completely
deliver those guarantees….</p>
</blockquote>
<p>For memory safety, “much of” is not really good enough, and “could
be made” is practically worthless. Fundamentally, the point is that
memory safety in C++ is a project being actively worked on, and close
to existing. Meanwhile, Rust (and Swift, C#, Java, and others) already
implements memory safety.</p>
<p>It’s worse than that, though. What Dr. Stroustrup is trying to downplay
is that this involves using static analyzers, considered separate from
the programming language, something the NSA’s original article also
discusses. Theoretically, if a static analyzer could be used to guarantee
memory safety, that could be just as reliable as a programming language
that does it. An engineering team could have a policy that all code must
pass this static analysis before being put into production.</p>
<p>But unfortunately, human nature is more fickle than that. If it’s not
built into the programming language, it’s going to get skipped. If a
vendor says their software is written in C++, or if an engineer takes a
job in C++, how will they know that these static analyzers will in fact
be used? A programming language that takes memory safety seriously doesn’t
provide it as an optional add-on that most people will simply ignore.</p>
<h1 id="but-all-the-c-code">But All The C++ Code!</h1>
<p>The end of the last quote provides a common talking point in Rust vs
C++ arguments:</p>
<blockquote>
<p>[Static analyzers] could be made to completely deliver those guarantees
at a fraction of the cost of a change to a variety of novel “safe”
languages.</p>
</blockquote>
<p>Besides the laughably condescending matter of calling Java (which first
appeared in 1995), C# (first appeared in 2000), and Ruby (first appeared
in 1995) “novel,” this is a jab at a common trope that (some immature)
Rust programmers go around demanding that people rewrite their projects
in Rust (please don’t do this!), and an attack on the idea that all code
can be written in safe programming languages, given the large body of
existing work in unsafe programming languages.</p>
<p>This is a bit of a straw man in this context. The NSA article that
Stroustrup is responding to acknowledges that switching existing codebases
might be expensive, even prohibitively so, saying:</p>
<blockquote>
<p>It is not trivial to shift a mature software development infrastructure
from one computer language to another. Skilled programmers need to be
trained in a new language and there is an efficiency hit when using a
new language. Programmers must endure a learning curve and work their
way through any “newbie” mistakes. While another approach is to
hire programmers skilled in a memory safe language, they too will have
their own learning curve for understanding the existing code base and
the domain in which the software will function.</p>
</blockquote>
<p>It then follows this up immediately with an explanation of how tools
like static analyzers can be used as a back-up plan for improving
memory safety in memory unsafe programming languages – exactly what
Dr. Stroustrup discusses. He’s criticizing this NSA document, implying
it is not thinking “seriously,” while fundamentally making a point
that they already made for him.</p>
<p>Of course, this is a terrible endorsement of C++. It’s far from ideal
to have to use add-on tools to work around a language’s flaws. Coming
from Dr. Stroustrup, it reads more like a brag that his programming
language has locked everyone in than a defense of why C++ is good.
Or else, it’s an admission that other programming languages should
be used for new projects, and that C++’s fate is now to gradually
fade like the elves from Middle Earth.</p>
<p>But he’s also overstating his case. As I mentioned before, safe programming
languages have existed for a long time. Many programming projects that
in the early 90’s would have been done in C or C++ <em>have</em> in fact been
done in safe programming languages instead, and according to the NSA’s
recommendation, that was a good idea. As computers have gotten faster
and programming language technology has improved, there have been
fewer and fewer reasons to settle for languages like C or C++ that
don’t have memory safety as a feature.</p>
<p>When I was a professional C++ programmer as early as 2013, some people
– even some programmers – already thought that C++ was a legacy
programming language like COBOL or Fortran. And outside of narrow
niches like systems programming (e.g. web browsers, operating systems,
and lower-level libraries), video games, or high performance programming,
it kind of has become one. The former application niches of C++ have
been taken over by Java and C#, or more recently by Go. If you have an
application program written in C++, chances are that it’s a relatively
old codebase, or written at a shop that has reasons to write a lot of C++
(such as a high-frequency trading firm).</p>
<p>Now, even C++’s systems niche is under threat, with Rust, a powerful
memory-safe programming language that avoids many of C++’s problems. Now,
even the niches where C++ isn’t at all “legacy” have a viable, memory-safe
alternative without a lot of the technical debt that C++ has. Rust is
even allowed in the Linux kernel, a project that has only previously
accepted C, and whose chief maintainer has always <a href="http://harmful.cat-v.org/software/c++/linus">explicitly hated
C++</a>.</p>
<h1 id="a-memory-safe-c">A Memory-Safe C++</h1>
<p>Fortunately, after all of these ill-thought-out, tired talking points,
Dr. Stroustrup subtly changes his perspective. After his distractions,
after bashing memory safe programming languages as “novel,” bragging about
how C++ is too entrenched to be removable, pretending memory safety is
just one of many equally important safety issues, and promising optional
add-on tools that will eventually be standardized, he finally begins to
tackle the question of how C++ could be made memory safe, in an opt-in
fashion:</p>
<blockquote>
<p>There is not just one definition of “safety”, and we can achieve a
variety of kinds of safety through a combination of programming styles,
support libraries, and enforcement through static analysis. P2410r0
gives a brief summary of the approach. I envision compiler options
and code annotations for requesting rules to be enforced. The most
obvious would be to request guaranteed full type-and-resource safety.
P2687R0 is a start on how the standard can support this, R1 will be more
specific. Naturally, comments and suggestions are most welcome.</p>
<p>…</p>
<p>For example, in application domains where performance is the main
concern, the P2687R0 approach lets you apply the safety guarantees
only where required and use your favorite tuning techniques where
needed. Partial adoption of some of the rules (e.g., rules for
range checking and initialization) is likely to be important. Gradual
adoption of safety rules and adoption of differing safety rules will be
important. If for no other reason than the billions of lines of C++ code
will not magically disappear, and even “safe” code (in any language)
will have to call traditional C or C++ code or be called by traditional
code that does not offer specific safety guarantees.</p>
</blockquote>
<p>This is a lot closer to what the NSA document actually specifies for
memory safe programming languages than he gives the document credit
for. For example, the document already provides for opting out of memory
safety via annotation, paired with an observation that that will
focus scrutiny on the code that opts out.</p>
<p>Dr. Stroustrup did not need to criticize the document for not thinking
“seriously” to reach this conclusion. He could simply have acknowledged
that C++ is not a memory safe programming language yet, but that,
based on his work, it might soon become one. Maybe the next version
of the NSA document will endorse using C++, but only if it’s C++<em>ZZ</em> –
where <em>ZZ</em> is some future version of the C++ standard.</p>
<p>I’m glad comments and suggestions are welcome, however, because
I have a huge one.</p>
<p>Opt-in for memory safety is unacceptable, and is almost as bad as having
a separate static analysis tool to enforce safety. Opt-out is fine –
Rust has a way to opt out of memory safety with the <code>unsafe</code> keyword, and
this concept is discussed and defended in the NSA’s original document. But
the default should be to enforce memory safety unless otherwise specified.</p>
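<p>As a small illustration of what opt-out looks like in Rust today, consider the following sketch. The function names are hypothetical and chosen only for this example:</p>

```rust
/// Default behavior: every access is checked, so an out-of-range index
/// yields `None` instead of reading past the end of the buffer.
fn checked_read(buffer: &[u8], index: usize) -> Option<u8> {
    buffer.get(index).copied()
}

/// Opting out: creating a raw pointer is allowed in safe code, but
/// dereferencing one is rejected by the compiler unless it is wrapped
/// in an `unsafe` block, which marks the spot for extra review.
fn read_via_raw_pointer(value: &u8) -> u8 {
    let pointer: *const u8 = value;
    // SAFETY: `pointer` was just derived from a live reference.
    unsafe { *pointer }
}
```

<p>Removing the <code>unsafe { ... }</code> wrapper from the second function turns it into a compile error rather than a warning. That is the sense in which safety is the default and unsafety is something you have to ask for.</p>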
<p>For C++, this means that if these safety features are added in C++<em>ZZ</em>,
<code>--std=c++ZZ</code> should cause unsafe constructs to be rejected – and the
C++ standard should require that these constructs be rejected for an
implementation to be a conforming implementation of C++<em>ZZ</em>. Perhaps (but
only perhaps) other command line arguments could be added to override
this constraint on a file-by-file basis. Ideally, a new compiler command
(e.g. <code>g++ZZ</code>) should be created for each implementation that defaults
to this stricter behavior.</p>
<p>Parts of the codebase that use legacy features should have to
have at least a file-level annotation that that file is a legacy file –
and then this annotation could gradually be moved to the function level.
As a side benefit, this could also be used to phase out and deprecate
weird points of C++ syntax, similar to the Rust edition system: Anyone
using, for example, 0 literals to mean <code>nullptr</code> would have to declare
some sort of a legacy annotation on their file or in their build system.</p>
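<p>Rust’s lint levels give a rough idea of how such scoped annotations can work, although this is only an analogy and not a proposal for C++ syntax. In the hypothetical sketch below, the module name <code>strict</code> and its functions are invented for illustration; <code>unsafe_code</code> is an existing rustc lint. The module rejects unsafe code by default, and a single function carries the annotation that re-allows it:</p>

```rust
// Everything inside this module is compiled with unsafe code denied.
#[deny(unsafe_code)]
mod strict {
    /// Plain safe code: compiles under the module-wide rule.
    pub fn sum(values: &[u32]) -> u32 {
        values.iter().sum()
    }

    /// A "legacy" function: the function-level annotation overrides the
    /// module-level rule, so the exception is visible at this exact spot.
    #[allow(unsafe_code)]
    pub fn legacy_first(values: &[u32]) -> Option<u32> {
        if values.is_empty() {
            return None;
        }
        // SAFETY: the slice is non-empty, so its pointer refers to a valid element.
        Some(unsafe { *values.as_ptr() })
    }
}
```

<p>Deleting the <code>#[allow(unsafe_code)]</code> line makes <code>legacy_first</code> fail to compile, so the exemption cannot silently spread to the rest of the module.</p>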
<p>Only with this sort of opt-out memory-safety system would I consider
C++ a memory safe programming language. I’d be very happy to see a
memory-safe C++. I earnestly hope Dr. Stroustrup is successful in his
endeavors. I’m not holding my breath, though, and in the meantime,
I will continue to use other programming languages that are already
memory-safe for my new projects, as will the majority of programmers.</p>
<p>In the meantime, it is unfair for Dr. Stroustrup to call safe programming
languages novelties or to pretend that C++ isn’t already far behind the
times on this. This was already an important criticism of C++ decades ago,
when Java first came out in the 90’s and was referred to as a “managed
programming language.” This was discussed in detail in my classes when
I was a college student in the late aughts. To read Dr. Stroustrup’s writing,
C++ is being criticized by “novel” upstarts when it is well on its way to
getting the feature, but in actuality, the time to act was 1996.</p>
Complexities of Defining ADHD
https://www.thecodedmessage.com/posts/adhd-philosophy/
2023-01-18T00:00:00+00:00
<p>ADHD is a controversial topic, and it’s never been more
relevant. Diagnoses are soaring right now, driven up by a variety of
interacting forces. Open discussion about ADHD – and the related general
concept of “neurodiversity” – has been exploding on the Internet. And
recently, there’s been a very unfortunate Adderall shortage.</p>
<p>So I wanted to take an opportunity to share some thoughts about it.
I would say that I was taking this opportunity to clear things up,
but unfortunately, that might not be possible. The reality is a really
muddy situation, and many people’s mental models – including many
professionals’ – are oversimplifications.</p>
<p>This is unfortunate because ADHD is an important issue, not
just in childhood, but in adulthood as well. It is prevalent:
according to one study, it affects <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2859678/">4.4% of adults in the
US</a>,
and according to another, <a href="https://pubmed.ncbi.nlm.nih.gov/27866355/">2.8% of adults
globally</a>
(numbers can vary greatly, for reasons we’ll discuss). ADHD
can, especially if untreated, cause severe adverse
life outcomes, including up to a <a href="https://www.additudemag.com/adhd-life-expectancy-russell-barkley/">13 year decrease in life
expectancy</a>.
Treatment for ADHD – especially stimulant medications – is very
effective, and access to it is an urgent matter for those who need it.</p>
<blockquote>
<p><strong>Aside: ADHD: A Misnomer</strong></p>
<p>There are a lot of misconceptions about it that cause people to think
ADHD is less severe than it actually is, many to do with the name.
ADHD, or Attention Deficit Hyperactivity Disorder, is named for the two
traits that bother parents and teachers the most when they manifest in
children: inattention and hyperactivity. While they are important in a
classroom or disciplinary setting, they are not the actual core symptoms,
or the symptoms that cause people with ADHD – especially adults but
also children – the most trouble. And I will focus on adults with ADHD
in this post, because I am an adult with ADHD.</p>
<p>So what are the actual core symptoms?
Dr. Russell Barkley, one of the leading experts on ADHD, considers
ADHD to be a misnomer. He summarizes it instead as an “Executive
Function Deficit Disorder” because its core symptom is difficulty with
executive functions, which he lists and explains in more detail in <a href="https://www.additudemag.com/7-executive-function-deficits-linked-to-adhd/">this
article</a>,
essential reading for understanding ADHD better.</p>
<p>In a terminological distinction of questionable value, ADHD is considered
a <em>neurodevelopmental disorder</em> like dyslexia or autism, which are
considered distinct from a <em>mental illness</em> like anxiety or depression.
Disorders of both categories are documented in the DSM, or <em>Diagnostic
and Statistical Manual of Mental Disorders</em>. ADHD, while not itself
considered a form of mental illness, does lead to an increased likelihood
of developing a mental illness.</p>
</blockquote>
<p>ADHD is a serious and relatively prevalent condition in adults, so it’s
fortunate that such effective treatments exist, and that it has been
able to be studied as well as it has been. Unfortunately, its causes
are poorly understood, and even defining what ADHD is or what it means
for someone to have ADHD can be surprisingly difficult.</p>
<p>In this post, I intend to explain why ADHD is so difficult to define,
and explore some of the consequences of that difficulty.</p>
<h1 id="competing-approaches-to-defining-adhd">Competing Approaches to Defining ADHD</h1>
<p>Let me start with an example: Trauma, especially Complex Post-Traumatic
Stress Disorder (CPTSD), can have a lot of the same symptoms as ADHD. It
can cause difficulty with executive function, which is the core symptom
of ADHD. Specifically, it can cause restlessness, as well as trouble
staying on task and keeping track of responsibilities and physical
objects – all classic ADHD symptoms.</p>
<p>But how to think of that is something of a philosophical question: Is
ADHD a pattern of symptoms with common coping skills and treatments?
In that case, we could say that CPTSD can cause ADHD. Or is ADHD an
attempt to figure out an underlying specific brain disorder? In that
case, we wouldn’t want to say that trauma “causes ADHD.”</p>
<p>This question comes up surprisingly often; this connection between CPTSD
and ADHD is just one example. It is a surprisingly nuanced question, and
I’ve seen ADHD (and its connection to CPTSD) framed both ways by reliable
sources. I don’t think it necessarily has a clear answer. At a certain
point, it can feel like “arguing over semantics.” But it is important,
because we need some way of categorizing and discussing people’s brains,
if only to provide treatment.</p>
<p>In practice, the answer may depend on context. For a therapist teaching
coping skills, it might be easier to think about it as “trauma causes
ADHD,” and then teach the ADHD coping skills. For a psychiatrist, the
underlying causes may (or may not) be more relevant, depending on how
much it influences the effectiveness of various medications; treating
the CPTSD with typical CPTSD medications (such as anti-depressants or
mood stabilizers) might (or might not) be a better way of treating
the ADHD-like symptoms, rather than prescribing an ADHD medication
like Adderall.</p>
<p>Intuitively, it seems obvious: Split them up. We like to think of ADHD as
a neat and tidy disorder, one that you’re born with, one that’s genetic.
CPTSD is acquired, and has drastically different causes than a typical
ADHD case. It seems obvious that different causes should mean different
disorders. And if some of the same techniques are helpful, therapists can
think of the ADHD traits caused by CPTSD as just that: “ADHD traits.” And
if some of the same medications are helpful, we can just say something
along the lines of “in some cases ADHD medications can help with CPTSD.”</p>
<p>But it’s harder than you might think to fully avoid basing the
definition on the symptoms. For all the definitions in the world,
in practice “people with ADHD” means “people diagnosed with ADHD” –
and ADHD is diagnosed based on the symptoms. While research into the
causes and underlying neurological mechanisms has made great strides,
the best diagnostic tools we have don’t involve brain scans or genetic
tests. Instead, you have to use some combination of surveying and
interviewing the patient, surveying or interviewing people the patient
knows, or doing cognitive tests to see if the patient is in fact impaired
in those areas of cognition that ADHD makes more difficult.</p>
<p>All of these involve investigating symptoms, not causes. And ADHD
diagnosis also requires that these symptoms actually cause problems.
To quote the standard DSM (<em>The Diagnostic and Statistical Manual of Mental
Disorders</em>), an ADHD diagnosis has this absolute criterion:</p>
<blockquote>
<p>D. There is clear evidence that the symptoms interfere with, or reduce
the quality of, social, academic, or occupational functioning.</p>
</blockquote>
<p>This all paints a picture of a definition – or at least a diagnostic
process – based on the symptoms. And yet, the DSM also includes
a criterion that points in the direction of ADHD being a discrete
disorder, rather than a collection of symptoms:</p>
<blockquote>
<p>E. The symptoms do not occur exclusively during the course of
schizophrenia or another psychotic disorder and are not better explained
by another mental disorder (e.g., mood disorder, anxiety disorder,
dissociative disorder, personality disorder, substance intoxication
or withdrawal).</p>
</blockquote>
<p>This would be more straightforward if ADHD had a known,
specific cause. If there were a single known mutation
that caused ADHD – like the ones known to cause <a href="https://en.wikipedia.org/wiki/Down_syndrome">Down
syndrome</a> or <a href="https://en.wikipedia.org/wiki/Fragile_X_syndrome">Fragile X
syndrome</a> – it would
be clear: you would either have ADHD or not, based on whether you
had that genetic abnormality.</p>
<p>But though there has been some research on the genetics of ADHD, it
is far from definitive or conclusive. In fact, ADHD isn’t even 100%
determined by genetics. Instead, it has an estimated <a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6477889/">heritability of
77-88%</a>, which is
an estimate of how much of the variation in ADHD across a population is
attributable to genetic differences; it is typically derived from how often
identical twins, and other relatives, of a person with ADHD also have it.</p>
<p>And to be clear, heritability is difficult to study (and thus the wide
range): It’s hard to separate genetic factors from environmental ones,
when you necessarily have to consider people who are blood-related
to each other. Furthermore, this doesn’t mean that one “ADHD gene”
represents all of this heritability – there likely are many different
genes lumped in there, all causing (potentially different flavors of)
ADHD, with different levels of heritability that then work out to a
weighted average of 77-88%. And sometimes, related people might both
have ADHD traits by coincidence, and that’s also hard to control for:
How easy is it for a study to verify that the specific executive function
deficits experienced by these relatives are similar?</p>
<p>Non-genetic risk factors also abound: people born
prematurely are <a href="https://pubmed.ncbi.nlm.nih.gov/21289534/">two to three times as likely to develop
ADHD</a>. Modern ADHD research
got its start with yet another risk factor: children recovering from
Spanish Flu started having drastic behavioral shifts, which were then
called “Minimal Brain Damage” or “Minimal Brain Dysfunction,” a label that
was ultimately renamed ADHD. This has <a href="https://www.hcplive.com/view/adhd-focus-concerns-covid-19-infection">come to more public attention
recently</a>
due to concerns with long COVID. (That article contains quotes from
Russell Barkley, the leading expert on ADHD mentioned above when
discussing executive functions.)</p>
<p>Given this diversity of causes, and our substantial but still incomplete
understanding of the neurological mechanisms, do we have the knowledge
necessary to think of ADHD as a discrete disorder, rather than
a list of symptoms that tend to correlate with each other? I am far
from the first person to consider this; some have even gone as far
as to propose that the word “disorder” <a href="https://www.scientificamerican.com/article/we-need-to-rename-adhd/">be removed from the acronym as
misleading</a>.</p>
<p>So, returning to the CPTSD example, perhaps it makes more
sense to add one more possible cause. Perhaps saying that CPTSD can
cause ADHD traits is equivalent to saying that CPTSD causes ADHD,
because “ADHD traits” is all the traction we have on this disorder.</p>
<blockquote>
<p><strong>Aside: Obligatory Caveat</strong></p>
<p>I say “perhaps” for a reason. One reason not to would be if there
were discernible differences, especially in treatment, between ADHD as
caused by CPTSD, and other cases: for example, differences in effective
medication.</p>
<p>And about medication I offer no opinion. I’m not a
psychiatrist, nor am I any sort of expert on CPTSD – that is
not the cause of my ADHD, personally. It’s a very complicated
issue, especially because <a href="https://www.additudemag.com/adhd-ptsd-fear-circuit-deficits/">the causal arrow might point both
directions</a>,
which is to say that recent studies have shown that not only can CPTSD
cause ADHD traits, but ADHD is a risk factor in developing CPTSD. For
any individual case, it may not be clear which came first.</p>
<p>As you can see, ADHD is not a simple, tidy disorder at all (and neither
is CPTSD) and that, more than any particular position, is what I want
you as a reader to take away from this.</p>
</blockquote>
<p>Of course, we might someday discover a crystal-clear ADHD gene, or perhaps
a couple of them. This would mean essentially that we were discovering new
disorders, disorders that were previously all lumped under the umbrella
label of ADHD. Once this happens, we’d have to decide as a society what
to do to realign our labels.</p>
<p>And how to realign the labels will depend on the nature of the discovery.
Perhaps one gene would cover the majority of people with ADHD,
and therefore it might keep the name, with a new, objective, genetic
test. The others would be considered “ADHD-like.” Or perhaps, a smaller
group of patients would be covered under a narrower disorder, and then we
would say things like “this used to be considered a type of ADHD, but
now we know better.”</p>
<p>This may seem far-fetched, but it has already happened with
autism. Similarly to ADHD, autism is primarily diagnosed based on
symptoms. But there are genetic disorders, like Fragile X and especially
<a href="https://en.wikipedia.org/wiki/Rett_syndrome">Rett syndrome</a>, with
substantial overlap in symptoms with autism. Rett syndrome in particular
used to be categorized alongside autism in the DSM, as a “pervasive
developmental disorder” alongside Asperger syndrome and autism proper –
basically as one of several parts of the “autism spectrum.” But when
its genetic and neurological mechanisms were discovered, it was removed
from that section, and from the DSM entirely.</p>
<p>Perhaps, as more and more discrete causes of autism are discovered, this
will happen more and more. Autism is currently a very large umbrella,
appropriately termed a “spectrum,” covering profoundly disabled adults
who cannot take care of themselves, and mostly functional adults who
simply exhibit some levels of social and executive function difficulty.</p>
<blockquote>
<p><strong>Aside: Are disorders with multiple causes “real”?</strong></p>
<p>One popular conclusion to draw from the muddied and ill-understood
causes of ADHD is that ADHD is not real. One example of this is the
fringe book <em>ADHD Does Not Exist</em> by Richard Saul, a neurologist who
wrote this book with no backing from wider research, and who is not
widely recognized as legitimate among ADHD experts. Nevertheless, the
book is popular in some circles, and Richard Saul has gotten traction
with some parents and teachers (and unfortunately even some doctors),
and even wrote an opinion piece in <em>Time Magazine</em>.</p>
<p>In the book’s blurb, it says that “ADHD is actually a cluster of
symptoms stemming from over 20 other conditions or disorders” – a
statement that may be tempting to believe, given what I’ve said above,
but is ultimately deeply misleading. So I thought I’d spend some time
picking this argument apart.</p>
<p>First, we don’t know a complete list of what disorders can “cause ADHD,”
but there’s lots of evidence that ADHD is primarily genetic. Whatever
other problems the gene(s) involved may cause, and whatever shifts
in categorization may be brought about by further research, there
are definitely genes that do cause ADHD symptoms.</p>
<p>There are also definitely many people whose primary symptoms, the ones that
rise to the level of needing psychiatric care, are exactly that set of
symptoms, the typical ADHD. Whether this set of symptoms is caused by one
gene or many, and whether it is caused by genes alone or a combination
of genes and environment, or even sometimes by environment alone, it is
a real occurrence and a real problem that often occurs on its own.</p>
<p>That is enough to make ADHD exist.</p>
<p>But more importantly, this set of symptoms, whatever genes mediate
them and whatever variety of causes they have, can be extremely
debilitating. People need treatments for it now, and it has well-proven,
well-studied, extremely effective treatments, especially medicinal
ones. Even if ADHD were primarily caused by other identifiable psychiatric
disorders, that would not mean setting aside ADHD medications. They would
still be effective for all the people they’re currently effective for.</p>
<p>If anything, this perspective should make us study expanding the use of
ADHD treatments and medications to situations where the symptoms can be
said to be caused by other disorders, rather than give up on them for
everyone in hopes of finding the “proper” treatment for the “underlying”
disorder for every individual.</p>
<p>Of course, the popularity of this book and its flawed line of thinking
is easy to explain: Many people have already made up their minds that
ADHD medications are problematic, and are looking for any excuse to get
rid of them. Motivated reasoning abounds.</p>
</blockquote>
<h1 id="what-to-do-with-the-connection-between-adhd-and-autism">What to do with the connection between ADHD and autism?</h1>
<p>I want to return to the topic of autism. As I mentioned, autism often comes
with some level of deficit in executive function. That throws a wrinkle
into the definition of ADHD, because a deficit of executive function is
the core symptom of ADHD, the summary or cause of all the other symptoms.</p>
<p>Given this, it’s not surprising that many children and adults seem
eligible for both diagnoses. In the past, following a model of ADHD where
it was considered a discrete disorder with its own particular causes
(even though we don’t understand them), practitioners had to choose one. A
diagnosis of autism (or the then-separate diagnosis of Asperger syndrome)
could explain any and all ADHD symptoms, so if both sets of diagnostic
criteria were met, Asperger syndrome was the one chosen.</p>
<p>But that was changed in the most recent edition of the DSM, the
DSM-5, so now both diagnoses are possible in the same person. This has
been a great step forward pragmatically: It has allowed children and
adults who exhibit traits from both disorders to get access to better
treatment, especially stimulant ADHD medications, which are among the
most consistently effective psychiatric treatments modern medicine has
ever developed.</p>
<p>But this has also led to some surprising, and philosophically challenging,
results. Now that it’s possible for a person to have both diagnoses,
we have found a huge amount of comorbidity, which is the co-occurrence of
two disorders in the same person – an amount of comorbidity that leads to questions about
whether we’re categorizing these disorders correctly.</p>
<p>According to a
<a href="https://www.sciencedirect.com/science/article/pii/S1750946721000349?casa_token=oj7BsJQIlh4AAAAA:faivFX3KtMAdtz0eKpRhwsHncXD-gvsFWiWmBTNx81U2R_pEQIO-grbuLaGI_zwzI-CuXzIzrQ">meta-analysis</a>,
50-70% of those with a diagnosis of ASD (autism spectrum disorder)
also meet the criteria for a diagnosis of ADHD. Many experts believe
the number should be even higher. Some even believe that all autism
cases cause executive dysfunction and therefore can be expected to lead
to ADHD symptoms – and that therefore we should no longer allow concurrent
diagnoses.</p>
<p>In the other direction, we necessarily see lower numbers, because ASD
is less common than ADHD. Still, around 20%-30% of people with ADHD are
diagnosable with autism, especially if it is specifically screened for.
Given that ADHD is about 2 to 2.5 times as prevalent as autism (depending
on the studies used), these are the numbers we’d expect mathematically. The
connection may be even stronger if we consider the prevalence of specific
autism symptoms, such as sensory sensitivity, in cases where a full
autism diagnosis isn’t indicated.</p>
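<p>For readers who want to check that arithmetic, here is a minimal sketch in
Python. It assumes only Bayes’ rule and the ranges quoted above; the function
and variable names are mine, purely for illustration, and the inputs are the
rough figures from the text rather than precise estimates.</p>

```python
# Rough check of the reverse comorbidity figure using Bayes' rule:
#   P(ASD | ADHD) = P(ADHD | ASD) * P(ASD) / P(ADHD)
# Only the ratio P(ADHD) / P(ASD) is needed, not the absolute prevalences.


def share_of_adhd_with_asd(share_of_asd_with_adhd, adhd_to_asd_ratio):
    """Fraction of people with ADHD who would also be diagnosable with ASD."""
    return share_of_asd_with_adhd / adhd_to_asd_ratio


# Ranges quoted in the text (illustrative, not precise estimates).
asd_with_adhd = (0.50, 0.70)    # 50-70% of people with ASD also have ADHD
prevalence_ratio = (2.0, 2.5)   # ADHD is 2 to 2.5 times as prevalent as ASD

# Try every combination of the endpoints to get the widest plausible range.
estimates = [
    share_of_adhd_with_asd(share, ratio)
    for share in asd_with_adhd
    for ratio in prevalence_ratio
]
low, high = min(estimates), max(estimates)
print(f"{low:.0%} to {high:.0%}")
```

<p>With these inputs the result runs from 20% to 35%, which brackets the
20%-30% figure cited above.</p>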
<p>So what should we do with this? Given that both ADHD and autism have
unclear and diverse causes, we treat them in practice, if not always
in theory, as correlated symptoms or traits. But if they’re also
correlated with each other, as they seem to be, then what basis do we
have for separating them? Should we merge them into one disorder? Given
the relative prevalences, should we consider autism to be a more severe
form of ADHD? A more narrowly defined subset of it?</p>
<p>If we were to combine them, it wouldn’t be the first time two disorders
were merged – even ones that might seem drastically different from the
outside. The DSM-5 merged autism and Asperger syndrome into one diagnosis,
“autism spectrum disorder.” And ADHD used to be considered distinct from
the non-hyperactive ADD, but now it is just one disorder, ADHD, which can
then be subdivided into hyperactive, inattentive, or combined “presentations.”</p>
<p>Many on social media have already made up their minds, and rushed ahead
of the experts. One particular Instagram post asked, “Is ADHD on the
autism spectrum?” In spite of the stereotype that all articles titled
with a question can be summarized as “no,” the linked article gave an
enthusiastic, almost gleeful “yes.”</p>
<p>I commented that if ADHD and autism are connected, there might be a
better way to express this connection than saying “ADHD is on the autism
spectrum.” In fact, given that ADHD is the more common diagnosis, perhaps
it would be more accurate to say that autism is on the ADHD spectrum –
and probably less stigmatized at that. For this, I was yelled at for being
an ableist. Ah, the folly of writing things on the Internet (says Jimmy,
while writing a blog post on the Internet).</p>
<p>Less controversially, many have adopted the term <em>neurodivergent</em> as a
<em>de facto</em> umbrella term for autism and ADHD – and other disorders,
like CPTSD, that share traits with them. This term originated from
autism advocacy, to shift from a model where such disorders are
treated as pathologies to a model where they are treated as differences,
fully natural and possibly even beneficial.</p>
<p>Theoretically, the term <em>neurodivergent</em> is meant to include anyone
whose brain substantially differs from the brains of average – or
<em>neurotypical</em> – people. If this theoretical definition is given
credence, especially when coupled with an insistence that it doesn’t
have to refer to a pathology, it can become dizzyingly broad almost
to the point of meaninglessness. Are left-handed people neurodivergent?
Are people with anxiety? Are people with extraordinary talents, even
when not coupled with any symptoms of any recognized disorder? If
the definition is broad enough, then there won’t be a substantial
number of neurotypical people left! Does that make the term meaningless?
Or is that, in fact, the point?</p>
<p>But in my experience, the term primarily seems to be used to describe the
nebulous space of traits with substantial overlap with autism and ADHD –
such as autism and ADHD themselves, and disorders with significant symptom
overlap, like sensory integration disorder (SID), and, of course, CPTSD.</p>
<p>This serves a practical purpose: It allows people to share advice, common
experiences, and coping mechanisms without going to the trouble
of working out which specific diagnosis they apply to. And while
sometimes there are glitches (such as universal human experiences being
depicted as “neurodiverse” experiences), overall, this is a helpful thing.</p>
<p>But while the Internet has addressed the terminology problem appropriately
for the goals of sharing empathy and coping skills, professionals still
have to deal with the panoply of diagnoses. For them, there are many
practical questions to wrestle with:</p>
<ul>
<li>Should a person with both ADHD and autism traits be diagnosed with both?</li>
<li>Should they be diagnosed with autism only, for philosophical reasons,
as was required in the ’90s under the DSM-IV, even if the ADHD traits are
the ones that actually cause them the most trouble?</li>
<li>Is autism a term for a particular sub-type of ADHD, and should it
automatically come with an ADHD diagnosis?</li>
<li>Should ADHD interventions and medications be tried more often for
those whose diagnosis is just autism?</li>
<li>Should there be more mechanisms available for people to switch
from an autism diagnosis to an ADHD diagnosis, or vice versa?</li>
<li>If so, how can these mechanisms be made available to children who are not
capable of effective self-advocacy?</li>
</ul>
<h1 id="adhd-and-autism-as-spectrums">ADHD and autism as spectrums</h1>
<blockquote>
<p><strong>Aside: Plural Forms</strong></p>
<p>I expect this article to have many neurodivergent readers,
and we tend to be a pedantic lot, so I want to clarify something
even though it’s objectively unimportant:</p>
<p>I was really tempted to write “spectra” instead of “spectrums”
above, but as we’re discussing the metaphorical concept of a spectrum,
and not the physics concept, I thought it would be unnecessarily
confusing. The dictionary accepts the regular English plural in addition
to the Latinate one, and that is the plural I have decided to
adopt for this article.</p>
<p>After all, there’s no way that it would be appropriate to pluralize
“stigma,” when used as a mental health and disability rights term,
as “stigmata,” the classical Greek plural of that word.</p>
</blockquote>
<p>This is made even more complicated by the fact that as with autism,
ADHD traits come on a spectrum. While ADHD on its own normally
doesn’t cause the types of profound disability associated with
severe autism, it can cause serious struggles and suffering.
Part of the stigma of ADHD is that it is not taken seriously as a deeply
disabling condition, which it very much can be. Everyone knows someone
who has it, but who manages it successfully with coping skills and/or
medication. Everyone knows someone for whom it is – given medication
or coping skills – just a personality quirk. And people project that
understanding of it onto someone who has drastic problems functioning.</p>
<p>Meanwhile, mild autism is treated as a catastrophe, even when it’s
extremely mild, even when society would be better served by treating it
more like a quirk. Erring on the side of caution is still erring.</p>
<p>To paraphrase another Instagram meme that spoke to me greatly: How can
it be that ADHD and autism are such closely related disorders, but ADHD
is treated as a quirky personality trait and not taken seriously, and
autism is treated as the devil’s work that has to be eradicated?</p>
<p>Given this, if a person, especially a child, can be diagnosed
with both autism and ADHD, but the autism is mild and the ADHD is severe,
ADHD may be the more appropriate diagnosis not for any objective reason,
but simply for the reason of avoiding the stronger stigma.</p>
<p>But given the subjective nature of diagnosis, and the fact that both
disorders are (in practice if not in theory) correlated bundles of
traits, it’s even worse than that. The autism spectrum (like the ADHD
spectrum) is normally considered to range from mild autism (or ADHD)
to severe autism (or ADHD). But is there any evidence of a solid cut-off?</p>
<p>For genetic disorders, you typically either have it or you don’t. But for
disorders that have a variety of causes, many of them unknown (even if
many are heritable), that are on a spectrum of severity, there’s also the
possibility of almost having the disorder. There are people out there who
almost have ADHD, or almost have autism.</p>
<p>There are lots of people like this: People with “sub-clinical” ADHD
or autism, or with “some ADHD (or autism) traits.” To analogize to a
different field of medicine, they are the neurodevelopmental equivalent
of people who have to squint a little more than average to read things
far away, but don’t actually need glasses. Or people who have a little
trouble telling green apart from red, but can figure it out with mild
difficulty.</p>
<p>Perhaps such a person is one criterion short of the DSM checklist. Or
perhaps they check all the boxes, but they’ve built a life for themselves
where it isn’t a problem, and they fail to check the all-important box
of experiencing significant “impairment in functioning.”</p>
<p>The spectrum of such a disorder extends from the most severe cases to
the mildest, yes, but it doesn’t stop there. It extends through these
sub-clinical cases, and beyond, to people with normal executive functioning
(in the ADHD cases), and then great executive functioning, and then
perhaps even to people who have opposite but equally dysfunctional
traits.</p>
<blockquote>
<p><strong>Aside: An Anti-ADHD?</strong></p>
<p>This was referenced in the <a href="https://www.hcplive.com/view/adhd-focus-concerns-covid-19-infection">article where Dr. Barkley was
interviewed</a>,
where he discussed a disorder he characterized as “the opposite” of
ADHD, but I suspect it’s more complicated than that.</p>
<p>In all honesty, from the brief description, it sounded like a blend
of inattentive ADHD and mild autism to me, but perhaps I didn’t understand
it correctly. I suspect some people with either or both of these diagnoses
will fall into this new diagnosis instead, if and when it becomes available.</p>
<p>This all goes to show how much we’re still learning about this topic.</p>
</blockquote>
<p>We use the term “on the spectrum” as a euphemism to mean someone who
has autism spectrum disorder, but the term is really a misnomer. I’m not
trying to change how we talk – I know that is beyond my power – but I do
believe, in a more literal sense, everyone is somewhere on the spectrum
of how much autism they have compared to the “average person.” This
is true even if the amount of autism they have is negligible or even
negative. And likewise with ADHD.</p>
<p>And you have to draw the line somewhere, but the line can move. If it’s
a normal distribution, which is the most normal type of distribution to
occur in nature (thus the name), most people over the line are going to
be close to the line. That is to say, not only will most (ADHD or autism)
cases be mild, but a substantial portion of cases will be marginal,
and will genuinely be a matter of opinion.</p>
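<p>To put a rough number on “close to the line,” here is a small sketch in
Python. It rests on assumptions that are mine and not established facts: that
the trait can be treated as a single normally distributed score, and that the
diagnostic line sits somewhere between 1 and 2 standard deviations above
average. The function name and the half-standard-deviation margin are
likewise just for illustration.</p>

```python
from statistics import NormalDist


def share_near_cutoff(cutoff_sd, margin_sd):
    """Among people past the cutoff, the fraction within margin_sd of it.

    Both arguments are measured in standard deviations of a standard
    normal distribution (mean 0, standard deviation 1).
    """
    standard = NormalDist()
    past_cutoff = 1 - standard.cdf(cutoff_sd)
    far_past = 1 - standard.cdf(cutoff_sd + margin_sd)
    return (past_cutoff - far_past) / past_cutoff


# Hypothetical cutoffs; nothing here says where the real line is drawn.
for cutoff in (1.0, 1.5, 2.0):
    share = share_near_cutoff(cutoff, 0.5)
    print(f"cutoff at {cutoff} SD: {share:.0%} within half an SD of the line")
```

<p>Under these assumptions, more than half of the people over the line sit
within half a standard deviation of it, and the share grows as the line
moves further out.</p>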
<p>You might assume that most people who have ADHD have clear-cut ADHD,
but that simply is untrue. Most people who have ADHD have mild ADHD, and
a substantial number barely have it. These people also need treatment,
because by definition, the line should be put so that people who are on
the ADHD side of it are impaired in functioning.</p>
<blockquote>
<p><strong>Aside: Are disorders on a spectrum “real”?</strong></p>
<p>Unfortunately, the idea that these disorders are defined
by symptoms and possibly on a spectrum that includes
neurotypical people leads some people to conclude that
therefore ADHD isn’t a “real” disorder, like in the provocative
title of <a href="https://www.psychologytoday.com/us/blog/finding-purpose/202101/is-adhd-real-disorder-or-one-end-normal-continuum">this article: “Is ADHD a Real Disorder or One End of a Normal
Continuum.”</a>
I wish I didn’t have to address this, but unfortunately, given the level
of ADHD denialism in society, which will take any excuse to deny the
reality and severity of ADHD, it’s important.</p>
<p>It is a false dichotomy to think that something can be a disorder,
or one end of a continuum, but not both. Being too far along on a
continuous spectrum can be a real medical problem. For some reason,
we have no trouble with the idea of considering “high blood pressure”
to be a disease, even though the cut-off for what’s considered high
is sometimes adjusted, and even though blood pressure readings form
a continuum. Similarly, we have no trouble taking diabetes seriously,
when it too is on a spectrum, and we even have names like “pre-diabetes”
for other ranges on the spectrum. Why we have difficulty applying similar
reasoning to neurodevelopmental disorders is beyond me.</p>
<p>ADHD is more complicated than these, because as I discuss, the line is
drawn not based on where it causes problems for the body, but where it
causes problems in context – and context changes. But that doesn’t
mean that it’s not real. Many real things are not clear-cut binaries
– few real things are clear-cut at all. ADHD, and ADHD diagnosis,
is complicated, not because ADHD is “not real,” but because it is real.</p>
</blockquote>
<h1 id="consequences">Consequences</h1>
<p>The fact that ADHD and autism are not clear-cut binaries does, however,
lead to a number of weird effects.</p>
<p>It explains why children who are young for their class are more likely
to be diagnosed with ADHD – something I am sure is true for autism
as well. They are, after all, more likely to experience “impairment
in functioning” because of their traits, because higher expectations
are placed on them in their context – and these impairments are more
likely to attract the attention of the adults in their life.</p>
<p>It partially explains why the number of cases fluctuates over time.
Rates of both autism and ADHD diagnosis are on the rise, and I’m sure
part of that is attributable to better screening and better access
to health care. But perhaps some of that is also attributable to more
demands placed on our executive function, and our social conformity.</p>
<p>My impression is that society has gotten more difficult for mildly
neurodivergent people over time. On the autism side, society has gotten
more and more complex, more ironic, less rule-driven, and more informal –
that is, there are more unwritten rules (that are changing faster than
ever between generations), and fewer explicit ones. On the ADHD side,
more and more distracting devices and social media apps degrade our
attention span, as we’re expected to navigate increasingly Kafkaesque
bureaucracies with less and less social support. I could write an entire
blog post on how society is getting less ADHD-friendly.</p>
<p>This “spectrum effect” almost certainly explains why many adults “grow
out of” their childhood ADHD – they didn’t actually grow out of it in
the sense that they’re now in a discretely different category. Rather,
what happened is that they matured and improved in absolute terms with
respect to their executive function (as everyone does), and also developed
coping mechanisms (as everyone does to make up for those situations where
their executive function doesn’t naturally reach the task at hand). In so
doing, they drifted over the line of diagnosability and clinicality, but
most such people are almost certainly still on the “ADHD” side of things.</p>
<p>Autism is seen as incurable out of recognition that you can’t ever
discretely jump categories. ADHD is seen as something you can grow out
of in recognition that you can drift over the line from disorder to
quirk. I believe from personal experience that this is a difference in
attitude rather than fact – that those formerly ADHD children who become
“neurotypical” adults are better described as no longer “clinically”
ADHD than no longer ADHD at all. And, likewise, there are likely plenty of
people whose childhood autism spectrum diagnosis (e.g. Asperger syndrome)
may well have been valid, but who as adults would never be able to be
diagnosed with it if evaluated from square one.</p>
<p>This explains at least part of why ADHD diagnoses went up so dramatically
during COVID – people’s coping mechanisms were shattered by the
restrictions and lock-downs, or by the stress and anxiety of avoiding
the disease, or by the political turmoil. I know my ADHD and anxiety
got much worse over COVID, so that I felt I’d lost 5 years of maturity
and emotional progress. I’m not surprised that it brought some people
over the line.</p>
<p>But given that ADHD and autism are so clearly connected, this has other
consequences as well. Severe or unmedicated ADHD can be as disabling
as mild autism, but in different ways. It does come with social difficulties, often
(but clearly not always) different from the autism ones. Can you tell
the difference between a deficit in social performance and a deficit in
social understanding, especially in a child? They raise many of the same
red flags.</p>
<p>This leads to the following odd effect: Severe ADHD can often look like
mild autism. I don’t mean just to the untrained eye; I mean also to
experienced professionals. And in many cases they do go together; severe
ADHD often comes with autism. But in some cases, severe ADHD gets mistaken
for autism when it is not autism, especially because people will assume
that ADHD is mostly relatively mild, and that therefore severe problems
with functioning must correspond to a more severe diagnosis.</p>
<p>In situations like this, if the autism traits are mild enough, the ADHD
will sometimes be the disorder that requires more treatment, or even the
only disorder that is severe enough to be clinical and require treatment
at all. But if it is diagnosed, autism is the disorder that causes more
concern, and gets more institutional attention.</p>
<p>Sometimes, this institutional attention is a good thing, and can be
used to get treatment that can then be tailored to the individual.
But sometimes, it results in ill-tailored, overdone treatment instead, and
all the stigma that comes with it. And of course, a lot of those more
extreme treatments are never appropriate for anyone: Even when extreme
interventions are in fact called for, not all extreme interventions
are created equal.</p>
<h1 id="thoughts-on-neurodiversity-culture-vs-medical-perspectives">Thoughts on “neurodiversity culture” vs medical perspectives</h1>
<p>I do not want to arrive at the conclusion that professionals should
untangle this by uncritically taking as fact everything neurodivergent
people say, especially on the Internet. Internet neurodiversity culture
has plenty of its own issues, and one of them is an insistence on
believing everyone’s experiences that has spilled over into believing
everyone’s conclusions, even if they’re questionable. Half-baked opinions
are asserted as gospel truth, to be dissented from only on pain of
extreme social censure – which is hard for people who struggle with
any of these disorders to deal with proportionately.</p>
<p>Self-diagnoses and peer diagnoses are common. This is understandable
because it helps people find coping mechanisms that are useful to them and
answer their questions. But it can also be problematic, because sometimes
important and useful treatments are missed. And people who have some ADHD
or autism traits – which absolutely everyone can show from time to time
– can trigger these informal diagnoses that are then also treated as
unquestionable dogmas. And, of course, perfectly universal experiences
are sometimes presented as signs of neurodivergence – sometimes because
neurodivergent people experience them moderately more often than average,
and sometimes just because it’s hard to tell subjectively what’s part
of your disorder and what’s just a part of normal life.</p>
<p>But I would ask professionals (and parents,
teachers, and loved ones – “hearts,” as <a href="https://www.youtube.com/channel/UC-nPM1_kSZf91ZGkcgy_95Q">How to
ADHD</a> calls
them) to take neurodiversity culture seriously, even if not always at face
value. Please listen, but with a grain of salt. It’s a complicated nuance,
and nuance is one of the hardest things a person can ever accomplish,
but I think it’s possible.</p>
<p>That includes this blog post – I hope that everyone reading this believes
my experiences (and most of my knowledge and opinions are very strongly
derived from extensive personal and vicarious experiences). I hope
that my readers take my arguments and reasoning seriously, because they
are greatly informed by both my experience and the huge amount of both
research and consideration I’ve poured into this topic – consideration,
again, heavily influenced by a deep familiarity with the facts on the
ground.</p>
<p>But that does not mean that I’m necessarily right about all of my
conclusions, even where I speak confidently. This is a complicated
issue – as I hope I have conveyed – so it’s hard for anyone
to be completely right about it. But also, this is not a professional
interest of mine. I have studied and contemplated this topic as thoroughly
as I have not because I have taken classes on it, or been naturally
interested in it (especially in a “hyperfocus” or “special interest”
kind of way – I could write an entire blog post about that terminology
as well), but because I have been repeatedly forced to by circumstances –
both mine, and those of other neurodivergent people in my life.</p>
<p>I know that detracts from my credibility in some ways, but I hope it
adds to it in others, and that people take me seriously even when I fail
to use the exact right terminology <em>du jour</em>, whether that be medical
terminology or cultural neurodiversity terminology.</p>
<h1 id="takeaways">Takeaways</h1>
<p>If there’s anything I’d ask people to take away from this, it’s that
neurodivergence is anything but simple and straightforward. Neither
autism nor ADHD is a discrete disorder with an objective test. The
way we organize symptoms and traits into diagnoses is arbitrary and
imperfect; we can only hope that it will improve over time.</p>
<p>That said, ADHD medication is extremely effective, and stimulant
medications are among the most effective and well-proven treatments
available. I personally take Strattera, a non-stimulant, and it has been
life-changing for me, addressing many issues that have caused me real
problems throughout my life.</p>
<p>We cannot just stop prescribing Adderall because ADHD is hard to define.
We can’t just wait until we’ve pinned down these definitions more
to treat it. Whether or not it is caused by one or many underlying
mental disorders or mental differences, ADHD is a label for very
serious symptoms, and it is only properly diagnosed when there is
an impairment in functioning – which there often is. It leads to
vastly worse life outcomes, worse career performance, more spending
(the “ADHD tax”), and in too many cases, poverty. As I mentioned in
the introduction, it is objectively linked to drastically <a href="https://www.additudemag.com/adhd-life-expectancy-russell-barkley/">lower life
expectancy</a>.
It is fundamentally mistaken to treat it as so categorically less severe
and serious than autism when it is so closely related – and when it is
so readily treatable with medication.</p>
Rust and Default Parameters
https://www.thecodedmessage.com/posts/default-params/
2023-01-11T00:00:00+00:00
<p>Rust doesn’t support default parameters in function signatures.
And unlike in many languages, there’s no way to simulate them
with function overloading. This is frustrating for many new Rustaceans
coming from other programming languages, so I want to explain
why this is actually a good thing, and how to use the <a href="https://doc.rust-lang.org/std/default/trait.Default.html"><code>Default</code>
trait</a>
and <a href="https://doc.rust-lang.org/book/ch05-01-defining-structs.html#creating-instances-from-other-instances-with-struct-update-syntax">struct update
syntax</a>
to achieve similar results.</p>
<p>Default parameters (and function overloading) are not part of
object-oriented programming, but they are a common feature of a lot
of the programming languages new Rustaceans are coming from. This
post therefore fits in some ways with my <a href="https://www.thecodedmessage.com/tags/beyond-oop/">on-going series on how Rust
is not object-oriented</a>, and so it is tagged with
that series. It was also inspired by Reddit responses to <a href="https://www.thecodedmessage.com/posts/oop-1-encapsulation">my first OOP
post</a>.</p>
<h1 id="how-default-parameters-work-in-eg-c">How Default Parameters Work (in e.g. C++)</h1>
<p>So before I talk about why Rust doesn’t have default parameters
and what you can do instead, let’s talk a bit about what default
parameters are and the situations in which they are useful.</p>
<p>Let’s say you have a function that takes many parameters, perhaps
(to take an example from the Reddit response) one that creates a window
in a GUI:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span>WindowHandle <span style="color:#a6e22e">createWindow</span>(<span style="color:#66d9ef">int</span> width, <span style="color:#66d9ef">int</span> height, <span style="color:#66d9ef">bool</span> visible);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">auto</span> handle <span style="color:#f92672">=</span> createWindow(<span style="color:#ae81ff">10</span>, <span style="color:#ae81ff">30</span>, false); <span style="color:#75715e">// Create invisible window
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">auto</span> handle2 <span style="color:#f92672">=</span> createWindow(<span style="color:#ae81ff">100</span>, <span style="color:#ae81ff">500</span>, true); <span style="color:#75715e">// Create visible window
</span></span></span></code></pre></div><p>Now, let’s say that you assume that most windows that are created
are intended to be visible, and you don’t want to burden the programmer
with having to specify whether the window is visible – or even think
about it explicitly – in that normal case. In a programming language
that supported default parameters, you could then provide
a default for <code>visible</code>.</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span>WindowHandle <span style="color:#a6e22e">createWindow</span>(<span style="color:#66d9ef">int</span> width, <span style="color:#66d9ef">int</span> height, <span style="color:#66d9ef">bool</span> visible <span style="color:#f92672">=</span> true);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">auto</span> handle <span style="color:#f92672">=</span> createWindow(<span style="color:#ae81ff">10</span>, <span style="color:#ae81ff">30</span>, false); <span style="color:#75715e">// Create invisible window!
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">auto</span> handle2 <span style="color:#f92672">=</span> createWindow(<span style="color:#ae81ff">100</span>, <span style="color:#ae81ff">500</span>, true); <span style="color:#75715e">// Create visible window!
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">auto</span> handle3 <span style="color:#f92672">=</span> createWindow(<span style="color:#ae81ff">100</span>, <span style="color:#ae81ff">500</span>); <span style="color:#75715e">// Also create visible window!
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">auto</span> handle4 <span style="color:#f92672">=</span> createWindow(<span style="color:#ae81ff">100</span>, <span style="color:#ae81ff">500</span>); <span style="color:#75715e">// Most of the time, that's what
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">auto</span> handle5 <span style="color:#f92672">=</span> createWindow(<span style="color:#ae81ff">100</span>, <span style="color:#ae81ff">500</span>); <span style="color:#75715e">// you want, so why have to say it?
</span></span></span></code></pre></div><p>Default parameters can also be simulated with function overloading
for programming languages where function overloading is available
but default parameters are not:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span>WindowHandle <span style="color:#a6e22e">createWindow</span>(<span style="color:#66d9ef">int</span> width, <span style="color:#66d9ef">int</span> height, <span style="color:#66d9ef">bool</span> visible);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>WindowHandle <span style="color:#a6e22e">createWindow</span>(<span style="color:#66d9ef">int</span> width, <span style="color:#66d9ef">int</span> height) {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> createWindow(width, height, true);
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Rust also does not have function overloading, and that’s a much more
complicated issue, but many of the same arguments apply to this idiom.</p>
<h2 id="benefits-and-detriments-of-default-parameters">Benefits (and Detriments) of Default Parameters</h2>
<p>Defaults are good, and default parameters in this style are
one way to implement them and reap their benefits.</p>
<p>Defaults are good because they uphold the DRY principle – Don’t Repeat
Yourself. If we didn’t have defaults, we’d have to repeat parameters
that don’t actually contribute to understanding of the goals of the code.
And if the best value for a parameter changed, perhaps because of a change
of best practices, so that existing callers should simply pick up the new
value, we’d have to update every call rather than just changing it once,
where the default parameter is defined.</p>
<p>Defaults are also good because they decrease the programmer’s cognitive
load. Programmers have to keep a lot of information in their brain at
a time, and defaults help programmers by not forcing them to think about
extra details when they don’t matter – which is the usual situation for
most defaults.</p>
<p>Default parameters also make the code more concise, and are popular for
that reason. But this isn’t a particular value that I have. I believe
the DRY principle is important, and that often amounts to more concise
code, but given modern editors and IDEs, and modern expectations of typing
and reading speed, a moderate amount of verbosity in exchange for other
benefits (such as clarity and explicitness) is completely acceptable to me.
I believe that default parameters, as they are implemented in C++ and Python,
have a substantial cost in clarity and explicitness, and therefore conciseness
isn’t a good enough reason to justify them.</p>
<p>In this case, what particularly bothers me about the lack of clarity
is that the reader of the code doesn’t know that there are potentially
more parameters; there is no hint that there might be other parameters.
If a maintenance programmer wants to change one of these calls to make
invisible windows instead, they might not realize they should check the
documentation for <code>createWindow</code>: after all, it only seems to take
two parameters, and neither of them has anything remotely to do with
invisible windows.</p>
<p>Fortunately, Rust has alternative features that allow us to reap the
benefits for cognitive load and DRY without sacrificing explicitness
and clarity.</p>
<h1 id="defaults-in-rust-the-default-trait">Defaults in Rust: the <code>Default</code> trait</h1>
<p>Rather than allowing default parameters, Rust allows you to optionally
specify default values for your types using the <a href="https://doc.rust-lang.org/std/default/trait.Default.html"><code>Default</code>
trait</a>. Here’s
how it works:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">enum</span> <span style="color:#a6e22e">Foo</span> {
</span></span><span style="display:flex;"><span> Bar,
</span></span><span style="display:flex;"><span> Baz,
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">impl</span> Default <span style="color:#66d9ef">for</span> Foo {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">default</span>() -> <span style="color:#a6e22e">Self</span> {
</span></span><span style="display:flex;"><span> Foo::Bar
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Or, written using the more concise <code>derive</code> syntax:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#75715e">#[derive(Default)]</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">enum</span> <span style="color:#a6e22e">Foo</span> {
</span></span><span style="display:flex;"><span> <span style="color:#75715e">#[default]</span>
</span></span><span style="display:flex;"><span> Bar,
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> Baz,
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Once this default is defined, <code>Foo::default()</code> or even (in a context
where the type is clear) <code>Default::default()</code> can stand in for <code>Foo::Bar</code>.</p>
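<p>As a small, self-contained sketch of those spellings (the <code>describe</code>
helper is a hypothetical function added only for illustration), the following
should compile on a recent stable toolchain; the <code>#[default]</code> attribute
on enum variants needs Rust 1.62 or later:</p>

```rust
#[allow(dead_code)] // `Baz` is never constructed in this sketch
#[derive(Debug, Default, PartialEq)]
enum Foo {
    #[default]
    Bar,
    Baz,
}

// Hypothetical helper: its parameter type tells the compiler which
// `Default` implementation a bare `Default::default()` refers to.
fn describe(foo: Foo) -> &'static str {
    match foo {
        Foo::Bar => "bar",
        Foo::Baz => "baz",
    }
}

fn main() {
    // Naming the type explicitly.
    let a = Foo::default();
    // Letting the type annotation select the implementation.
    let b: Foo = Default::default();
    assert_eq!(a, Foo::Bar);
    assert_eq!(b, Foo::Bar);

    // Letting the parameter type select the implementation.
    assert_eq!(describe(Default::default()), "bar");
}
```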
<p>If you are used to re-using existing types for your function
parameters, this might seem worse than useless. After
all, the parameter we defaulted was of type <code>bool</code>,
and the orphan rule (explained in the Rust book’s <a href="https://doc.rust-lang.org/book/ch10-02-traits.html#implementing-a-trait-on-a-type">chapter on
traits</a>)
forbids us from defining the <code>Default</code> trait on <code>bool</code> – as I alluded
to above, <code>Default</code> allows you to define default values for <em>your</em> types.
And even if we could, setting a default on booleans is way too overpowered
a thing to do just to give this one function parameter a default!
After all, some other function might also have a boolean parameter with
a different default.</p>
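<p>To make that concrete: the standard library already provides a
<code>Default</code> implementation for <code>bool</code>, and its value is
<code>false</code>. A minimal sketch, where <code>window_is_visible</code> is a
hypothetical stand-in for a function taking a bare <code>visible: bool</code>
parameter:</p>

```rust
// Hypothetical stand-in for a function with a bare `bool` parameter,
// mirroring the `visible` flag from the C++ example.
fn window_is_visible(visible: bool) -> bool {
    visible
}

fn main() {
    // std's `Default` for `bool` yields `false`, not the `true` we wanted
    // for window visibility, and we cannot replace that implementation.
    assert!(!bool::default());
    assert!(!window_is_visible(Default::default()));
}
```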
<p>But this makes more sense if you consider that in Rust, it is common –
even idiomatic and preferred – to create custom types for things like
configuration and function parameters. After all, if you’re not looking
at the documentation, it can be unclear what <code>true</code> means. It’s not even
clear that it has anything to do with visibility, let alone that <code>true</code>
means that the window is to be visible when the parameter could just as
easily be called <code>invisible</code>.</p>
<p>In Rust, we would prefer to define a new type for this situation, an
<code>enum</code> listing the visibility options – which will also help if a new
visibility option is created. And on this <code>enum</code>, it would be reasonable
to declare a default:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#75715e">#[derive(Default)]</span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">enum</span> <span style="color:#a6e22e">WindowVisibility</span> {
</span></span><span style="display:flex;"><span> <span style="color:#75715e">#[default]</span>
</span></span><span style="display:flex;"><span> Visible,
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> Invisible,
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Yes, this is more verbosity, but it is more clear, and no less DRY, than
our original code. Conciseness is again not a value in and of itself.
Explicitly listing the options is preferred to leaving them implicit.</p>
<p>Then, when we call the function, we can use this default:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">create_window</span>(width: <span style="color:#66d9ef">u32</span>, height: <span style="color:#66d9ef">u32</span>, visibility: <span style="color:#a6e22e">WindowVisibility</span>) -> <span style="color:#a6e22e">WindowHandle</span>;
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">let</span> handle <span style="color:#f92672">=</span> create_window(<span style="color:#ae81ff">10</span>, <span style="color:#ae81ff">30</span>, WindowVisibility::Invisible);
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">let</span> handle2 <span style="color:#f92672">=</span> create_window(<span style="color:#ae81ff">100</span>, <span style="color:#ae81ff">500</span>, WindowVisibility::Visible);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">let</span> handle3 <span style="color:#f92672">=</span> create_window(<span style="color:#ae81ff">100</span>, <span style="color:#ae81ff">500</span>, WindowVisibility::default());
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">let</span> handle4 <span style="color:#f92672">=</span> create_window(<span style="color:#ae81ff">100</span>, <span style="color:#ae81ff">500</span>, WindowVisibility::default());
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">let</span> handle5 <span style="color:#f92672">=</span> create_window(<span style="color:#ae81ff">100</span>, <span style="color:#ae81ff">500</span>, Default::default()); <span style="color:#75715e">// Also permitted
</span></span></span></code></pre></div><p>This is, as promised, more verbose, but equally DRY, and much more explicit
and clear.</p>
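<p>For readers who want something they can paste into a file and run, here is
one possible complete version of the snippet above. The fields of
<code>WindowHandle</code> and the body of <code>create_window</code> are
assumptions made purely so that the example compiles; a real GUI library would
do actual work there.</p>

```rust
#[derive(Debug, Default, Clone, Copy, PartialEq)]
enum WindowVisibility {
    #[default]
    Visible,
    Invisible,
}

// Hypothetical stand-in for a real window handle: it only records
// the parameters it was created with.
#[derive(Debug)]
struct WindowHandle {
    width: u32,
    height: u32,
    visibility: WindowVisibility,
}

// Assumed body: a real implementation would talk to the windowing system.
fn create_window(width: u32, height: u32, visibility: WindowVisibility) -> WindowHandle {
    WindowHandle { width, height, visibility }
}

fn main() {
    let hidden = create_window(10, 30, WindowVisibility::Invisible);
    let shown = create_window(100, 500, WindowVisibility::default());
    let also_shown = create_window(100, 500, Default::default());

    assert_eq!(hidden.visibility, WindowVisibility::Invisible);
    assert_eq!(shown.visibility, WindowVisibility::Visible);
    assert_eq!(also_shown.visibility, WindowVisibility::Visible);
    println!("{}x{} window created", shown.width, shown.height);
}
```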
<p>NB: I’m using free-standing functions for example purposes
only. In reality, this particular function is just as likely to be part
of a type’s inherent methods, something like <code>WindowHandle::new</code> or
<code>WindowHandle::create_window</code>.</p>
<h2 id="scaling-defaults-in-rust-struct-update-syntax">Scaling defaults in Rust: Struct update syntax</h2>
<p>So this is all well and good for one default. But it doesn’t
scale that well. What if we want to add another 3 parameters to our
window creation function? In a language like C++, we can give them
defaults, and the callers don’t even need to be updated
(parameters are for example purposes only and do not represent
a well-thought-out list of what you might want to specify in creating
a window):</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span>WindowHandle <span style="color:#a6e22e">createWindow</span>(<span style="color:#66d9ef">int</span> width, <span style="color:#66d9ef">int</span> height, <span style="color:#66d9ef">bool</span> visible <span style="color:#f92672">=</span> true,
</span></span><span style="display:flex;"><span> WindowStyle windowStyle <span style="color:#f92672">=</span> WindowStyle<span style="color:#f92672">::</span>Standard,
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">int</span> z_position <span style="color:#f92672">=</span> <span style="color:#f92672">-</span><span style="color:#ae81ff">1</span>,
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">bool</span> autoclose <span style="color:#f92672">=</span> false);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>createWindow(<span style="color:#ae81ff">100</span>, <span style="color:#ae81ff">500</span>); <span style="color:#75715e">// Still works identically
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>createWindow(<span style="color:#ae81ff">100</span>, <span style="color:#ae81ff">500</span>, false); <span style="color:#75715e">// Also still works
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>createWindow(<span style="color:#ae81ff">100</span>, <span style="color:#ae81ff">500</span>, false, WindowStyle<span style="color:#f92672">::</span>Standard, <span style="color:#ae81ff">2</span>, true); <span style="color:#75715e">// Specify everything
</span></span></span></code></pre></div><p>This is a useful feature. In Rust, with the techniques we’ve discussed
so far, we’d have to write <code>Default::default()</code> repeatedly for however
many parameters there are. This is a DRY violation, and interferes with
the ability to add new parameters.</p>
<p>There is a flaw with the C++ feature, however. You’ve now constrained
yourself to specifying parameters to the left in order to specify
parameters on the right. In the last example call to <code>createWindow</code>, we
violate DRY by explicitly specifying a value when we probably wanted to
use the default, but that wasn’t available because we wanted to override
the default for a later parameter.</p>
<p>Fortunately, Rust has a version of this too. Just as we created an
<code>enum</code> just for the purposes of this function call, it is idiomatic
in Rust to create structures for configuration parameters like this.
The structure would look something like this:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">pub</span> <span style="color:#66d9ef">struct</span> <span style="color:#a6e22e">WindowConfig</span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">pub</span> width: <span style="color:#66d9ef">u32</span>,
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">pub</span> height: <span style="color:#66d9ef">u32</span>,
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">pub</span> visibility: <span style="color:#a6e22e">WindowVisibility</span>,
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">pub</span> window_style: <span style="color:#a6e22e">WindowStyle</span>,
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">pub</span> z_position: <span style="color:#66d9ef">i32</span>,
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">pub</span> autoclose: <span style="color:#a6e22e">AutoclosePolicy</span>,
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Then, we can implement <code>Default</code> for that entire <code>struct</code>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">impl</span> Default <span style="color:#66d9ef">for</span> WindowConfig {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">default</span>() -> <span style="color:#a6e22e">Self</span> {
</span></span><span style="display:flex;"><span> Self {
</span></span><span style="display:flex;"><span> width: <span style="color:#ae81ff">100</span>,
</span></span><span style="display:flex;"><span> height: <span style="color:#ae81ff">100</span>,
</span></span><span style="display:flex;"><span> visibility: <span style="color:#a6e22e">WindowVisibility</span>::Visible,
</span></span><span style="display:flex;"><span> window_style: <span style="color:#a6e22e">WindowStyle</span>::Standard,
</span></span><span style="display:flex;"><span> z_position: <span style="color:#f92672">-</span><span style="color:#ae81ff">1</span>,
</span></span><span style="display:flex;"><span> autoclose: <span style="color:#a6e22e">AutoclosePolicy</span>::Disable,
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Now, this might seem to be extremely tedious to use. You might imagine
using it something like this:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">let</span> <span style="color:#66d9ef">mut</span> config <span style="color:#f92672">=</span> WindowConfig::default();
</span></span><span style="display:flex;"><span>config.width <span style="color:#f92672">=</span> <span style="color:#ae81ff">500</span>;
</span></span><span style="display:flex;"><span>config.z_position <span style="color:#f92672">=</span> <span style="color:#ae81ff">2</span>;
</span></span><span style="display:flex;"><span>config.autoclose <span style="color:#f92672">=</span> AutoclosePolicy::Enable;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">let</span> handle <span style="color:#f92672">=</span> create_window(config);
</span></span></code></pre></div><p>I would argue that even this is preferable to default parameters,
because again, it is explicit. However, Rust has a syntactic construct
designed exactly for situations like this, <a href="https://doc.rust-lang.org/book/ch05-01-defining-structs.html#creating-instances-from-other-instances-with-struct-update-syntax">struct update
syntax</a>.
With it, we get something very similar to default parameters, but
a little more verbose, a lot more explicit, and a lot more flexible:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">let</span> handle <span style="color:#f92672">=</span> create_window(WindowConfig {
</span></span><span style="display:flex;"><span> width: <span style="color:#ae81ff">500</span>,
</span></span><span style="display:flex;"><span> z_position: <span style="color:#ae81ff">2</span>,
</span></span><span style="display:flex;"><span> autoclose: <span style="color:#a6e22e">AutoclosePolicy</span>::Enable,
</span></span><span style="display:flex;"><span> <span style="color:#f92672">..</span>Default::default()
</span></span><span style="display:flex;"><span>});
</span></span></code></pre></div><p>Unlike C++-style default parameters, we can override exactly the
defaults we want to. It is also explicitly clear that there are
other parameters we could modify if we wanted to, without forcing
the maintenance programmer to check the documentation.</p>
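<p>Putting the pieces of this section together, a runnable sketch might look
like the following. The definitions of <code>WindowStyle</code>,
<code>AutoclosePolicy</code>, and <code>WindowHandle</code>, as well as the body
of <code>create_window</code>, are not given in this post, so the versions here
are assumptions chosen only to make the example compile.</p>

```rust
#[allow(dead_code)] // `Invisible` is not used in this sketch
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum WindowVisibility {
    Visible,
    Invisible,
}

// Assumed definition: the post only mentions `WindowStyle::Standard`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum WindowStyle {
    Standard,
}

// Assumed definition: the post only mentions `Enable` and `Disable`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum AutoclosePolicy {
    Enable,
    Disable,
}

#[derive(Debug, Clone, PartialEq)]
pub struct WindowConfig {
    pub width: u32,
    pub height: u32,
    pub visibility: WindowVisibility,
    pub window_style: WindowStyle,
    pub z_position: i32,
    pub autoclose: AutoclosePolicy,
}

impl Default for WindowConfig {
    fn default() -> Self {
        Self {
            width: 100,
            height: 100,
            visibility: WindowVisibility::Visible,
            window_style: WindowStyle::Standard,
            z_position: -1,
            autoclose: AutoclosePolicy::Disable,
        }
    }
}

// Hypothetical handle that simply keeps the configuration it was given.
#[derive(Debug)]
pub struct WindowHandle {
    pub config: WindowConfig,
}

// Assumed body: a real implementation would create an actual window.
pub fn create_window(config: WindowConfig) -> WindowHandle {
    WindowHandle { config }
}

fn main() {
    let handle = create_window(WindowConfig {
        width: 500,
        z_position: 2,
        autoclose: AutoclosePolicy::Enable,
        // Every field not named above comes from `WindowConfig::default()`.
        ..Default::default()
    });

    assert_eq!(handle.config.width, 500);
    assert_eq!(handle.config.height, 100);
    assert_eq!(handle.config.visibility, WindowVisibility::Visible);
    assert_eq!(handle.config.window_style, WindowStyle::Standard);
    assert_eq!(handle.config.z_position, 2);
    assert_eq!(handle.config.autoclose, AutoclosePolicy::Enable);
}
```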
<p>But beyond that, this allows there to be other sets of defaults
defined. In addition to <code>WindowConfig::default</code>, there might be
another set of configuration parameters for creating dialog boxes,
like <code>WindowConfig::dialog()</code> or <code>WindowConfig::default_dialog()</code>.
An app where the programmer usually creates invisible windows, or
windows all of the same height, might define its own default set,
<code>config::app_local_default_window_config()</code>. These wouldn’t be mediated
through the <code>Default</code> trait, but <code>Default</code> is just a trait, and
<code>Default::default()</code> is just a method call. You can call your own
methods instead, and still use this struct update syntax.</p>
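<p>As a sketch of what such alternative default sets could look like, here is
a trimmed-down <code>WindowConfig</code> with a <code>dialog()</code>
constructor and an app-local free function. The reduced field list and all of
the concrete values are invented for illustration.</p>

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum WindowVisibility {
    Visible,
    Invisible,
}

// Trimmed-down version of the configuration struct, for brevity.
#[derive(Debug, Clone, PartialEq)]
struct WindowConfig {
    width: u32,
    height: u32,
    visibility: WindowVisibility,
    z_position: i32,
}

impl Default for WindowConfig {
    fn default() -> Self {
        Self {
            width: 100,
            height: 100,
            visibility: WindowVisibility::Visible,
            z_position: -1,
        }
    }
}

impl WindowConfig {
    // Hypothetical second set of defaults for dialog boxes, itself
    // built on top of the general defaults (values are made up).
    fn dialog() -> Self {
        Self {
            width: 300,
            height: 120,
            z_position: 10,
            ..Self::default()
        }
    }
}

// Hypothetical app-local defaults: this app mostly creates hidden windows.
fn app_local_default_window_config() -> WindowConfig {
    WindowConfig {
        visibility: WindowVisibility::Invisible,
        ..WindowConfig::default()
    }
}

fn main() {
    // Struct update syntax accepts any expression of the right type,
    // not only `Default::default()`.
    let dialog = WindowConfig { width: 400, ..WindowConfig::dialog() };
    assert_eq!(dialog.width, 400);
    assert_eq!(dialog.height, 120);
    assert_eq!(dialog.z_position, 10);
    assert_eq!(dialog.visibility, WindowVisibility::Visible);

    let hidden = WindowConfig { height: 50, ..app_local_default_window_config() };
    assert_eq!(hidden.width, 100);
    assert_eq!(hidden.height, 50);
    assert_eq!(hidden.visibility, WindowVisibility::Invisible);
}
```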
<p>So now, we have a system of idioms in Rust to replace default parameters.
It’s just as DRY, and decreases the cognitive load just as much. More
importantly, it does so without sacrificing explicitness and clarity as to
exactly what’s going on – a given function always takes the same number
of parameters, which is an invariant that Rust maintenance programmers
can (and do) rely on.</p>
<h2 id="the-builder-pattern">The Builder Pattern</h2>
<p>At this point, the old-hand Rustaceans in the audience will
note that I haven’t discussed one common Rust approach
to designing these configuration structs, <a href="https://rust-unofficial.github.io/patterns/patterns/creational/builder.html">the builder
pattern</a>.</p>
<p>That’s for a reason: I don’t like it. I personally prefer to use
<code>Default</code> and struct update syntax where others might reach for
the builder pattern. I think it’s less explicit, and since I
have a lot of experience in non-OOP programming languages,
it feels to me like a solution without a problem, the primary
upshot of which is to make the code look more object-oriented.</p>
<p>But it is a commonly used pattern in Rust, and you will use crates
that use the builder pattern, so it’s worth being familiar with it.
It’s the same concept as before: using a struct full of parameters
to send configuration to a constructor or to a function call.
It’s probably going to be called something like <code>WindowBuilder</code>
instead of <code>WindowConfig</code>.</p>
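<p>For orientation, here is one complete, compilable sketch of such a builder.
It is not taken from any particular crate: the field list, the
<code>new</code> constructor, its default values, and the
<code>WindowHandle</code> type are all assumptions. The paragraphs that follow
walk through the individual pieces.</p>

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum AutoclosePolicy {
    Enable,
    Disable,
}

// The builder is just a struct full of parameters.
#[derive(Debug, Clone, PartialEq)]
struct WindowBuilder {
    width: u32,
    height: u32,
    z_position: i32,
    autoclose: AutoclosePolicy,
}

// Hypothetical stand-in for whatever the real constructor returns.
#[derive(Debug, PartialEq)]
struct WindowHandle {
    width: u32,
    height: u32,
    z_position: i32,
    autoclose: AutoclosePolicy,
}

impl WindowBuilder {
    // Assumed starting values, playing the role of `Default::default()`.
    fn new() -> Self {
        Self {
            width: 100,
            height: 100,
            z_position: -1,
            autoclose: AutoclosePolicy::Disable,
        }
    }

    // Setter written by mutating an owned `self`.
    fn width(mut self, width: u32) -> Self {
        self.width = width;
        self
    }

    // Equivalent style of setter written with struct update syntax.
    fn z_position(self, z_position: i32) -> Self {
        Self { z_position, ..self }
    }

    // One enum variant exposed as its own method.
    fn autoclose_enable(mut self) -> Self {
        self.autoclose = AutoclosePolicy::Enable;
        self
    }

    // Consumes the builder and produces the final value.
    fn build(self) -> WindowHandle {
        WindowHandle {
            width: self.width,
            height: self.height,
            z_position: self.z_position,
            autoclose: self.autoclose,
        }
    }
}

fn main() {
    let handle = WindowBuilder::new()
        .width(500)
        .z_position(2)
        .autoclose_enable()
        .build();

    assert_eq!(handle.width, 500);
    assert_eq!(handle.height, 100); // untouched, so still the starting value
    assert_eq!(handle.z_position, 2);
    assert_eq!(handle.autoclose, AutoclosePolicy::Enable);
}
```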
<p>However, instead of using the struct update syntax directly,
a bunch of helper methods are added to do the
struct update:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">impl</span> WindowBuilder {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">height</span>(<span style="color:#66d9ef">mut</span> self, height: <span style="color:#66d9ef">u32</span>) -> <span style="color:#a6e22e">Self</span> {
</span></span><span style="display:flex;"><span> self.height <span style="color:#f92672">=</span> height;
</span></span><span style="display:flex;"><span> self
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// ...
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>}
</span></span></code></pre></div><p>Or, as I would notate it:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">impl</span> WindowBuilder {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">height</span>(self, height: <span style="color:#66d9ef">u32</span>) -> <span style="color:#a6e22e">Self</span> {
</span></span><span style="display:flex;"><span> Self {
</span></span><span style="display:flex;"><span> height,
</span></span><span style="display:flex;"><span> <span style="color:#f92672">..</span>self
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// ...
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>}
</span></span></code></pre></div><p>Sometimes, enumerations are split into multiple update methods:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">impl</span> WindowBuilder {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">autoclose_enable</span>(<span style="color:#66d9ef">mut</span> self) -> <span style="color:#a6e22e">Self</span> {
</span></span><span style="display:flex;"><span> self.autoclose <span style="color:#f92672">=</span> AutoclosePolicy::Enable;
</span></span><span style="display:flex;"><span> self
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">autoclose_disable</span>(<span style="color:#66d9ef">mut</span> self) -> <span style="color:#a6e22e">Self</span> {
</span></span><span style="display:flex;"><span> self.autoclose <span style="color:#f92672">=</span> AutoclosePolicy::Disable;
</span></span><span style="display:flex;"><span> self
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Then, normally, instead of calling e.g. the window constructor, you call
a <code>build</code> method defined on the builder (and at this point I cringe
at the gratuitous OOP philosophy influencing the design):</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">impl</span> WindowBuilder {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">build</span>(self) {
</span></span><span style="display:flex;"><span> window_create(self)
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>Then, instead of using struct update syntax, you chain together
calls to these methods:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">let</span> handle <span style="color:#f92672">=</span> WindowBuilder::new()
</span></span><span style="display:flex;"><span> .width(<span style="color:#ae81ff">500</span>)
</span></span><span style="display:flex;"><span> .z_position(<span style="color:#ae81ff">2</span>)
</span></span><span style="display:flex;"><span> .autoclose_enable()
</span></span><span style="display:flex;"><span> .build();
</span></span></code></pre></div><p>I still prefer this to default parameters, but I also find it tacky.
I don’t like being forced to think in terms of abstract “objects” like
builders, and I don’t like the presumption that this style is more
intuitive. Why is a “builder” an object that does something? Why is
that preferred to a structure that is “configuration”? Are OOP programmers
aware that in real life, the vast majority of objects literally don’t do
things, and certainly don’t build other objects?</p>
<p>But for people familiar with the idioms of object-oriented programming,
this might be preferable. It is a commonly chosen option, so it’s important
at least to recognize it.</p>
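<p>For reference, here is a minimal, self-contained sketch that pulls the
builder fragments above into one piece. The default values, the
<code>WindowHandle</code> type, the field types, and the body of
<code>window_create</code> are assumptions made up for illustration; only the
method names come from the snippets above.</p>

```rust
/// Policy enum mirroring the `AutoclosePolicy` used above (variants assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AutoclosePolicy {
    Enable,
    Disable,
}

/// Hypothetical handle type: a stand-in for whatever `window_create` returns.
#[derive(Debug, PartialEq, Eq)]
pub struct WindowHandle {
    pub width: u32,
    pub height: u32,
    pub z_position: i32,
    pub autoclose: AutoclosePolicy,
}

#[derive(Debug, Clone)]
pub struct WindowBuilder {
    width: u32,
    height: u32,
    z_position: i32,
    autoclose: AutoclosePolicy,
}

impl WindowBuilder {
    /// Start from defaults (the specific values are invented for this sketch).
    pub fn new() -> Self {
        Self {
            width: 800,
            height: 600,
            z_position: 0,
            autoclose: AutoclosePolicy::Disable,
        }
    }

    /// Each setter consumes the builder and returns an updated one.
    pub fn width(self, width: u32) -> Self {
        Self { width, ..self }
    }

    pub fn height(self, height: u32) -> Self {
        Self { height, ..self }
    }

    pub fn z_position(self, z_position: i32) -> Self {
        Self { z_position, ..self }
    }

    pub fn autoclose_enable(mut self) -> Self {
        self.autoclose = AutoclosePolicy::Enable;
        self
    }

    pub fn autoclose_disable(mut self) -> Self {
        self.autoclose = AutoclosePolicy::Disable;
        self
    }

    /// Hand the finished configuration to the constructor function.
    pub fn build(self) -> WindowHandle {
        window_create(self)
    }
}

/// Stand-in constructor: it only copies the configuration into the handle.
fn window_create(config: WindowBuilder) -> WindowHandle {
    WindowHandle {
        width: config.width,
        height: config.height,
        z_position: config.z_position,
        autoclose: config.autoclose,
    }
}
```

<p>With these definitions, the chained call shown earlier
(<code>WindowBuilder::new().width(500).z_position(2).autoclose_enable().build()</code>)
compiles as written, and any field that is never mentioned keeps its default.</p>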
<h1 id="conclusion-and-application">Conclusion and Application</h1>
<p>Rust has a lot of idioms that are different from those in other
programming languages. I often see proposals from new Rustaceans
to add default parameters – and other similar features – to Rust,
and these new Rustaceans are confused that the strong demand they
feel is not as widely felt in the greater Rust community.</p>
<p>And normally, it’s similar to this situation with default parameters.
There are alternative idioms that accomplish the same goals, to the
extent that those goals are in line with Rust’s values: in this case,
DRYness, and reducing developers’ cognitive loads. They are also better
solutions in some other ways, according to Rusty values: the additional
explicitness is worth a little more verbosity.</p>
<p>But often, the new Rustaceans making these proposals are unaware of
the Rusty way of doing things. And if they are aware of it, they
are approaching it from the goals of other programming languages,
and don’t see how the solution measures up.</p>
<p>So I hope this can serve as a case study to help people understand
that there often are Rusty ways of accomplishing the goals of
popular features from OOP land, and why Rustaceans prefer these
solutions to blind accumulation of features.</p>
Christmas Disappointment: Smashing Princes and Cities
https://www.thecodedmessage.com/posts/princes-and-cities/
2023-01-03T00:00:00+00:00
<p>Today, in liturgical Western Christianity, it is the 10th day of
Christmas. Merry Christmas to those who celebrate the extended edition
of the holiday!</p>
<p>Unfortunately, this essay is not a celebration of Christmas, but rather
an explanation of why I have often found it disappointing recently in life,
because of a disconnect between the promise and the reality.</p>
<p>Every time Christmas comes around, I think of a classical sacred choral
piece that I’ve performed in multiple different choirs in youth and
adulthood, from Mendelssohn’s <em>Christus</em>, namely “Es Wird ein Stern aus
Jakob Aufgeh’n” (“There shall come a star out of Jacob”).</p>
<p>These are the words:</p>
<blockquote>
<p><em>Es wird ein Stern aus Jakob aufgeh’n</em><br>
<em>Und ein Scepter aus Israel kommen;</em><br>
<em>Der wird zerschmettern Fürsten und Städte</em></p>
</blockquote>
<p>The times we’d sung it in English, this was sung as:</p>
<blockquote>
<p>There shall a star come out of Jacob<br>
and a scepter shall rise out of Israel,<br>
with might destroying princes and cities.</p>
</blockquote>
<p>And, of course, the line about destroying princes and cities is sung
loudly, calamitously, and – perhaps strangely – triumphantly. I remember
having a discussion once with someone who was confused by this, but
then suddenly got it: “Oh!! It’s a <em>good</em> destroying princes and cities!”</p>
<p>And the English “destroying” is too weak. Without the constraint of a
singable text, I’d translate the German thus:</p>
<blockquote>
<p>There shall arise a star out of Jacob<br>
And a scepter shall come out of Israel<br>
He shall shatter sovereigns and cities</p>
</blockquote>
<p><em>Fürsten</em> is normally translated “princes,” but means “princes” in
the sense of “sovereign leaders of principalities” or “heads of state,”
rather than in the more common modern sense of “sons of monarchs.”</p>
<p>This text is based off of a Bible verse, Numbers 24:17, where specific
enemy nations of Israel are called out as those this “star” will destroy:</p>
<blockquote>
<p>I see him, but not now;<br>
I behold him, but not near:<br>
a star shall come out of Jacob,<br>
and a scepter shall rise out of Israel;<br>
it shall crush the forehead of Moab<br>
and break down all the sons of Sheth.</p>
<ul>
<li>Numbers 24:17 ESV</li>
</ul>
</blockquote>
<p>Of course, like many things in the Hebrew Bible, Christians reinterpreted
this text to be a Messianic prophecy about Jesus. Jesus made no literal
war against anybody, let alone Moab and the sons of Sheth, so that line
was reinterpreted as some sort of metonym, perhaps meaning “any power
of this earth opposed to God’s people.”</p>
<p>In the Mendelssohn piece, the transformation goes further, and the
text replaces the specific with the general: the star, the Christ, will
“shatter princes and cities.”</p>
<p>This is seen as a good thing. The feeling that I have always gotten
from this section of the piece is reminiscent of a leftist jubilantly
speaking of “dismantling power structures,” but with much more poetic,
less wonky language. The rulers and their strongholds will be shattered.
The people who are currently in charge – and who are clearly responsible
for many of the world’s problems due to their selfishness – will be
violently set aside.</p>
<p>It is reminiscent of Jesus’s declaration that “the first will be last,
and the last will be first,” or the similarly anti-rich and anti-powerful
declaration in the “Magnificat” (Luke 1:51-53, <em>Book of Common Prayer</em>):</p>
<blockquote>
<p>[God] hath shewed strength with his arm:<br>
he hath scattered the proud in the imagination of their hearts.<br>
He hath put down the mighty from their seat:<br>
and hath exalted the humble and meek.<br>
He hath filled the hungry with good things:<br>
and the rich he hath sent empty away.</p>
</blockquote>
<p>All of this is connected to Christmas in that Jesus, as the Christ,
is supposed to be the “star of Jacob.” It is Jesus that is supposed
to shatter the princes and cities, and Christmas celebrates his birth
and coming into the world.</p>
<p>The problem, and the part that makes me sad when I think about this song,
is that princes and cities still are around. And, unfortunately, they are
as oppressive as ever.</p>
<p>Recently, in the US, we’ve had the particularly intense prince that
is President Trump, and the city that was his version of Washington,
DC. And throughout the world we have others: There is Vladimir Putin,
and his well-fortified Moscow, protected with nuclear weapons rather
than walls. There is Kim Jong Un in North Korea, and Xi Jinping in
China. These figures come off like movie villains – or like the many evil
sovereigns in the Bible.</p>
<p>And they all remain thoroughly unshattered. They’re still there, even
though Christmas happened 2022 years ago, when Jesus was
born to shatter them.</p>
<p>I could recite to you all sorts of standard Christian explanations
for this: that the princes and cities have in fact been shattered, in a
spiritual sense; that this is a process that happens over time through
the church; and that the final, literal shattering will occur once Jesus
returns.</p>
<p>But that honestly seems like a moving goalpost. The piece doesn’t say “a star
shall arise and then set, and then arise three days later, and then
eventually after thousands of years shall shatter princes and cities.”</p>
<p>In the early 20th century, the worldwide Communist movement tried to
take matters into their own hands, and shatter the princes and cities by
means of violent revolutions. But of course, they couldn’t shatter human
nature. After the Tsar and the oligarchs were shattered, new ones
arose. Stalin and Mao Tse Tung were even worse princes than those
who came before them.</p>
<p>That is unfortunately <a href="https://www.smbc-comics.com/index.php?id=3246#comic">how revolution usually
goes</a>. Given that,
you can see why some feel the need for a divinely-appointed hero to
come in and do the shattering, who will reign with justice afterwards
rather than requiring another round of shattering in another couple
decades.</p>
<p>But Jesus literally didn’t do that. And no one seems to be forthcoming.
And if they were, they’d be more likely to be another Stalin than a
“star of Jacob.”</p>
<p>The Mendelssohn is a beautiful piece, but now, it just reminds me of
the evil princes and cities that remain unshattered. The promised shattering
feels like just a fantasy.</p>
A Life (and Blog) Theme for the Coming Year
https://www.thecodedmessage.com/posts/review-2022/
2022-12-21T00:00:00+00:00
<p>Happy December! Happy Winter Holidays! We’re almost done with 2022!</p>
<p>I just had my birthday yesterday, on December 20. I am now
34 years old, which is more than a third of a century! I
generally take the opportunity on my birthday to do some
reflection on the previous year, and to set <a href="https://chadd.org/adhd-weekly/skip-the-resolutions-pick-a-new-years-theme/">a theme for the next
year</a>.
I wanted to share both with you, my audience.</p>
<p>The past year has been intense for me personally. It’s just been a
laundry list of life changes and achievements:</p>
<ul>
<li>I moved from New York City to a small town.</li>
<li>I bought a house and <a href="https://www.thecodedmessage.com/posts/mortgage_interest/">got a mortgage</a>.</li>
<li>I started medication (Atomoxetine) for my <a href="https://www.thecodedmessage.com/tags/adhd/">ADHD</a>, an experience I want to blog about, and will
as soon as I figure out what exactly I actually have to say about it.</li>
<li>I’ve made new friends and deepened other friendships.</li>
<li>I got an (antique family) upright piano installed in my house.</li>
<li>I got a home gym installation.</li>
<li>I’ve greatly refined and improved my <a href="https://www.thecodedmessage.com/tags/write-everything-down/">organizational system</a>.</li>
<li>I’ve started a weekly rock-climbing habit.</li>
</ul>
<p>And, most relevantly for this audience, this blog has accelerated:</p>
<ul>
<li>I have posted far more than any previous year, averaging 3 times
per month (not including this post):</li>
</ul>
<pre tabindex="0"><code>$ ls | grep ^2 | cut -f1 -d- | uniq -c # Count posts per year
3 2017
1 2018
17 2019
5 2020
3 2021
36 2022
</code></pre><ul>
<li>I have migrated to Hugo, making posting super easy and the design
mobile-friendly without me having to do my own front-end/CSS work.</li>
<li>I added comments.</li>
<li>I added a newsletter, and just today, added a subscription form to
every page (thanks, subscribers!)</li>
<li>I’ve written or at least started several series:
<ul>
<li>Comparing <a href="https://www.thecodedmessage.com/tags/rust-vs-c++/">C++ to Rust</a></li>
<li>Explaining my <a href="https://www.thecodedmessage.com/tags/write-everything-down/">organizational system</a></li>
<li>Explaining how <a href="https://www.thecodedmessage.com/tags/beyond-oop/">Rust’s paradigm differs from OOP</a></li>
</ul>
</li>
</ul>
<p>My theme for this past year – or perhaps for 2021, I’m not sure, but
I think it was in practice a 2-year theme – was <strong>rebuilding</strong>, and
I’ve certainly done that. I’ve built a new life, in a new place, with
so many different habits. And I’m really happy with the results.</p>
<p>Now, what I really want to do is let these new habits mature, to build on
top of what I built. And so, a dear friend of mine suggested <strong>growth</strong>
as the theme for this coming year. Rather than focusing on planting new
seeds in the garden of my life, as I’ve been doing, I should let the
existing seeds that are already planted grow and mature.</p>
<p>Here is part of what I hope that will mean in practice:</p>
<ul>
<li>Exercise
<ul>
<li>Continue rock climbing</li>
<li>Use the home gym more regularly</li>
</ul>
</li>
<li>Music
<ul>
<li>Solidify the habit of practicing the piano</li>
<li>Try to learn new skills on it (perhaps getting lessons)</li>
</ul>
</li>
<li>Friendships
<ul>
<li>Deepen my relationships with the people I’ve gotten closer to (literally
and figuratively) and met over the past year</li>
</ul>
</li>
<li>Organization
<ul>
<li>Use my system more consistently</li>
<li>Continue to reap the benefits of having it at all</li>
</ul>
</li>
</ul>
<p>And of course, last but not least, this blog. I would like this blog
to grow. Getting a newsletter is the first and obvious step, but I have
some other ideas as well.</p>
<p>For one thing, I hope to take some of my blog
posts and series and turn them into <a href="https://hapgood.us/2015/10/17/the-garden-and-the-stream-a-technopastoral/">“gardens” rather than
“streams”</a>.
Rather than a series of posts that boom in readership and then get
forgotten about, organized by time-line, I think a well-organized complete
documentation on, say, my opinions about Rust vs C++, or perhaps just a <a href="https://www.thecodedmessage.com/rust-opinions/">list
of Rust opinions</a>, might be more useful as a reference,
and allow me to refine my positions over time, and address objections
(even if some on Reddit think addressing objections is somehow a sign
that I’m being disingenuous or not open-minded).</p>
<p>For another thing – speaking of Reddit – I’d like to move away from
polemic towards education and insight. Many of my posts now are about why
Rust is better than C++ – which I still think is an important topic –
but some of them are about how to do things in Rust, or how Rust works.
My <a href="https://www.thecodedmessage.com/tags/beyond-oop/">OOP series</a>, while starting out with a fair
amount of polemics, will hopefully show a lot of examples and provide
a lot of insights into how to use Rust’s features to structure code
for people who are used to OOP-style programming languages, and will
help people do their job more directly. I don’t think I’m going to be
able to fully give up polemics – I am, after all, a self-described
<a href="https://www.thecodedmessage.com/about/">“temperamental opinionator”</a> – but I’d like to spend more of
my time explaining how to use Rust and why it is the way it is, and less
time arguing with angry people on Reddit who take criticism of their
preferred tools as a personal attack.</p>
<p>I’d also like to increase the amount of non-technical blogging I do.
I used to post <a href="https://www.thecodedmessage.com/tags/fiction/">fiction</a> on my blog, and
<a href="https://www.thecodedmessage.com/posts/are-you-sure/">some of it</a> I’m even proud of. I’d like to
return to doing that. I’d like to share more of my insights about
<a href="https://www.thecodedmessage.com/tags/adhd/">ADHD</a> in specific and neurodiversity in general,
specifically about medication and the philosophical question about
what a diagnosis like ADHD even means. I’d like to write more about
religion, a complicated issue that I have some insights on.</p>
<p>And finally, I’d like to blog some about
<a href="https://reflex-frp.org/">Reflex</a>, my favorite GUI programming
framework, and one that is in Haskell and completely independent of the
object-oriented tradition. It’s an excellent hidden gem of a library,
and I think it deserves some good resources for it.</p>
<p>If any of you have any particular requests, please let me know!
Many of you have already given me ideas for new posts – I have dozens
of outlines and hundreds of ideas – but specific requests are (sometimes)
very motivating.</p>
<p>I’d like to thank you all so very much for reading! I hope you all have
had a good 2022, and I wish all of you the best possible 2023. If you
do set a new theme for the new year, may it help you live your best
life.</p>
Rust Is Beyond Object-Oriented, Part 1: Intro and Encapsulation
https://www.thecodedmessage.com/posts/oop-1-encapsulation/
2022-12-12T00:00:00+00:00
<p>Rust is not an object-oriented programming language.</p>
<p>Rust may look like an object-oriented programming language: Types can be
associated with “methods,” either “intrinsic” or through “traits.” Methods
can often be invoked with C++ or Java-style OOP syntax: <code>map.insert(key, value)</code> or <code>foo.clone()</code>. Just like in an OOP language, this syntax
involves a “receiver” argument placed before a <code>.</code> in the caller,
called <code>self</code> in the callee.</p>
<p>But make no mistake: Though it may borrow some of the trappings, some of
the terminology and syntax, Rust is not an object-oriented programming
language. There are three pillars of object-oriented programming:
encapsulation, polymorphism, and inheritance. Of these, Rust nixes
inheritance entirely, so it can never be a “true” object-oriented
programming language. But even for encapsulation and polymorphism,
Rust implements them differently than OOP languages do – which we
will go into in more detail later.</p>
<p>This all comes as a surprise and an adjustment to a lot of programmers. I
see Rust newbies on Reddit asking how to implement OOP design patterns
literally, trying to get “class hierarchies” like “shapes” or “vehicles”
working with traits standing in as “the Rust version of inheritance” –
in other words, trying to solve problems they only have because they’re
committed to the OOP approach, and doing contrived OOP examples to try
to learn what they expect to be just another version of it.</p>
<p>It’s a stumbling block for many. I regularly see “lack of OOP” mentioned
on the Internet by Rust newbies and sceptics as a reason Rust is hard to
adjust to, or not a good fit for them, or even why it will never catch
on. For people who learned to program in the height of OOP as a trend –
when perfectly good languages like C and ML had to become object-oriented
as Objective-C and OCaml – the amount of hype about a non-OOP language
just feels off.</p>
<p>It’s not an easy adjustment either. So many programmers learned
software design and architecture in an explicitly object-oriented
way. I see question after question where a beginning or intermediate
Rust programmer wants to do an object-oriented thing, and wants
a literal Rust equivalent. Often, these are examples of the <a href="https://xyproblem.info/">XY
problem</a>, and they have trouble backtracking
and approaching the problem in a more Rusty way.</p>
<p>But that isn’t Rust’s fault. The answer is still for us to adjust, even
if it isn’t easy; being proficient in not only multiple languages but
also different programming paradigms makes us better programmers.</p>
<p>And, as a paradigm, OOP is actually thoroughly mediocre – so much
so that I’m writing a whole blog series to explain why, and why
Rust’s approach is better.</p>
<h1 id="oop-ideology">OOP Ideology</h1>
<p>Look, I get it. I used to drink the OOP Kool-Aid myself. I
remember how it was billed to us: not as just a set of code
organization practices, but a revolution in programming. The OOP
way was held up as more intuitive, especially to non-programmers,
because it would align better with how we think of the
natural world.</p>
<p>For an archetypical example of this marketing, here is an
excerpt from the first public article about OOP in a popular
magazine (<em>Byte Magazine</em>, in 1981):</p>
<blockquote>
<p>Many people who have no idea how a computer works find the idea of
object-oriented programming quite natural. In contrast, many people
who have experience with computers initially think there is something
strange about object oriented systems.</p>
</blockquote>
<p>It was pretty easy to buy into, as well. Of course, our everyday life
doesn’t have anything like subroutines or variables – or, to the
extent that it does, we don’t think about them explicitly! But
it does have objects that we can interact with, each with its
own capabilities. How could it not be more intuitive?</p>
<p>It’s very compelling pseudo-cognitive science, light on research, heavy on
really persuasive rationales. The objects can be thought of as “agents,”
almost as people, and so you could leverage your social skills towards it
instead of just analytical thinking (never mind that objects act
nothing like people, and are actually substantially dumber in a way that
still requires analytical thinking). Or, you can think of objects and
classes as an almost-platonic representation of the world of forms itself,
making it philosophically compelling.</p>
<p>And oh, how I bought in, especially in my wanton and reckless youth. I
personally soaked up the connection between OOP and Platonic philosophy. I
delved deep into meta-object protocols, and the fact that in Smalltalk
every class had to have a metaclass. The concept of the Smalltalk code
<code>Metaclass class</code> felt almost mystical to me, as did the notion that any
value could be organized in the same hierarchy, with <code>Object</code> at its root.</p>
<p>I remember reading in a book that OOP-style polymorphism made <code>if</code>-<code>else</code>
statements redundant, and therefore we should strive to ultimately only
use OOP-style polymorphism. Somehow, instead of putting me off, this
excited me at the time. I was even more excited when I learned
that Smalltalk in fact does this (if you ignore implementation details
that optimize away some of this abstraction): In Smalltalk, the concept of
<code>if</code>-<code>then</code>-<code>else</code> is implemented via methods like <code>ifTrue:</code> and <code>ifFalse:</code>
and <code>ifTrue:ifFalse:</code> on the single-instance <code>True</code> and <code>False</code> classes,
with their global objects, <code>true</code> and <code>false</code>.</p>
<p>As a more mature programmer, exposed to the less ideological OOP of C++
and the alternative of functional programming in Haskell, my positions
softened, and then shifted dramatically, and now I am barely a fan
of OOP at all, especially as its best ideas have been carried on to
a newer synthesis in Haskell and Rust. I’ve realized that this hype
about new programmers is typical for any paradigm; any new programming
paradigm is more intuitive for a newbie than it is for someone who’s a
veteran programmer in a different paradigm. The same thing is said for
functional programming. The same thing is even said for Rust. It really
doesn’t have that much to do with whether a paradigm is better.</p>
<p>As for <code>if</code> statements being fully replaceable by polymorphism, well,
it’s easy to come up with a set of primitives that are Turing-complete.
You can simulate if statements with polymorphism, true. You can also
simulate while loops with recursion, or recursion with while loops
and an explicit stack. You can simulate if statements with while loops.</p>
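<p>To make two of these substitutions concrete, here is a rough sketch in
Rust. The names <code>Bool</code>, <code>True</code>, <code>False</code>,
<code>if_true_if_false</code>, <code>describe</code>, and
<code>if_via_while</code> are all invented for this illustration; the first
part loosely imitates Smalltalk’s <code>ifTrue:ifFalse:</code> using a trait,
and the second writes an <code>if</code> as a <code>while</code> loop that
runs at most once.</p>

```rust
/// Smalltalk-style booleans: two unit types and one trait.
pub trait Bool {
    /// Rough analogue of `ifTrue:ifFalse:`: runs exactly one of the closures.
    fn if_true_if_false<T>(
        &self,
        on_true: impl FnOnce() -> T,
        on_false: impl FnOnce() -> T,
    ) -> T;
}

pub struct True;
pub struct False;

impl Bool for True {
    fn if_true_if_false<T>(
        &self,
        on_true: impl FnOnce() -> T,
        _on_false: impl FnOnce() -> T,
    ) -> T {
        // The choice is made by which impl was selected, not by an `if`.
        on_true()
    }
}

impl Bool for False {
    fn if_true_if_false<T>(
        &self,
        _on_true: impl FnOnce() -> T,
        on_false: impl FnOnce() -> T,
    ) -> T {
        on_false()
    }
}

/// Branches on `flag` purely through trait dispatch.
pub fn describe(flag: &impl Bool) -> &'static str {
    flag.if_true_if_false(|| "yes", || "no")
}

/// `if cond { body() }` written as a `while` loop that runs at most once.
pub fn if_via_while(cond: bool, mut body: impl FnMut()) {
    let mut pending = cond;
    while pending {
        body();
        pending = false;
    }
}
```

<p>Both versions work, and both are harder to read than the plain
<code>if</code> they replace.</p>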
<p>None of these facts make such substitutions a good idea. Different
features exist in a programming language for different situations,
and making them distinct is actually a good thing, in moderation.</p>
<p>After all, the point of programming is to write programs, not to make
proofs about Turing-completeness, do philosophy, or write conceptual
poetry.</p>
<h1 id="practicality">Practicality</h1>
<p>So, in this blog series, I intend to evaluate OOP in practical terms,
as a programmer with experience in what makes programming languages
cognitively more manageable or easy to do abstraction in. I will do it
in terms of my experience solving actual programming problems – I see
it as a bad sign that many examples of how OOP abstractions work only
make sense in really advanced programs or with contrived examples about
different types of shapes or animals in a zoo.</p>
<p>And unlike most introductions to OOP, I will not primarily be focusing
on how OOP compares to pre-OOP programming languages. I will instead
be comparing to Rust, which takes many of the good ideas from OOP,
and perhaps also to functional programming languages like Haskell. These
programming languages have taken some of OOP’s good ideas, but transformed
them in a way that fixes some of their flaws and moves them beyond what
can reasonably be called OOP.</p>
<p>I will organize this comparison according to the three traditional
pillars of object-oriented programming: encapsulation, polymorphism,
and inheritance, with this first article focusing on encapsulation. For
each pillar, I will discuss how OOP defines it, what equivalents or
substitutes exist outside of the OOP world, and how these compare for
practical ease and power of programming.</p>
<p>But before I jump in, I want to talk a second about a use case that turns
much of this on its head: graphical user interfaces or GUIs. Especially
before the era of the browser, writing GUI programs to run directly on
desktop (or laptop) computers was a huge part of what programmers did. A
lot of early development of OOP was done in tandem with research into
graphical user interfaces at Xerox PARC, and OOP is uniquely well-suited
for that use case. For this reason, the GUI deserves special consideration.</p>
<p>For example, it is common for people to emulate OOP in other programming
languages. <code>Gtk+</code> is a huge example of this, implementing OOP as a
series of macros and conventions in C. This is done for many reasons,
including familiarity with OOP designs and a desire to create some kind
of run-time polymorphism. But in my experience, this is most common
when implementing a GUI framework.</p>
<p>In this series of articles, we will primarily focus on applying OOP
to other use cases, but we will also discuss GUIs as appropriate.
In this introductory section, I will just point out that GUI
frameworks are clearly possible outside traditional OOP designs
and programming languages, and even in Rust. Sometimes, they work
by completely different mechanisms, like the <a href="https://reflex-frp.org/">functional-reactive
programming</a> mostly pioneered in Haskell, which
I personally prefer to traditional OOP-based programming and for which
traditional OOP features would not be helpful.</p>
<p>Now, without further ado, let us compare OOP to Rust and other
post-OOP programming languages, pillar by pillar, from a pragmatic
perspective. For the rest of this first post, we will focus on
encapsulation.</p>
<h1 id="first-pillar-encapsulation">First Pillar: Encapsulation</h1>
<p>In object-oriented programming, <strong>encapsulation</strong> is bound up with
the idea of a <strong>class</strong>, the fundamental layer of abstraction in
object-oriented programming. Each class contains a layout for some data in
a record format, that is, a data structure where each instance contains a
set number of fields. Individual instances of the record type are known
as “objects.” Each class also contains code that is tightly paired to
that record type, organized into procedures called <strong>methods</strong>. The idea
is then that all of the fields will only be accessible from inside the
methods, either by the conventions of OOP ideology or by the enforced
rules of the programming language.</p>
<p>The fundamental benefit here is that the <strong>interface</strong>, which is how
the code interacts with other code, or what you have to understand to
use the code, is much simpler than the <strong>implementation</strong>, which consists of
the more fluidly changing details of how the code actually accomplishes
its job.</p>
<p>But of course, lots of programming languages have abstractions like this.
Any program longer than a dozen lines has too many parts to keep in your
brain all at once, and so all remotely modern programming languages have
ways of dividing a program into smaller components, as a way to manage
the complexity, so that the interface is simpler than the implementation,
whether enforced by the programming language or a matter of the “honor
system.” So in a broader sense of the word, all modern programming languages
have some version of encapsulation.</p>
<p>One simple form of encapsulation – one that most object-oriented
programming languages maintain as a layer within the class – is
<strong>procedures</strong>, also known as functions, subroutines, or (as OOP calls them)
methods. Rather than allow any line of code to jump to any other line
of code, modern programming languages tend to group blocks of code
together into procedures, and you can then change the contents of the
procedure without affecting the outside code, and change the outside
code without affecting the procedure, as long as they follow the same
interface and contract.</p>
<p>The contract is usually at least partially a human-level
convention. There’s not usually much stopping you from taking a procedure
that is supposed to process some data and instead making it loop
indefinitely or crash the program. But some of it, like the separation of
the procedure from the rest of the program, and in many cases the number
and types of values it is allowed to accept and return in an invocation,
will be enforced by the programming language.</p>
<p>For example, variables declared inside the procedure are usually <em>local</em>,
and there’s generally no way to reference them outside the procedure.
The inputs and outputs are usually listed in a signature at the top of
the procedure. Normally, outside code can only enter the procedure on its
first line, rather than on an arbitrary line half-way through. In some
programming languages – including Rust – procedures can even contain
other procedures, which can only be called within the outer procedure.</p>
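<p>As a minimal sketch of these language-enforced rules in Rust (the names
<code>mean</code> and <code>sum</code> are made up for illustration), a nested procedure
and a local variable are both invisible to outside code:</p>

```rust
/// Computes the mean of a slice, or `None` if the slice is empty.
/// The signature lists the inputs and outputs; that part of the
/// contract is enforced by the compiler.
fn mean(values: &[f64]) -> Option<f64> {
    // `sum` is a nested procedure: it can only be called inside `mean`.
    fn sum(values: &[f64]) -> f64 {
        values.iter().sum()
    }

    if values.is_empty() {
        return None;
    }
    // `count` is a local variable; no code outside `mean` can name it.
    let count = values.len() as f64;
    Some(sum(values) / count)
}

fn main() {
    println!("{:?}", mean(&[1.0, 2.0, 3.0]));
    // Calling `sum(&[1.0])` here would fail to compile, because `sum`
    // is only in scope inside `mean`.
}
```

<p>The human-level part of the contract (that <code>mean</code> really computes a
mean, and does not loop forever) is still up to the programmer.</p>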
<p>But of course, modern programs are often more complicated than a mere
handful of procedures. And so, modern programming languages (and again,
the word “modern” here is being used in a very loose way) have another
layer of encapsulated abstraction: <strong>modules</strong>.</p>
<p>Modules will generally contain a group of procedures, some externally
accessible, and some not. And in non-duck-typed languages, they will
generally define a number of aggregate types, again some externally
accessible, and some not. It is generally even possible to expose these
types abstractly, so the existence of a type is accessible to the rest
of the program, but not the record fields, or even the fact that it
is a record type. Even C has this ability in its module system – C++
did not introduce it, just added an additional, orthogonal level of
field-by-field access controls.</p>
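<p>Here is a rough sketch of what exposing a type abstractly can look like in
a module system, using Rust with plain functions and no methods. The
<code>counter</code> module and its function names are hypothetical, chosen only
for this illustration:</p>

```rust
mod counter {
    /// The existence of `Counter` is public, but its field is not:
    /// code outside this module cannot read or write `count`.
    pub struct Counter {
        count: u64,
    }

    /// Externally accessible procedures.
    pub fn new() -> Counter {
        Counter { count: 0 }
    }

    pub fn increment(counter: &mut Counter) {
        counter.count = bump(counter.count);
    }

    pub fn value(counter: &Counter) -> u64 {
        counter.count
    }

    /// Not marked `pub`, so this helper is private to the module.
    fn bump(n: u64) -> u64 {
        n + 1
    }
}

fn main() {
    let mut c = counter::new();
    counter::increment(&mut c);
    counter::increment(&mut c);
    println!("{}", counter::value(&c));
    // Both `c.count` and `counter::bump(1)` would be compile errors here.
}
```

<p>Nothing here requires a class: the module is the unit that decides what is
visible and what is not.</p>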
<p>Seen from my pragmatic point of view, class-based encapsulation is not
some special insight of OOP, but a specialized – or rather, tightly
restricted – form of module. In an OOP programming language, we have
this notion of a class, which is a special form of module (sometimes the
only supported form, or sometimes even layered underneath a completely
different, more traditional notion of module, for extra confusion). It’s
just that, for a “class,” there can only generally be one primary type
defined, which shares a name with the module itself, and where the fields
of that type are given special protection against access by code outside
the class.</p>
<p>Of course, there are other differences between a class and a module,
but these have to do with the other pillars, and we will get to them
later. For right now, we will just discuss the idea of a “class”
as it relates to encapsulation – where a class is just a special
module with one privileged, abstracted type.</p>
<p>And this is a reasonable way to write a module, but it’s not as special
as object-oriented programming makes it out to be (especially once we
discuss alternative approaches to the other pillars, but again, more
later). There are some situations where a module doesn’t have any record
type that it defines, which is awkward in programming languages like Java,
where you have to define an empty record type anyway and still make a
“class.” There are also situations in which a module defines multiple
publicly accessible types that are tightly entangled – and where the
encapsulation between those types that OOP style would encourage you to
do is more of a hindrance than a help.</p>
<p>Fundamentally, being able to hide the fields of a record from other
modules is important, which is why even C supports it. It is even
essential for implementing safe abstractions over unsafe features in
Rust, such as for collections, where raw pointers have invariants in
combination with other fields in the same record. But it is not new to
OOP, and it is simply not the best choice for every possible type.</p>
<p>As evidence of this, in Java and Smalltalk, and to a lesser extent
even in C++ or Python, the insistence on a one-type-per-class style of
encapsulation means that you get these boilerplate methods like <code>setFoo</code>
and <code>getFoo</code>. These methods do nothing but serve as field accessors for
something that is fundamentally a dumb record type. In theory, this helps
you if you want to change what happens when these fields are set or read,
but in practice, the fact that they are raw field accessors is part of
the contract. If they, for example, instead made a network call rather
than just returning a value, that would strongly violate the principle of
least surprise for such simply named methods.</p>
<p>It is far simpler to say:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">pub</span> <span style="color:#66d9ef">struct</span> <span style="color:#a6e22e">Point</span> {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">pub</span> x: <span style="color:#66d9ef">f64</span>,
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">pub</span> y: <span style="color:#66d9ef">f64</span>,
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">pub</span> z: <span style="color:#66d9ef">f64</span>,
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>… than the Java idiomatic “JavaBean” equivalent from when I was a
Java programmer (Java has apparently changed since then, but this is
representative of many OOP programming languages including Smalltalk
and many books on how to program):</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-java" data-lang="java"><span style="display:flex;"><span><span style="color:#66d9ef">class</span> <span style="color:#a6e22e">Point</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">private</span> <span style="color:#66d9ef">double</span> x<span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">private</span> <span style="color:#66d9ef">double</span> y<span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">private</span> <span style="color:#66d9ef">double</span> z<span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">double</span> <span style="color:#a6e22e">getX</span><span style="color:#f92672">()</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> x<span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">void</span> <span style="color:#a6e22e">setX</span><span style="color:#f92672">(</span><span style="color:#66d9ef">double</span> x<span style="color:#f92672">)</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">this</span><span style="color:#f92672">.</span><span style="color:#a6e22e">x</span> <span style="color:#f92672">=</span> x<span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">double</span> <span style="color:#a6e22e">getY</span><span style="color:#f92672">()</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> y<span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">void</span> <span style="color:#a6e22e">setY</span><span style="color:#f92672">(</span><span style="color:#66d9ef">double</span> y<span style="color:#f92672">)</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">this</span><span style="color:#f92672">.</span><span style="color:#a6e22e">y</span> <span style="color:#f92672">=</span> y<span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">double</span> <span style="color:#a6e22e">getZ</span><span style="color:#f92672">()</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> z<span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">void</span> <span style="color:#a6e22e">setZ</span><span style="color:#f92672">(</span><span style="color:#66d9ef">double</span> z<span style="color:#f92672">)</span> <span style="color:#f92672">{</span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">this</span><span style="color:#f92672">.</span><span style="color:#a6e22e">z</span> <span style="color:#f92672">=</span> z<span style="color:#f92672">;</span>
</span></span><span style="display:flex;"><span> <span style="color:#f92672">}</span>
</span></span><span style="display:flex;"><span><span style="color:#f92672">}</span>
</span></span></code></pre></div><p>Such data types generally don’t use any of the other features that OOP
classes get, such as polymorphism or inheritance. To use such features
in such “JavaBean” classes would also violate the principle of least
surprise. The “class” concept is overkill for these record types.</p>
<p>And of course, a Java developer (or Smalltalk, or C#) will say that by
accessing the fields indirectly through these getter and setter methods,
they are future-proofing the class, in case the design changes (and
in fact I was reminded to add this paragraph when someone on Reddit made
exactly this point). But I find this disingenuous, or at least misguided
– it is often used for structures internal to a portion of the program,
where the far more reasonable thing to do would be to change the fields
openly to all users of the structure. It is also extremely difficult to
think of an unsurprising thing for these methods to do besides literally
set or get a field, as the method name implies – making a network call,
for example, would be a shocking surprise for a <code>get</code> or <code>set</code> method
and therefore a violation of at least the implicit contract. In my time
programming in object-oriented programming languages, I never once saw a
situation where it was appropriate for a getter or setter to do anything
but literally get or set the field.</p>
<p>If code does change to require the getter or setter to do something
else, I would rather change the name of the method to reflect what
else it does, rather than pretend that’s somehow not a breaking
change. <code>fetchZFromNetwork</code> or <code>setAndValidateZ</code> seem more appropriate
than a <code>getZ</code> or <code>setZ</code> that does something more than the simple field
access that we assume a setter or getter does. OOP’s insistence that
every type should be its own code abstraction boundary is often absurd
when applied to these lightweight aggregate types. These sorts of getters
and setters are used to protect an abstraction boundary that shouldn’t
exist and just gets in the way, and future-proof against implementation
changes that shouldn’t be made without also changing the interface.</p>
<p>Setters and getters, in short, are an anti-pattern. If you intend to
create an abstraction besides “data structure,” where validation or
network calls or anything else beyond raw field accesses would be appropriate,
then these <code>get</code> and <code>set</code> names are the wrong names for that abstraction.</p>
<p><em><strong>Edit 2023-02-13 to add this paragraph:</strong></em> To be clear, these objections
apply to properties as well. It’s not the syntactic inconvenience that
I object to, but the entire notion that replacing field accesses with
code transparently is a good thing to strive for, or an important
possibility to leave open. I should hope that <code>foo.bar = 3</code> would
never make a network call in Rust! And what if it had to be <code>async</code>?
It should be clear if I’m calling a function. Rust is about explicitness.</p>
<p>The <code>get</code> and <code>set</code> functions, in reality, are only used as wrappers to
satisfy the constraints of object-oriented ideology. The future-proofing
they purportedly provide is an illusion. If you provide “JavaBean” style
types, or types with properties, over an abstraction boundary, you are in
practice just as locked in as if you’d provided raw field access – the
changes you are most likely to want to make to those structures would not
allow shifting the getters and setters to maintain compatibility. Leveraging
this future-proofing is likely to be completely impossible for the changes
you’d want to make, and at best it would involve a horrendous hack.</p>
<p>Rust might seem to be the same as OOP languages in all of this; it
superficially looks like it has something very similar to classes. You
can define functions associated with a given type – and they are even
called methods! Like OOP methods, they syntactically privilege taking
values of that type (or references to those values) as the first argument,
called the special name <code>self</code>. You can even mark fields of a record type
(called <code>struct</code> in Rust) as public or (by default) private, encouraging
private fields just like in an object-oriented programming language.</p>
<p>According to this pillar, Rust seems pretty close to being OOP. And
that’s a fair assessment, for this pillar, and an intentional choice
to make Rust programming more comfortable to people used to the everyday syntax
of OOP programming in C++ (or Java, or JavaScript).</p>
<p>But the similarity is only skin-deep. Encapsulation is the least distinct
pillar of OOP (after all, all modern programming languages have some
form of it), and the implementation in Rust is not bound to the type.
When you declare a field private in Rust (by not specifying <code>pub</code>),
that doesn’t mean private to its methods, that means private to the
module. A module can provide multiple types, and any function in that
module, whether a “method” of that type or not, can access all of the
fields defined in that type. Passing around records is encouraged when
appropriate, rather than discouraged to the point that accessors are
forced instead, even in tightly-bound related code.</p>
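<p>A small sketch of what module-level privacy means in practice, assuming a
hypothetical <code>geometry</code> module with two entangled types. The free function
<code>length</code> is not a method of either type, yet it can read the private
fields of both because it lives in the same module:</p>

```rust
mod geometry {
    /// Fields without `pub` are private to the module, not to the type.
    pub struct Point {
        x: f64,
        y: f64,
    }

    pub struct Segment {
        start: Point,
        end: Point,
    }

    impl Point {
        pub fn new(x: f64, y: f64) -> Self {
            Point { x, y }
        }
    }

    impl Segment {
        pub fn new(start: Point, end: Point) -> Self {
            Segment { start, end }
        }
    }

    /// A plain function in the module: it reaches into the private
    /// fields of `Segment` and of `Point` without any accessors.
    pub fn length(segment: &Segment) -> f64 {
        let dx = segment.end.x - segment.start.x;
        let dy = segment.end.y - segment.start.y;
        (dx * dx + dy * dy).sqrt()
    }
}

fn main() {
    use geometry::{Point, Segment};
    let segment = Segment::new(Point::new(0.0, 0.0), Point::new(3.0, 4.0));
    println!("{}", geometry::length(&segment));
    // `segment.start` would be a compile error here, outside the module.
}
```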
<p>This is the first sign we see that Rust, in spite of its superficial
syntax, is not an OOP programming language.</p>
<h1 id="future-posts">Future Posts</h1>
<p>And at this point I’m going to have to pause for today.</p>
<p>Of course, encapsulation isn’t the only fancy thing OOP-style classes
can do. If it were, classes wouldn’t have enamored so many people: it
would simply be obvious to everyone that classes were nothing more than
glorified modules, and methods nothing more than glorified procedures.</p>
<p>In the next posts of this series, we will discuss the other features
associated with OOP, the two remaining traditional pillars of OOP,
polymorphism and inheritance, analyze them from a practical point of view,
and see how Rust compares with OOP when it comes to those pillars.</p>
<p>Next up will be polymorphism!</p>
How to Write a JIRA Ticket in ... Relatively Few Stepshttps://www.thecodedmessage.com/posts/jira/2022-10-31T00:00:00+00:00<p>If you’re confused by how to use JIRA effectively, do not worry!
If you learn this process, which is
<del>very simple</del> not literally impossible, you too can become
<del>good at JIRA</del> <del>passingly competent at JIRA</del> not liable to being
fired for being bad at JIRA.</p>
<p>Here are the steps:</p>
<ul>
<li>Create personal TODO item to write JIRA ticket
<ul>
<li>Accumulate requirements for JIRA ticket in personal notes
<ul>
<li>Often more complicated than the feature itself
<ul>
<li>This is the System Working™</li>
</ul>
</li>
</ul>
</li>
<li>Write TODO items strategizing how to:
<ul>
<li>Share the JIRA ticket with other people</li>
<li>Connect it properly with other JIRA tickets
<ul>
<li>Advanced: Also epics, projects, or other meta-JIRA constructs</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
<li>Write JIRA ticket
<ul>
<li>Fail to understand what any of the fields are for
<ul>
<li>Oh, they’re required?</li>
</ul>
</li>
<li>Ask random people for appropriate values for required fields
<ul>
<li>Sometimes they never get back to you</li>
<li>Or they get back two days from then</li>
<li>In the meantime, forget you were writing a JIRA ticket
<ul>
<li>And then get reminded only by personal TODO list item
<ul>
<li>You did write one of those, right?</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
<li>Curse the names of whoever designed the schema
<ul>
<li>Find out it’s someone you actually liked
<ul>
<li>It made sense at the time
<ul>
<li>No, it cannot be changed now</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
<li>Do follow up connecting JIRA ticket to other people’s JIRA
<ul>
<li>Argue with people about whether JIRA set-up appropriate</li>
<li>Reconcile said arguments</li>
</ul>
</li>
<li>Relitigate everything at next stand-up meeting
<ul>
<li>Potentially go back to beginning to write JIRA ticket again</li>
</ul>
</li>
<li>Be too tired to code anymore
<ul>
<li>What even is code?</li>
</ul>
</li>
</ul>
First Impressions of Asahi Linuxhttps://www.thecodedmessage.com/posts/first-impressions-asahi-linux/2022-10-24T00:00:00+00:00<p>I bought my M1 Mac over a year ago with the intention of installing Asahi
Linux on it, but I never got around to it until now. I am still thrilled
to be using an ARM workstation made by a major computer manufacturer,
and it’s good to be able to run the operating system of my choice on it
(though macOS is acceptable for entertainment and video calls, Linux
is what I work and do my <a href="https://www.thecodedmessage.com/tags/organization/">organization</a> in). And I
don’t particularly do GPU-intensive things in my day to day computing –
I run XMonad, of all things! – so I don’t really feel like I’m missing
out by not having a “proper” graphics driver.</p>
<h1 id="installation">Installation</h1>
<p>The Asahi Linux installation process, in spite of some dire
warnings, was relatively friendly. It was a “wizard” process
rather than a series of instructions to run individual commands that
I would have to read off a website.
Wizard is definitely better, because those instruction series almost
always contain mistakes, assumptions of things you’d “obviously” do,
or un-fleshed-out untested alternatives;
<a href="https://nixos.org/manual/nixos/stable/index.html#sec-installation">NixOS</a>
in particular has stolen from me many hours of frustration I’ll never
get back (and hours later of fixing configuration issues that resulted
just from me following instructions from official materials).</p>
<p>So, I guess simply because I’m comparing it to NixOS, Asahi Linux
felt extremely easy to install! I didn’t even mind that there wasn’t a
concrete recommendation for how much space to give each operating system
(although I would have appreciated it). The installer did, however,
do two things that annoyed me.</p>
<p>The <strong>first thing</strong> was that it asked me if I wanted to enter some sort of
an expert mode. It said that the questions it would ask in that mode were
only interesting to developers, and while normally that would be “yes,
absolutely me,” in this case I think they meant “developers <em>of</em> Asahi
Linux” – so, not me. I wanted to say <code>y</code> out of curiosity, but I didn’t
want to actually choose any wrong option and risk bricking my laptop –
which I don’t think were the actual stakes, but I wasn’t entirely sure.</p>
<p>I really hope that if I’d said <code>y</code>, it would have been okay. I would
hope that the default option in each “advanced” prompt would be the same
as what I’d get if I didn’t do advanced options, but I didn’t really
trust them to do that, and it was intimidating.</p>
<p>I’d much rather they said what the advanced options actually did,
and reassured me that you could always go with the pre-set defaults
if you were unsure, rather than just ask me if I wanted to do “expert
mode.”</p>
<p>So that was a little annoying.</p>
<p>The <strong>second thing</strong> that annoyed me was something that the designers
have definitely put some thought into, and I’m befuddled how they arrived
where they did.</p>
<p>So, there is one point where the computer is turned off, and
you must follow the instructions on how to turn the computer
back on very carefully and particularly, or else there be dragons,
because if you don’t boot it into recovery mode for the first boot,
then Linux will never install.</p>
<p>That isn’t the problem. I appreciate them communicating the stakes,
and communicating how it works. I’m sure it’s not their fault that
you have to do this extra step, but rather something to do with
how the M1’s firmware works. However, I am befuddled why they provide the
instructions in the most detail on the laptop where you’re currently
installing it – you know, a screen that’s immediately going to
disappear as soon as you turn the computer off. There were 7 steps!</p>
<p>It appears that I was expected to:</p>
<ul>
<li>Read all 7 steps very carefully</li>
<li>Memorize them (carefully!) and remember them when I turned the computer
back on</li>
</ul>
<p>Now, I have ADHD, so my short-term prospective memory is <em>very</em> poor.
There’s also a high chance that I’ll get distracted while the computer’s
off, and will have to come back to the turning-it-on step later.
But even a neurotypical person can’t be expected to reliably remember
how to do <em>7</em> steps <em>carefully</em>.</p>
<p>I took a picture of the instructions with my phone. I think they
should have:</p>
<ul>
<li>Suggested writing down or taking a picture of the instructions, because
“careful” is likely not good enough for many people.</li>
<li>Included all 7 instructions in detail on the website, so if you fail
to write it down, you get more than this condensed summary:</li>
</ul>
<blockquote>
<p>Once the first stage of the installation is done, you will have to reboot
into 1TR mode (One True recoveryOS) in order to finish the install. Read
the instructions that the installer prints carefully! Simply rebooting
into the new OS won’t work until this is done. You need to fully shut
down your machine, then boot by holding down the power button until
you see “Entering startup options”, choose your new OS in the boot
selector menu, and follow the prompts.</p>
</blockquote>
<p>The website references the transient “instructions that the installer
prints.” If anything, the installer should direct you to the website,
which should give the instructions in equal detail to how the
installer gives them:</p>
<p><img src="https://www.thecodedmessage.com/instructions.jpg" alt="Reboot Instructions"></p>
<p>In any case, what I actually did was panic, close the laptop,
panic again, open it again, realize that made it turn on, and hold
down the power button – which worked, in spite of blatantly violating
the instructions. So maybe warn people not to close and then re-open
the laptop, while you’re at it?</p>
<p>… Perhaps it’s moves like this that prevented me from installing
NixOS correctly, where they just kind of assume you wouldn’t do
something that dumb.</p>
<h1 id="first-boot">First Boot</h1>
<p>I haven’t dual booted a computer since I lived with my parents. Back then,
I either had to share a computer with them (my Linux partition and their
Windows) or, later, had only one computer that I could use in full privacy
but needed both Linux and a more “normal” OS – thus an iBook which ran
Mac OS and a PowerPC version of Ubuntu. Even when I ran FreeBSD and other
out-there OSes, I had a dedicated (old) full-tower desktop to run it from.</p>
<p>So the idea of dual-booting a “normal” OS that comes with the computer
and the more “edgy” programmer-friendly OS that is Linux is quite
nostalgic for me. I wondered whether there was any way to refer to macOS
with capitalism-criticizing character substitutions a la Mi¢ro$oft:
maybe macO$? And to be honest, I was even a little nervous that my
IT/sysadmin skills had rotted a little bit since I was a kid. Even though
this installer was bending over backwards to make everything easy,
this was an alpha operating system unsupported by the workstation
vendor.</p>
<p>But all went well.</p>
<p>Once you have it installed, the computer boots into Asahi Linux. You have to
hold down the power button to get the boot menu – it uses a firmware-based
boot manager to distinguish macOS and Linux. This is a little annoying,
as I prefer being asked what operating system I want to boot every time
in a dual boot set up, but I can deal with it.</p>
<p>The first boot requires a few remaining set up steps to select keyboard
layout, language, and time zone, and also to name the computer and set up
a default user. I named the computer <code>protectorate</code> as part of my
forms-of-government naming scheme (my Dell laptop is <code>palatinate</code>),
and in reference to the fact that this is Linux acting in somewhat foreign
territory, claimed by another Unix.</p>
<p>Once this set-up was complete, I turned on Wi-Fi, which to my mild
surprise worked immediately and like a charm, from the KDE-based graphical
Wi-Fi menu.</p>
<p>I mean, in all honesty, I kind of knew it would work the first time –
that was the point – but I was still viscerally surprised. I guess I
am used to the idea of getting Linux to run on a “new” or “odd” platform
being an issue of chasing down driver after driver, so I’m happy that
I have a distribution designed for basically exactly the computer that
I have, even if it’s not a computer particularly associated with Linux.</p>
<p>Then, as soon as I’d verified that Linux worked, I very nervously
rebooted the whole thing into macOS – which also worked. Yay!</p>
<h1 id="next-steps">Next Steps</h1>
<p>So that’s where I am now.</p>
<p>To get my normal Linux workflow set up, I’m going to need XMonad and
Dropbox. This should be interesting, as I understand neither of those
things is an Arch Linux package on ARM, and Dropbox isn’t supported
on Linux ARM at all (though you can maybe use their APIs directly to
implement a janky home version?)</p>
<p>So, when I get that all set up, I will let you know in another post!</p>
<p>Pictures will come with the next blog post.</p>
<p>I make no promises as to schedule.</p>
RAII: Compile-Time Memory Management in C++ and Rusthttps://www.thecodedmessage.com/posts/raii/2022-10-11T00:00:00+00:00<p>I don’t want you to think of me as a hater of C++. In spite of the fact
that I’ve been writing a <a href="https://www.thecodedmessage.com/tags/rust-vs-c++/">Rust vs C++ blog series</a>
in Rust’s favor (in which this post is the latest installment), I am very
aware that Rust as it exists would never have been possible without C++.
Like all new technology and science, Rust stands on the shoulders of
giants, and many of those giants contributed to C++.</p>
<p>And this makes sense if you think about it. Rust and C++ have very similar
goals. The C++ community has done a lot over all these years to pioneer
new programming language features in line with those goals. C++ has then
given these features years to mature in its humongous ecosystem. And
because Rust also doesn’t have to be compatible with C++, it can then
steal those features without some of the caveats they come with in C++.</p>
<p>One of the biggest such features – perhaps the biggest one – is
RAII, C++’s and now Rust’s (somewhat oddly-named) scope-based feature
for resource management. And while RAII is for managing all kinds of
resources, its biggest use case is as part of a compile-time alternative
to run-time garbage collection and reference counting.</p>
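<p>To make “scope-based” concrete, here is a minimal sketch in Rust. The
<code>Guard</code> type is hypothetical; it stands in for any resource, and simply
records in a log when it is released, which happens automatically when its
variable goes out of scope:</p>

```rust
use std::cell::RefCell;

/// Hypothetical resource handle that records its own release in a log.
struct Guard<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Guard<'_> {
    /// Runs automatically when a `Guard` goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("released {}", self.name));
    }
}

fn run() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _a = Guard { name: "a", log: &log };
        let _b = Guard { name: "b", log: &log };
        log.borrow_mut().push(String::from("working"));
        // End of scope: `_b` is dropped first, then `_a`
        // (reverse order of declaration). No explicit cleanup call.
    }
    log.into_inner()
}

fn main() {
    for line in run() {
        println!("{line}");
    }
}
```

<p>The same mechanism frees heap allocations owned by types like
<code>Box</code>, <code>Vec</code>, and <code>String</code>.</p>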
<p>As an alternative to garbage collection, RAII has deficits.
While many allocations are created and freed neatly in line with
variables coming in and out of scope, sometimes that’s not
possible. To fully compete with garbage collection and
capture the diverse ways programs use the heap, RAII
needs to be combined with other features.</p>
<p>And C++ has done a lot of this. C++ added move semantics in C++11,
which Rust also has – though cleaner in Rust because Rust was
designed with them from the start and so it can pull off <a href="https://www.thecodedmessage.com/tags/cpp-move/">destructive
moves</a>. C++ also has opt-in reference counting, which,
again, Rust also has.</p>
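<p>A brief sketch of both features on the Rust side, using only the standard
library (the function name <code>demo</code> is just for illustration). The move
hands ownership of the string to a new variable, and <code>Rc</code> opts that
one value into reference counting:</p>

```rust
use std::rc::Rc;

/// Returns the reference count while two handles exist,
/// and again after one of them has been dropped.
fn demo() -> (usize, usize) {
    let original = String::from("hello");

    // Destructive move: `original` is no longer usable after this line,
    // and the compiler rejects any later use of it.
    let moved = original;

    // Opt-in reference counting for this one value.
    let shared = Rc::new(moved);
    let second = Rc::clone(&shared); // bumps the count, no deep copy
    let while_shared = Rc::strong_count(&shared);

    drop(second); // one handle goes away, the string stays alive
    let after_drop = Rc::strong_count(&shared);

    (while_shared, after_drop)
    // `shared` is dropped here; the count reaches zero and the string is freed.
}

fn main() {
    println!("{:?}", demo());
}
```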
<p>But C++ still doesn’t have lifetimes (Rust got that from
<a href="https://homes.cs.washington.edu/~djg/papers/cyclone.pdf">Cyclone</a>,
which called them “regions”), nor the infamous borrow checker that goes
along with them in Rust. And even though the borrow checker is perhaps
the most hated part of Rust, in this post, I will argue that it brings
Rust’s RAII-centric compile-time memory management system much closer
to feature-parity with run-time reference counting and other run-time
garbage-collection technologies.</p>
<p>I will start by talking about the problem that RAII was originally
designed to solve. Then, I will re-hash the basics of how RAII works,
and work through memory usage patterns where RAII needs to be combined
with these other features, especially the borrow checker. Finally, I will
discuss the downsides of these memory management techniques, especially
performance implications and handling of cyclic data structures.</p>
<p>But before I get into the weeds, I have some important caveats:</p>
<blockquote>
<p><em>Caveat</em>: No Turing-complete programming language can completely
prevent memory leaks. Even in fully-GC’d languages, you can still
leak memory by filling up a data structure with increasing
amounts of unnecessary data. This can be done by accident,
especially when sophisticated callback systems are combined
with closures. This is out of the scope of this post, which
only concerns memory management issues that automated GC
can actually help with.</p>
<p><em>Caveat #2</em>: Rust allows you to <a href="https://doc.rust-lang.org/std/boxed/struct.Box.html#method.leak">leak memory on
purpose</a>,
even when a garbage collector would have reclaimed it. In extreme
circumstances, the reference counting system can be abused to leak
memory as well. This fact has been used in anti-Rust rhetoric to
imply its memory safety system is somehow worthless.</p>
<p>For the purposes of this post, we assume a programmer who is
trying to get actual work done and needs help not leaking memory
or causing memory corruption, not an adversarial programmer
trying to make the system leak on purpose.</p>
<p><em>Caveat #3</em>: RAII is a terrible name. OBRM (Ownership-Based Resource
Management) is used in Rust sometimes, and is a much better name.
I call it RAII in this article though, because that’s what most people
call it, even in Rust.</p>
</blockquote>
<h1 id="the-problem-manual-memory-management-is-hard-gc-is-slow">The Problem: Manual Memory Management is Hard, GC is “Slow”</h1>
<blockquote>
<p>Caveat: To be clear, “slow” here is an oversimplification, and I address
that more later. I mean it as a tongue-in-cheek way of saying that
it has performance costs, whereas Rust and C++ try to adhere
to a <em>zero</em>-cost principle.</p>
</blockquote>
<p>So. C-style manual memory management – “just call <code>free</code> when you’re
done with the allocation” – is error prone.</p>
<p>It is error prone when it is easy and tedious, because programmers can
make stupid mistakes and just forget to write <code>free</code> and it isn’t
immediately broken. It is error prone when multiple programmers work
together, because they might make different assumptions about who is
supposed to free something. It is error prone when multiple parts of the
code need to use the same data, especially when that usage changes with
new requirements and new features.</p>
<p>And the consequences of doing it wrong are not just memory
leaks. Use-after-free can lead to memory corruption, and bugs in one
part of the program can abruptly show up when allocation patterns change
somewhere else entirely.</p>
<p>This is a problem that can be solved with discipline, but like many
tedious clerical disciplines, it can also be solved by computer.</p>
<p>It can be solved at run-time, which is what garbage collection and
reference counting do. These systems do two things:</p>
<ul>
<li>They keep allocations from lasting too long. When memory becomes
unreachable, it can be reclaimed. This prevents memory leaks.</li>
<li>They keep allocations from being freed early. If memory is still
reachable, it will still be valid. This prevents memory corruption.</li>
</ul>
<p>And for most programmers and applications, this is good enough. And so
for almost all modern programming languages, this run-time cost is well
worth not troubling the programmer with the error-prone tedious tasks
of C-style manual memory management, enabling memory safety and resource
efficiency at the same time.</p>
<h2 id="gc-including-rc-has-costs">GC (including RC) Has Costs</h2>
<p>But there are costs to having the computer do memory management at
run-time.</p>
<p>I lump mark-sweep garbage collection and reference counting together here.
Both mark-sweep garbage collection and reference counting have costs above
C-style manual memory management that make them unacceptable according to
the zero-cost principle. GC comes with pauses, and additional threads,
in the best case. RC comes with myriad increments and decrements to a
reference count. These costs might be small enough to be okay for your
application – and that’s well and good – but they are costs, and
therefore they can’t be the main memory management model in C++ or Rust.</p>
<p>This is a complicated issue, and so before continuing, here comes
another caveat:</p>
<blockquote>
<p>Caveat: GC is not <em>necessarily</em> slower, but it does have performance
implications that are often unacceptable for situations where C++
(or Rust) is used. To achieve its full performance, it needs to be
enabled for the entire heap, and that has costs associated with it.
For these reasons, C++ and Rust do not use GC. The details of these
performance trade-offs are beyond the scope of this blog post.</p>
</blockquote>
<h2 id="a-dilemma">A Dilemma</h2>
<p>But C++ and Rust are not most programming languages. They face a dilemma:</p>
<ul>
<li>On the one hand, manual memory management
is unacceptably error prone for a high level language, a detail the
computer should be able to handle for you.</li>
<li>On the other hand, run-time garbage collection violates a fundamental
goal that C++ and Rust share: the zero-cost principle. Code written
in these languages is supposed to be as performant as the equivalent
manually-written C. To conform to that principle, reference counting
(or GC) has to be opt-in (because, after all, sometimes manually written
C code does use these technologies).</li>
</ul>
<p>So, for the vast majority of situations, where a C programmer wouldn’t
use reference counting (or mark-sweep), Rust and C++ need something more
sophisticated. They need tools to prevent memory management mistakes –
that is, to at least partially automate this tedious and error-prone
task – without sacrificing any run-time performance.</p>
<p>And this is the reason C++ invented (and Rust appropriated) RAII. Instead
of addressing the problem at run-time, RAII automates memory management
at compile-time. Analogous to how templates and trait monomorphization
can bring some but not all of the power of polymorphism without many of
the run-time costs, RAII brings some but not all of the power of garbage
collection without constant reference count updates or GC pauses.</p>
<p>But as we will see, RAII as C++ implements it only solves one of the
two problems addressed by garbage collection: leaks. It cannot address
memory corruption; it cannot keep allocations alive long enough for all
the code that could possibly need to use them.</p>
<h1 id="raw-raii-how-raii-works-on-its-own">Raw RAII: How RAII Works on its Own</h1>
<p>The simplest use case for RAII is underwhelming: it automatically inserts
calls to free up heap allocations at the end of the block where
we made the allocation. It replaces a <code>malloc</code>/<code>free</code> sandwich
from C with simply the allocation side, by inserting an
implicit (and unwritten) call to a destructor, which in its
simplest version is an equivalent of <code>free</code>. And if that
was all RAII did, it wouldn’t be that interesting.</p>
<p>For example, take this C-style (no RAII) code:</p>
<pre tabindex="0"><code>void print_int_little_endian_decimal(int foo) {
    // Little endian decimal print of `foo`
    // i.e. backwards from how we normally write decimal numbers
    // e.g. 831 prints out as "138"
    // Big endian would be too hard
    // Little endian is as always actually simpler platonically,
    // if somehow not for humans.
    // Yes, this only works for positive ints. It's an example.
    char *buffer = malloc(11);
    for(char *it = buffer; it < buffer + 10; ++it) {
        *it = '0' + foo % 10;
        foo /= 10;
        if (foo == 0) {
            it[1] = '\0';
            break;
        }
    }
    puts(buffer); // put-string, not the 3sg verb form "puts"
    free(buffer); // Don't forget to do this!
}
</code></pre><p>Just using RAII (and <code>unique_ptr</code>s, which are an essential part of
the RAII model), but using no other features of C++, we get this very
unidiomatic and unimpressive version:</p>
<pre tabindex="0"><code>void print_int_little_endian_decimal(int foo) {
    std::unique_ptr<char[]> buffer{new char[11]};
    for(char *it = &buffer[0]; it < &buffer[10]; ++it) {
        *it = '0' + foo % 10;
        foo /= 10;
        if (foo == 0) {
            it[1] = '\0';
            break;
        }
    }
    puts(&buffer[0]);
}
</code></pre><p>It doesn’t help us with our random guess of an appropriate buffer
size, our awkward redundant attempts to avoid a buffer-overflow, or
with any abstraction over the fact that we’re trying to implement
a collection.</p>
<p>In fact, it makes the code more awkward, for a benefit that seems hardly
worth it, to just automatically call <code>free</code> at the end of the block –
which might not even be where we want to call free! We could instead
have wanted to return the data to the caller, or inserted it into
a bigger, greater data structure, or similar.</p>
<p>It’s a bit less ugly when you use C++’s abstractions. Destructors
don’t have to just call <code>free</code> (or rather its C++ analogue <code>delete</code>) as
<code>unique_ptr</code>’s does. Any C programmer can tell you that idiomatic
C code is rife with custom free functions to free all of the allocations
of a data structure, and C++ (and Rust) will choose which destructor
to call for you based on the type of the data. Calling <code>free</code> when
a custom destructor must be called is a common careless mistake in C.
This is true especially among beginners, and (hot take!) making
programming languages less needlessly tricky for beginners is a good
thing for everybody.</p>
<p>We can combine RAII with other features of C++ to get this more
idiomatic code, with the first <code>do</code>-<code>while</code> loop I’ve written
in years:</p>
<pre tabindex="0"><code>void print_int_little_endian_decimal(int foo) {
    std::string res;
    do {
        res += '0' + foo % 10;
        foo /= 10;
    } while (foo != 0);
    std::cout << res << std::endl;
}
</code></pre><p>Does <code>std::string</code> allocate memory on the heap? Maybe it only does
if the string goes above a certain size. But the custom destructor,
<code>~std::string</code>, will call <code>delete[]</code> only when the allocation was actually
made, abstracting that question away, along with handling terminating
nuls and avoiding overruns in a cleaner way.</p>
<p>This ability of RAII – to call custom destructors that abstract away
allocation decisions – gets more impressive when we consider that
many data structures don’t make just 0 or 1 heap allocations, but whole
complicated trees of complicated heap allocations. In many cases, C++
(and Rust) will write your destructors for you, even for complicated
types like this:</p>
<pre tabindex="0"><code>struct PersonRecord {
    std::string name;
    uint64_t salary;
};
std::unordered_map<std::string, std::vector<PersonRecord>> thing;
</code></pre><p>To destroy <code>thing</code> in C, you’d have to loop through the hash map, free
all the keys, and then free all the values, which then requires freeing
all the strings in each <code>PersonRecord</code> before freeing the backing for
each vector. Only then could you free the actual allocations backing the
hash map.</p>
<p>And perhaps a C-based hash map library could do this for you, but only
by assuming that the keys are strings, and then taking a function
pointer to know how to free the values, which would ironically
be a form of dynamic polymorphism and therefore a performance hit. And
the function to free the values would then still have to manually free the
string, knowing which field of the <code>PersonRecord</code> was a pointer and
duplicating that information between the structure and the manually-written
“free” function, and still likely not supporting the small-string
optimization that C++ enables.</p>
<p>In C++, this freeing code is all automatically generated. <code>PersonRecord</code>
gets an automatic destructor that calls the destructor of each
field (<code>uint64_t</code>’s destructor is trivial), and the destructors of
<code>std::unordered_map</code> and <code>std::vector</code> are templated so that, at compile
time, a fresh destructor is built from those templates that handles all
of this, all without any indirect function calls or run-time cost beyond
what would be written manually for exactly this data structure in C.</p>
<p>See, with RAII, a destructor isn’t just automatically and implicitly
called at the end of a scope in a function, but also in the destructors
of values (“objects” in C++) that <em>own</em> other values. Even if you
do write a custom destructor for aggregate types, that just specifies
what the computer should do on destruction <em>beyond</em> the automatic
calls to the destructors of the fields, which are still implicit.</p>
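<p>The same rule holds in Rust, and it can be observed with a small sketch.
The <code>Field</code> and <code>Record</code> types and the shared log below are
hypothetical names made up for illustration. In Rust, the custom <code>drop</code>
of the aggregate runs first, and then the fields are dropped implicitly in
declaration order:</p>

```rust
use std::cell::RefCell;

// Hypothetical field type that records its own destruction in a shared log.
struct Field<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Field<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop field {}", self.name));
    }
}

#[allow(dead_code)] // `first` and `second` exist only to be dropped
struct Record<'a> {
    first: Field<'a>,
    second: Field<'a>,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Record<'_> {
    // Only the work *beyond* dropping the fields is written here.
    fn drop(&mut self) {
        self.log.borrow_mut().push(String::from("drop record"));
    }
}

// Builds a `Record`, lets it go out of scope, and returns what was logged.
fn drop_order() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _record = Record {
            first: Field { name: "first", log: &log },
            second: Field { name: "second", log: &log },
            log: &log,
        };
    } // `_record` is dropped here; no destructor call is written anywhere
    log.into_inner()
}

fn main() {
    for line in drop_order() {
        println!("{line}");
    }
}
```

<p>Nothing in <code>Record</code>’s own <code>drop</code> mentions the fields, yet
both of their destructors still run.</p>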
<h1 id="ownership-and-its-limitations">Ownership and its limitations</h1>
<p>This is all possible based on the concept of “ownership,” one of the key
principles of RAII. The key assumption is that every allocation has one
owner at any given time. Allocations can own each other (forming a tree
of allocations), or a scope can own an allocation (forming the root of
such a tree). RAII then can make sure the allocation ends when its owner
does – by the scope exiting, or when the owning object is destroyed.</p>
<p>But what if the allocation needs to outlive its parent, or its scope?
It’s not always the case that a function has primitive types as its
arguments and return value, and then only constructs trees of allocations
privately. We need to take these sophisticated collections and pass
them as arguments to functions. We need to have them be returned from
functions.</p>
<p>This becomes apparent if we try to refactor our
little-endian integer decimalizer to allow us to do other things with
the resultant string besides print it:</p>
<pre tabindex="0"><code>std::string render_int_little_endian_decimal(int foo) {
    std::string res;
    do {
        res += '0' + foo % 10;
        foo /= 10;
    } while (foo != 0);
    return res;
}

int main() {
    std::cout << render_int_little_endian_decimal(3781) << std::endl;
    return 0;
}
</code></pre><p>Based on our previous discussion of RAII, you might assume that
the <code>~std::string</code> destructor is called at the end of its scope,
rendering the allocation unusable for later printing, but instead
this code “Just Works.”</p>
<p>We’ve hit one of many mitigations against the limitations of raw
RAII that are necessary for it to work. This mitigation is the “Named
Return Value Optimization (NRVO),” which allows the compiler, when the
same named variable is used in all of the <code>return</code> statements in a
function, to construct (and destruct) that variable in the context of the
caller. It is misnamed an “optimization” because it affects the semantics:
It eliminates entirely the call to the destructor at the end of the scope,
even if that destructor call would have side effects. Where a compiler does
not apply it, the returned variable is moved rather than copied (since
C++11), so the code still works.</p>
<p>This is just one of many ways RAII is made competitive with run-time
garbage collection, so that we can have values that outlive
a particular function scope. This one is narrow and peculiar to C++,
but many of the others lead to interesting comparisons. In the next
section, we discuss the others.</p>
<h1 id="filling-the-gaps-in-raii">Filling the Gaps in RAII</h1>
<h2 id="copyingcloning">Copying/Cloning</h2>
<p>We’re going to start with one of the oldest of these: copying.
When C++ was designed, the intention was that the programmer would
not see a difference between types that don’t involve allocation
(like <code>int</code> or <code>double</code>) and types that do (like <code>std::string</code> or
<code>std::unordered_map<std::string, std::vector<std::string>></code>).</p>
<p>When a function takes an <code>int</code> argument, as in
<code>print_int_little_endian_decimal</code>, that integer is copied.
Similarly, if we take a <code>std::string</code> argument without additional
annotation, C++ will also make a copy:</p>
<pre tabindex="0"><code>int parse_int_le(std::string foo) {
    int res = 0;
    int pos = 1;
    for (char c: foo) {
        res += (c - '0') * pos; // No input validation -- example!
        pos *= 10;
    }
    return res;
}

int main(int argc, char **argv) {
    if (argc < 2) {
        return 1; // no argument given, so argv[1] is not a usable string
    }
    std::string s = argv[1];
    std::cout << parse_int_le(s) << std::endl;
    return 0;
}
</code></pre><p>This is indeed consistent. Treating <code>int</code>s and <code>std::string</code> objects
in parallel ways is also in line with how higher-level programming
languages sometimes work: a string is a value, an <code>int</code> is a value,
why not give them the same semantics? Aliasing is confusing, why not
avoid it with copying?</p>
<p>It’s made to work by an implicit function call. Just like destructor
calls are implicit in C++, copying also calls a function in the type’s
implementation. Here, it calls <code>std::string</code>’s “copy constructor.”</p>
<p>The problem here is that this is slow. Not only is an unnecessary
copy made, but an unnecessary allocation and deallocation creep in.
There is no reason not to use the same allocation the caller already
has, here in <code>s</code> from the <code>main</code> function. A C programmer would never
write this copying version.</p>
<p>The only reason this feature is allowed under C++’s zero-cost
principle is because it is optional. It may be the default – and
making it the default is one of the most questionable decisions C++
ever made – but we can still alias if we want to. It just takes
more work.</p>
<p>Rust, as you can guess by my tone, requires explicit annotation to
copy types that have an allocation. In fact, Rust doesn’t even use the
term “copy,” which is reserved for types that can be copied without
allocations. It calls this cloning, and requires use of the <code>clone()</code>
method to accomplish it.</p>
<p>Some types don’t use an allocation, and “copying” them is just a simple
memory copy. Some types do use an allocation, and “cloning” them requires
allocating. This distinction is important and fundamental to how computers
work. It’s relevant and visible in Java and even Python, and pretending
it doesn’t exist is unbecoming for a systems programming language like
C++.</p>
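<p>As a minimal sketch of that distinction in Rust (the function name
<code>copy_and_clone</code> is made up for this example): assigning a
<code>u32</code> is an implicit memory copy, while duplicating a
<code>String</code> only happens when <code>clone()</code> is written out, and it
produces a second heap buffer:</p>

```rust
// Returns (copied integer, original integer, whether the clone has its own buffer).
fn copy_and_clone() -> (u32, u32, bool) {
    let n: u32 = 138;
    let m = n; // `u32` is `Copy`: a plain memory copy, and `n` stays usable

    let s = String::from("138");
    let t = s.clone(); // explicit: allocates a second heap buffer and copies into it

    // Two distinct heap allocations now exist, one owned by each string.
    let separate_buffers = s.as_ptr() != t.as_ptr();
    (m, n, separate_buffers)
}

fn main() {
    let (m, n, separate_buffers) = copy_and_clone();
    println!("{m} {n} {separate_buffers}");
}
```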
<h2 id="moves">Moves</h2>
<p>Returning an allocation from a function can’t always use NRVO. So if
you want your value to outlast your function, but it’s created inside
the function (and therefore “owned” by the function scope), what you
really need is a way for the value to change owners. You need to
be able to move the value from the scope into the caller’s scope.
Similarly, if you have a value in a vector, and need to remove the
last value, you can move it.</p>
<p>This is distinct from copying, because, well, no copy is made –
the allocation just stays the same. The allocation is “moved” because
the previous scope no longer has responsibility for destroying the allocation,
and the new scope gains the responsibility.</p>
<p>Move semantics fix the most serious issue with RAII: your allocation
might not live exactly as long as its owner. The root of an allocation
tree might outlive the stack-based scope it’s in, such as when you want to
return a collection from a function. The other nodes of an allocation tree
might leave that tree and be owned by another stack frame, or by another
part of the same allocation tree, or by a different allocation tree. In
general, “each allocation has a unique owner” becomes “each allocation
has a unique owner at any given time,” which is much more flexible.</p>
<p>In Rust, this is done via “destructive moves,” which oddly enough
means <em>not</em> calling the destructor on the moved-from value. In fact,
the moved-from value ceases to be a value when it’s moved from, and
accessing that variable is no longer permitted. The destructor is
then called as normal in the place where the value is moved to. This
is tracked statically at compile-time in the vast majority of situations,
and when it cannot be, an extra boolean is inserted as a <a href="https://doc.rust-lang.org/nomicon/drop-flags.html">“drop
flag”</a> (“drop” is
how Rust refers to its destructors).</p>
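<p>A small sketch can make the “exactly one destructor call” property
visible. The <code>Tracked</code> type and its counter are hypothetical, invented
for this example. The value is moved on only one branch, which is the kind of
situation where the compiler cannot decide statically whether the scope still
owns it:</p>

```rust
use std::cell::Cell;

// Hypothetical type that counts how many times its destructor runs.
struct Tracked<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Takes ownership; the value is dropped when this function returns.
fn consume(_value: Tracked<'_>) {}

// Returns how many destructor calls happened in total.
fn run(move_it: bool) -> u32 {
    let drops = Cell::new(0);
    {
        let value = Tracked { drops: &drops };
        if move_it {
            consume(value); // moved: `value` may not be used on this path again
        }
        // Whether `value` still owns anything here depends on `move_it`,
        // which is the kind of case a drop flag is for.
    }
    drops.get()
}

fn main() {
    println!("{} {}", run(true), run(false));
}
```

<p>Either way the destructor runs once: inside <code>consume</code> if the
value was moved, or at the end of the block if it was not.</p>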
<p>C++ didn’t add move semantics until C++11; it was not part of the
original RAII scheme. This is surprising given
how essential moves are to RAII. Returning collections
from functions is super important, and you can’t copy every time.
But before C++11, there were only poor man’s special cases for move,
like NRVO and the related RVO for objects constructed in the return
statement itself. These have completely different semantics from C++
move semantics, and they’re still more efficient than C++ moves in
many cases.</p>
<p>When C++ did eventually add moves, the other established semantics of C++
forced it to add moves in a weird and deeply confusing way: it added
“non-destructive” moves. In C++, rather than the drop flag being a flag
inserted by the compiler, it is internal to the value. Every type that
supports moves must have a special “empty state,” because the destructor
is called on the moved-from value. If the allocation had moved to
another value, there would be no allocation to free, and this had
to be handled by the destructor at run-time, which can amount to
a violation of the zero-cost principle in some situations.</p>
<p>C++ justifies this by making moves a special case of copy. Moves
are said to be like copies, but make no promises of preserving the
initial value. In exchange, you might get the optimization of being
able to use the original allocation, but then the initial value will
not have an allocation, and will be forced to be different. This
definition is very different than what moves are actually used for
(cf. the name of the operation), and therefore, even though it is
<a href="https://herbsutter.com/2020/02/17/move-simply/">technically simple</a>,
claiming that focusing on that definition (as Herb Sutter does) will
simplify things for the programmer is disingenuous, as I discuss in my
post on <a href="https://www.thecodedmessage.com/posts/cpp-move/">move semantics</a>.</p>
<p>In practice, this means that all types support the operation of moving –
even <code>int</code>s – but even some types that manage an allocation might fall
back on copying if moves haven’t been implemented for them. This
inconsistency, like all inconsistencies, is bad for programmers.</p>
<p>In practice, this also means that moved-from objects are a problem.
A moved-from object might stay the same, if no moving was done. It
might also change in value, if the move caused an allocation
(or other resource) to move into the new object. This forces
C++ smart pointers to choose between movability and non-nullability –
no movable, non-nullable pointer is possible in C++. Nulls – and the
other “moved-from” empty collections that you get from C++ move
semantics – can then be referenced later on in the function,
and though they must be “valid” values of the object, they are
probably not the values you expect, and in the case of null pointers,
they are famously difficult values to reason about.</p>
<p>This is a consequence of the fact that C++ was a pioneer of RAII
semantics, and didn’t design RAII and moves together from the
start. Rust has the advantage of having included moves from
the beginning, and so Rust move semantics are much cleaner.</p>
<p>In Rust also, all types can be moved. But in Rust, no resources or
allocations are ever copied. Instead, moves always have the same
implementation: copy the memory that is stored in-line in the value
itself, and then do not call the destructor. For copyable types like
<code>int</code> that do not manage an allocation or other resource, this does
amount to a copy, but the original is still not usable. But no allocation
or resource is ever copied; for those types, the pointer or handle is
simply brought along bit-by-bit just like other data, and the old value
is never touched again, making this a safe operation.</p>
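<p>This can be checked directly. In the sketch below (the function name is
made up), the address of a <code>String</code>’s heap buffer is recorded before a
move and compared afterwards; only the in-line pointer, length, and capacity
travel to the new owner:</p>

```rust
// Returns true if the heap buffer has the same address before and after the move.
fn move_keeps_allocation() -> bool {
    let s = String::from("hello");
    let before = s.as_ptr(); // address of the heap buffer
    let t = s; // move: pointer, length, and capacity are copied; nothing is allocated
    // println!("{s}"); // would be rejected: `s` was moved from
    before == t.as_ptr()
}

fn main() {
    println!("{}", move_keeps_allocation());
}
```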
<p>All types must then be written in such a way to assume that
values might not stay in the same place in memory. If some operations on
a type can’t be written that way, they can be defined on “pinned”
versions of that type. A pin is a type of reference or box that
promises that the pointed-to value will never move again. The underlying
type is still movable, but these particular values are not.</p>
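<p>Here is a rough sketch of what pinning looks like in practice, using
<code>Box::pin</code> and <code>PhantomPinned</code> from the standard library.
The <code>Anchored</code> type is hypothetical and stands in for a type with
operations that rely on a stable address:</p>

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// Hypothetical type that opts out of `Unpin`, as a self-referential type would.
struct Anchored {
    data: u32,
    _pin: PhantomPinned,
}

// Returns true if the pinned value stays at one address while its handle moves.
fn pinned_address_is_stable() -> bool {
    let pinned: Pin<Box<Anchored>> = Box::pin(Anchored {
        data: 7,
        _pin: PhantomPinned,
    });
    let before = &*pinned as *const Anchored;

    let handle = pinned; // moves the pointer, not the `Anchored` it points to
    let after = &*handle as *const Anchored;

    // Safe code cannot get `&mut Anchored` out of `handle`, so it cannot
    // `std::mem::swap` or otherwise move the value out from behind the pin.
    before == after && handle.data == 7
}

fn main() {
    println!("{}", pinned_address_is_stable());
}
```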
<p>This is a gnarly exception to Rust’s “all types can be moved” rule that
makes it false in practice, though still true in pedantic, language-lawyery
theory. But that’s not important. What is important is that Rust’s
move semantics are consistent, and do not rely on move constructors
and manual implementations of Rust’s drop flags within the object. The
dangerous possibility of interacting with a moved-from object, whose value
is unpredictable and quite possibly a special “empty” state like null,
is not present in Rust.</p>
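<p>For contrast, and only as a loose analogy, the nearest thing safe Rust
offers to a C++-style “empty” moved-from state is explicit and opt-in:
<code>std::mem::take</code> swaps a value out and leaves the type’s
<code>Default</code> behind. It is not a move in the sense above, and the
function name below is made up for the example:</p>

```rust
// Returns (what is left behind in the source, what was taken out of it).
fn take_leaves_default() -> (String, String) {
    let mut source = String::from("138");
    // Explicitly swap the contents out, leaving `String::default()` (empty) behind.
    let taken = std::mem::take(&mut source);
    (source, taken)
}

fn main() {
    let (left_behind, taken) = take_leaves_default();
    println!("{left_behind:?} {taken:?}");
}
```

<p>The empty value only appears because the code asked for it by name; an
ordinary move never leaves one lying around.</p>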
<h2 id="borrows-in-rust">Borrows in Rust</h2>
<p>While moves cover returning a collection (or other resource-managing
value) from a function, they don’t cover passing such a value into a
function, or at least not in the general case. Sometimes, when we pass
a value into a function, we want to move the value in, so that the
function can consume it or add it to an allocation tree (like inserting
into a collection). But most times, we want the function to be able
to see and perhaps mutate it, but then we want to give it back to
the owner.</p>
<p>Enter the borrow.</p>
<p>In Rust, borrows are commonly introduced as a sort of an improvement on
moves. Consider our example function that parses a string to an <code>int</code>,
here implemented in C++ with copies:</p>
<pre tabindex="0"><code>int parse_int_le(std::string foo) {
    int res = 0;
    int pos = 1;
    for (char c: foo) {
        res += (c - '0') * pos; // No input validation -- example!
        pos *= 10;
    }
    return res;
}
</code></pre><p>Here is a Rust version, with moves, so that the function consumes
the string:</p>
<pre tabindex="0"><code>use std::env::args;
fn parse_int_le(foo: String) -> u32 {
    let mut res = 0;
    let mut pos = 1;
    for c in foo.chars() {
        res += (c as u32 - '0' as u32) * pos;
        pos *= 10;
    }
    res
}

fn main() {
    let mut args: Vec<String> = args().collect();
    println!("{}", parse_int_le(args.remove(1)));
}
</code></pre><p>As we can see with the “move” version of this, we are in the awkward
position of removing the string from the vector, so that <code>parse_int_le</code>
can consume the string, so it doesn’t have multiple owners.</p>
<p>But <code>parse_int_le</code> doesn’t need to own the string. In fact, it could
be written so that it can give the string back when it’s done:</p>
<pre tabindex="0"><code>fn parse_int_le(foo: String) -> (u32, String) {
    let mut res = 0;
    let mut pos = 1;
    for c in foo.chars() {
        res += (c as u32 - '0' as u32) * pos;
        pos *= 10;
    }
    (res, foo)
}
</code></pre><p>“Taking temporary ownership” in real life is also known as
borrowing, and Rust has such a feature built-in. It is more powerful
than the above code that literally takes temporary ownership, though.
That code would have to remove the string from the vector and then
put it back – which is even more inefficient than just removing it.
Rust borrowing allows you to borrow the string even while it’s inside
the vector, and it stays inside the vector. This is implemented by
a Rust reference, which has these borrowing semantics, and is,
like most “references,” implemented as a pointer at the machine level.</p>
<p>In order to accomplish these semantics, Rust has its infamous borrow
checker. While we are borrowing something inside the vector, we can’t
simultaneously be mutating the vector, which could cause the thing we’re
borrowing to move. Rust statically ensures that this is impossible,
rejecting code that uses a reference after a mutation, destruction,
or move somewhere else would invalidate it.</p>
<p>This enables us to extend the RAII-based system and both prevent
leaks and maintain safety, just like a GC or RC-based system. The
borrow checker is essential to doing so.</p>
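<p>A short sketch of the rule, with a made-up function name: borrowing an
element and then mutating the vector is accepted as long as the borrow is no
longer used afterwards. The rejected ordering is left in a comment:</p>

```rust
// Reads the first element through a borrow, then mutates the vector.
fn first_len_then_push(words: &mut Vec<String>) -> usize {
    let first: &String = &words[0]; // shared borrow of an element inside the vector
    let len = first.len();

    // Using `first` after a mutation would be rejected at compile time,
    // because `push` may reallocate and leave `first` dangling:
    //     words.push(String::from("new"));
    //     println!("{first}");

    words.push(String::from("new")); // fine: `first` is not used after this point
    len
}

fn main() {
    let mut words = vec![String::from("138")];
    let len = first_len_then_push(&mut words);
    println!("{len} {words:?}");
}
```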
<p>For completeness, here is the idiomatic way to handle the
parameter in <code>parse_int_le</code>, with an actual borrow, using
<code>&str</code>, the special borrowed form of <code>String</code> that also allows
slices:</p>
<pre tabindex="0"><code>use std::env::args;
fn parse_int_le(foo: &str) -> u32 {
    let mut res = 0;
    let mut pos = 1;
    for c in foo.chars() {
        res += (c as u32 - '0' as u32) * pos;
        pos *= 10;
    }
    res
}

fn main() {
    let args: Vec<String> = args().collect();
    println!("{}", parse_int_le(&args[1]));
}
</code></pre><h2 id="dodging-memory-safety-in-c">Dodging memory safety in C++</h2>
<p>In C++, of course, there is no borrow checker. In the <code>parse_int_le</code>
example, it’s still possible to use a pointer, or a reference, but then
you’re on your own. When RAII-based code frees your allocation, your
reference is invalidated, which means it’s undefined behavior to use it.
No coordination is performed by the compiler between the RAII/move
system and your references, which point into the ownership tree
with no guarantee that said tree won’t move underneath it.
This can lead to memory corruption bugs, with security implications.</p>
<p>It’s not just pointers and references. Other types that contain
references, such as iterators, can also be invalidated. Sometimes
those are more insidious because intermediate C++ programmers might
know about pointer invalidation, but let their guard down with
iterators. If you add to a vector while looping through it,
you’ve just done undefined behavior, and that’s surprising because
no pointers or references even have to show up. Rust’s borrow
checker handles these as well.</p>
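<p>In Rust, the equivalent of adding to a vector while looping over it does
not compile, so the code has to be restructured. One possible restructuring,
sketched with a hypothetical function name, is to finish iterating before
mutating:</p>

```rust
// Appends the double of every even element, without mutating while iterating.
fn append_doubled_evens(values: &mut Vec<i32>) {
    // The direct version is rejected, because the loop holds a borrow of
    // `values` for as long as the iterator is alive:
    //     for v in values.iter() {
    //         if v % 2 == 0 { values.push(v * 2); }
    //     }

    // One accepted alternative: finish iterating first, then mutate.
    let additions: Vec<i32> = values
        .iter()
        .copied()
        .filter(|v| v % 2 == 0)
        .map(|v| v * 2)
        .collect();
    values.extend(additions);
}

fn main() {
    let mut values = vec![1, 2, 3, 4];
    append_doubled_evens(&mut values);
    println!("{values:?}");
}
```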
<p>Even though the Rust borrow checker gets a bad reputation, its safety
guarantees often make it worth it. It’s hard to write correct C++ when
references and non-owning pointers are involved. Maybe some of you
have that skill, and are unsympathetic to those who don’t yet have it,
but it is a specialized skill, and the compiler can do a lot of the work
for you, by checking your work. Automation is a good thing, and so is
making systems programming more accessible to beginners.</p>
<p>And of course, many C++ programmers do make mistakes. Even if it’s not
you, it might be one of your colleagues, and then you’ll have to clean
up the mess. Rust addresses this, and limits this more difficult mode
of thinking to writing unsafe code, which can be contained in modules.</p>
<h2 id="multiple-ownership">Multiple Ownership</h2>
<p>In RAII, an allocation has one owner at a time, and if your owner is destroyed
before the allocation is moved to another owner, the allocation must be
destroyed along with it.</p>
<p>Of course, sometimes this isn’t how your allocations work. Sometimes they need
to live until both of two parent allocations are destroyed, and sometimes
there is no way to predict which parent is destroyed first. Sometimes,
the only way to solve that situation – even in C – is to use runtime
information – and so you can model <em>multiple ownership</em> through reference
counting: <code>std::shared_ptr</code> in C++, or <code>Rc</code> and <code>Arc</code> in Rust (depending
on whether it is shared between multiple threads).</p>
<p>This is something that C programmers will sometimes do in the face
of complicated allocation DAGs, and end up implementing bespoke on a
framework-by-framework basis (cf. GTK+ and other C GUI frameworks).
C++ and Rust are just standardizing the implementation of this, but, in
line with the zero-cost rule, making it optional.</p>
<p>Interestingly enough, reference counting is implemented in terms of
RAII and moves. The destructor for a reference-counted pointer decreases
the reference count, and cloning/copying such a pointer increases it. Moves,
of course, don’t change it at all.</p>
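<p>The count can be watched with <code>Rc::strong_count</code>. In this sketch
(the function name is made up), cloning raises the count, moving a handle
leaves it alone, and dropping a handle lowers it:</p>

```rust
use std::rc::Rc;

// Returns the strong count after: creation, a clone, a move, and a drop.
fn counts() -> (usize, usize, usize, usize) {
    let a = Rc::new(String::from("shared"));
    let after_new = Rc::strong_count(&a);

    let b = Rc::clone(&a); // a second owner: the count goes up
    let after_clone = Rc::strong_count(&a);

    let moved = b; // a move: same owner under a new name, count unchanged
    let after_move = Rc::strong_count(&a);

    drop(moved); // the destructor runs: the count goes back down
    let after_drop = Rc::strong_count(&a);

    (after_new, after_clone, after_move, after_drop)
}

fn main() {
    println!("{:?}", counts());
}
```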
<h1 id="raii-what-this-all-adds-up-to">RAII+: What this all adds up to</h1>
<p>Between RAII, moves, reference counting, and the borrow checker, we now
have the memory management system of safe Rust. Safe Rust is a powerful
programming language, and in it, you can write programs almost as easily
as in a traditionally GC’d programming language like Java, but
get the performance of manually written, manually memory managed C.</p>
<p>The cost is annotation. In Java, there is no distinction between
“borrowing” and “owning”, even though sometimes the code follows
similar structures as if there were. In Rust, the compiler must
be informed about the chain of owners, and about borrowers.
Every time an allocation crosses scope boundaries or is referred
to inside another allocation, you must write different syntax
to tell Rust whether it’s a move or a borrow, and it must
comply with the rules of the borrow checker.</p>
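<p>To make the annotation concrete, here is a small sketch of the two cases.
The function names <code>total</code> and <code>consume</code> are made up for
this example; the only difference that matters is the <code>&amp;</code> in the
signature and at the call site:</p>

```rust
/// Borrows the values: the caller remains the owner.
/// (`total` is a hypothetical name used for illustration.)
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

/// Takes ownership: the vector is moved in, and it is freed
/// when this function returns.
/// (`consume` is a hypothetical name used for illustration.)
fn consume(values: Vec<i32>) -> usize {
    values.len()
}

fn main() {
    let numbers = vec![1, 2, 3];

    // The `&` marks a borrow; `numbers` is still usable afterwards.
    let sum = total(&numbers);

    // No `&`: this is a move, and `numbers` has a new owner.
    let count = consume(numbers);

    // Using `numbers` here would be rejected by the compiler,
    // because its allocation has been moved away.
    println!("sum = {sum}, count = {count}"); // prints "sum = 6, count = 3"
}
```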
<p>But it turns out most code has a natural progression of owners,
and most borrows are valid in the borrow checker. When they’re not,
it’s usually straightforward to rethink the code so that it can
work that way, and the resultant code is usually cleaner anyway.
And in situations where neither works, reference counting
is still an option.</p>
<p>At the cost of this annotation, Rust gives you everything a GC
does: Allocations are freed when their handles go out of scope,
and memory safety is still guaranteed, because the annotations
are checked. Memory leaks are as difficult to create as in a
reference-counted language, and having the annotations checked gives
most of the benefit of automating them. It’s an excellent
happy medium between manual memory management and full run-time
GC, with no run-time cost over a certain discipline of C memory
management.</p>
<p>Of course, other disciplines of C memory management are
possible. And using this Rust system takes away flexibility
that might be relevant to performance. Rust, like C++, allows
you to sidestep the “compile-time GC” and use raw pointers,
and that can often be better for performance. <a href="https://matklad.github.io/2022/10/06/hard-mode-rust.html">A recent blog
post</a> I read
explores some of that in more detail; encouragingly, that blog
post also considers RAII to be in-between manual memory management
and run-time GC – serendipitously, because I had already drafted
much of this post when it came out.</p>
<p>But the standard memory management tools of Rust cover the common
cases well, and <code>unsafe</code> is available for when they’re inappropriate –
and can be wrapped in abstractions for interfacing with code that
uses the RAII-based system.</p>
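<p>One way such a wrapper might look is sketched below. <code>RawCounter</code>
and <code>count_to</code> are hypothetical names I made up for illustration;
the point is only that the raw pointer and the <code>unsafe</code> blocks stay
inside the type, while its users get ordinary RAII behavior through
<code>Drop</code>:</p>

```rust
/// A hypothetical owning wrapper around a raw heap pointer.
struct RawCounter {
    ptr: *mut u64,
}

impl RawCounter {
    fn new() -> Self {
        // `Box::into_raw` gives us a raw pointer and makes us
        // responsible for freeing the allocation ourselves.
        RawCounter { ptr: Box::into_raw(Box::new(0)) }
    }

    fn increment(&mut self) {
        // SAFETY: `ptr` came from `Box::into_raw` in `new` and is
        // only freed in `drop`, so it is valid here.
        unsafe { *self.ptr += 1 }
    }

    fn get(&self) -> u64 {
        // SAFETY: same reasoning as in `increment`.
        unsafe { *self.ptr }
    }
}

impl Drop for RawCounter {
    fn drop(&mut self) {
        // SAFETY: the pointer is turned back into a `Box` exactly
        // once, which frees the allocation.
        unsafe { drop(Box::from_raw(self.ptr)) }
    }
}

/// Safe code can use the wrapper without any `unsafe` of its own.
fn count_to(n: u64) -> u64 {
    let mut counter = RawCounter::new();
    for _ in 0..n {
        counter.increment();
    }
    counter.get()
} // `counter` goes out of scope here and its allocation is freed.

fn main() {
    println!("{}", count_to(3)); // prints 3
}
```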
<p>In C++, the distinction between “borrows” and “moves” is not checked
by the compiler, so mistakes can easily result
in undefined behavior. Leaks are prevented, but memory corruption is
not. So the C++ system is a much worse replacement for garbage collection
– RAII is only doing some of its job, as it is not paired with
a borrow checker.</p>
<h1 id="cycles">Cycles</h1>
<p>I leave the most awkward topic for the end. We’ve talked about allocation
trees and DAGs, but not general graphs. These require <code>unsafe</code> in Rust,
even something as supposedly basic as doubly linked lists. It’s against
the borrow checker’s rules, and the compiler will statically prevent
you from making them using safe, borrowing references. They simply aren’t
borrows in the Rust sense, but are rather something else, something
about which Rust doesn’t know how to guarantee safety.</p>
<p>This is not as bad as you might think, because cycles also form a hole in
reference counting, which is a popular run-time GC system. This is why
you can’t use <code>Rc</code> or <code>Arc</code> to implement a doubly-linked list correctly in
Rust either: You’ll get past the borrow checker and guarantee a memory
leak. These systems generally can’t detect cycles at all, and leak them,
which is arguably worse than forbidding them to be created.</p>
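<p>The leak can be observed with a short sketch. Everything named here
(<code>Node</code>, <code>nodes_dropped</code>, the <code>drops</code> counter)
is hypothetical scaffolding for the demonstration; it counts how many
destructors actually run when two reference-counted nodes point at each
other:</p>

```rust
use std::cell::{Cell, RefCell};
use std::rc::Rc;

/// A hypothetical node that records when its destructor runs.
struct Node {
    drops: Rc<Cell<u32>>,
    other: RefCell<Option<Rc<Node>>>,
}

impl Drop for Node {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Builds two nodes, optionally links them into a cycle, lets them
/// go out of scope, and reports how many destructors ran.
fn nodes_dropped(make_cycle: bool) -> u32 {
    let drops = Rc::new(Cell::new(0));
    {
        let a = Rc::new(Node {
            drops: Rc::clone(&drops),
            other: RefCell::new(None),
        });
        // `b` always points at `a`.
        let b = Rc::new(Node {
            drops: Rc::clone(&drops),
            other: RefCell::new(Some(Rc::clone(&a))),
        });
        if make_cycle {
            // Pointing `a` back at `b` closes the cycle.
            *a.other.borrow_mut() = Some(Rc::clone(&b));
        }
    } // Both handles go out of scope here.
    drops.get()
}

fn main() {
    println!("without cycle: {}", nodes_dropped(false)); // prints "without cycle: 2"
    println!("with cycle: {}", nodes_dropped(true)); // prints "with cycle: 0"
}
```

<p>With the cycle in place, each node keeps the other’s count above zero, so
neither destructor ever runs, and the compiler raises no objection.</p>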
<p>In any case, the <code>unsafe</code> keyword is not poison. For things that Rust
doesn’t know how to keep safe, you need to exercise extra responsibility,
but at least the programming language is making you aware of it –
unlike C++, which is unsafe all the time.</p>
Write Everything Down (Part 3): My Personal Organizational System
https://www.thecodedmessage.com/posts/my-organizational-system/
2022-10-06T00:00:00+00:00
<p>As promised in my <a href="https://www.thecodedmessage.com/posts/write-everything-down/">previous</a>
<a href="https://www.thecodedmessage.com/posts/org2-failed-system/">posts</a> about <a href="https://www.thecodedmessage.com/tags/organization/">organization</a>,
I will now go into some detail about my own organizational system.
But before I start talking about it, and how I came to develop it, I’d
like to emphasize a few points, or more specifically, three caveats,
lest Zeus strike me down with a thunderbolt for my hubris:</p>
<ul>
<li><strong>Caveat the First:</strong> My system is a work in progress. Even though
it is overall very helpful, it’s always falling apart a little bit.
Some parts of it work better than others, and it’s constantly evolving
as I try to shore up the parts that fall apart more easily. Sometimes,
it’s in a better state than others.</li>
<li><strong>Caveat the Second:</strong> What works for me might well not work for you,
dear reader. I reckon you and I have very different brains. Even if
a psychiatrist would categorize me and you with all the same formally
recognized traits, we still have literally different brains, and
literally different histories, cultural backgrounds, and personal
struggles.</li>
<li><strong>Caveat the Third:</strong> Nothing in this system is particularly novel.
It is however very tweaked to my own personality. I present this not
to claim that I’ve developed anything new, but as a worked example of applying
existing practices to my own life, in hopes that it will be useful
to you.</li>
</ul>
<p>And it is indeed a very personal system and a continuously evolving
system. I am sensitive to minor issues. If a TODO list system
is insufficiently ergonomic for me, I’ll get overwhelmed by it,
or intimidated by it, disheartened, blocked out by my personal <a href="https://www.adhdessentials.com/wp-content/uploads/5-Ways-to-Overcome-The-Wall-of-Awful.pdf">“Wall of
Awful”</a>,
and I will default to not using any organizational system at all, and
simply relying on my natural faculties – my naturally poor prospective
memory – to make sure I do the things I need to do.</p>
<p>This has predictably terrible results, so I keep trying to use the
system, but it involves a lot of tweaking, a lot of tricking myself,
occasionally changing up the system in some ways, not necessarily because
the improvements help – though sometimes they do – but also because
changing the system makes it more interesting and more engaging and
gives me another reason to look at the TODO items as I re-sort them
into new categories rather than just idly reading through them.</p>
<p>Some parts of the system also work better than others, and so sometimes
I’ll just use the parts of the system that work the best for me on their
own for a few days, and the other parts of the system, the parts that work
less well, can take a few days off before I am up to using them again.
And that’s OK – that’s part of the design. It’s sort of become part
of the organizational pattern – making a task out of re-organization.</p>
<p>So with these caveats out of the way, let’s begin the tour. Each section
will go over a feature of the organizational system, describe where I
got the idea from, and discuss how it fits into my dynamic.</p>
<h2 id="e-mail-let-me-write-that-down-real-quick">E-Mail: Let me write that down real quick</h2>
<p>TODO tasks can come at any time, in any situation. Whether you
abruptly remember something that you <em>have</em> to do, but didn’t
write down, or someone tries to create a plan with you where
you have to check the calendar, or you get an e-mail you have to
respond to or else lose an important account, TODO tasks are not
things that happen when you have your computer out and are ready
to use your Fully Realized Personal Organizational System (TM).</p>
<p>This is perhaps the most important part of any TODO system: How do
new tasks enter it? For those who use apps with phone versions, this
might be that you actually put the task where it “actually goes” right
away. But in my experience, this is too tedious. The version of the task
when fully considered and put in the appropriate spot is different from
the version you got it in, and converting it, and finding the right spot,
is itself a mini-task you might be tempted to procrastinate. And God
forbid you decide that, really, it should be in a completely new category,
and other things should be moved to that category too!</p>
<p>No, new tasks belong in their own lightweight place, and then a separate
habit should be developed of draining that place, later, when not
trying to make conversation with someone, and organizing those tasks.</p>
<p>This is a core tenet of the <a href="https://gettingthingsdone.com/what-is-gtd/">Getting Things
Done</a> productivity
methodology, and one I happen to agree with – and realized on my
own a long time ago in my battles with JIRA. Recording tasks as
they come up, and organizing tasks so that you can plan to do them,
are fundamentally different things, and require different systems. The
burdens of “organization” should not interfere with the lightweightness
of “collection” – especially when “organization” is as arcane and
heavyweight as JIRA is.</p>
<p>For my “collection” step, I use e-mail, and send myself quick subject-only
e-mails to cover tasks I unexpectedly learn about. I marked all my
e-mail as read in one heroic step in late 2021, and ever since then, I’ve
actually been an “Inbox Zero” person. Or rather, the type of person who
regularly reads all my e-mails. This allows me to use unread e-mails as
a repository for tasks, which is good because sometimes tasks from other
people or organizations come in this way. I can mark e-mails unread as
well, if they are both messages and tasks, an advantage text messages
don’t have, so e-mail is better than texting in that way.</p>
<p>And then, there is a habit that comes with this: Every once in a while,
ideally every day but at least every other day, I have to have a little
session where I sit down, go through the e-mails, and either do the things
or copy the TODO items into another place, within my real organizational
system, which is in text files on my computer.</p>
<p>Of course, it also entails aggressively checking my e-mail multiple times
a day, and in the biggest change, actually deleting or marking as read the
“real” e-mails from companies that I’m not going to read, rather than just
letting them pile up. So here I am, in my 30’s, finally a practitioner
of Inbox Zero, which is honestly a bit of a pain. But the ability to
use the e-mail Inbox as a TODO list has, so far, been worth it.</p>
<p>And I also know that if I used a dedicated TODO app as my TODO list,
instead of e-mails, I’d (a) let them live there forever and
(b) eventually stop looking at the app. The point is that these items
don’t <em>live</em> in my e-mail – they just stay there as scratch space, for
collection purposes. I get them into my real TODO system as fast as
I can and they get deleted. The real TODO system, where items <em>live</em>,
is much more sophisticated, and that’s what I’ll get into next.</p>
<h2 id="hierarchical-plain-text-files">Hierarchical Plain Text Files</h2>
<p>So, first of all, rather than use any sort of structured app, or
spreadsheets, I use plain text files. I edit them in <code>vim</code>, my
preferred text editor for programming, as I am very used to it.
The commands, such as <code>dd</code> for delete line, <code>p</code> to paste in the
line you just deleted, <code>o</code> for opening a new line below the current one,
are all in my fingers’ muscle memory. This is the easiest way I can
move content around and reorganize it, from file to file and within
a file.</p>
<p>As I am using a programmers’ text editor to do my organizational
system (and to do my writing), often when I’m working on blogging
or just figuring out what to put on my TODO list, I look like I’m
programming. People come up to me at bars and coffee shops to tell
me that they’re impressed with the programming I’m doing:</p>
<p><img src="https://www.thecodedmessage.com/org-screenshot.png" alt="Desktop Screenshot"></p>
<p>But, as you can see if you read the words on the left, this is my
blogging TODO list, not my work at all.</p>
<p>It’s very important to me that the items are hierarchical. I have
a lot of ideas that flow out of my mind, and I like feeling like I can
write them down so that I can eventually actually follow through on them.
If I wrote all the ideas in list form – ever – there simply would
be too little structure, and I would never find ideas that went together.</p>
<p>This is somewhat obvious for the planning for a blog post: Of course
elements of a blog post go together, under a heading for that blog post,
and of course pre-writing can be done in the form of a hierarchical
outline.</p>
<p>But other tasks are hierarchical as well, even those we normally write
lists for. I find that I do better when I express the hierarchy, and
re-adapt the hierarchy for various phases of the planning.</p>
<p>For example, grocery lists. As a grocery list is being generated, it is
hierarchical by planned meal (or non-meal category like “snacks”),
and I type it out accordingly:</p>
<pre tabindex="0"><code>* Grocery shop
  * Snacks
    * Chips and salsa
      * Hot salsa
      * Mild salsa for guests
    * Bread and hummus
      * Challah bread
      * Potato rolls
  * Planned meals
    * Chili
      * Beans (I may have this already)
      * Tomatoes
      * Onions
      * Spices
        * Check online recipe
        * Confirm that I have them
    * Veggie Carbonara
      * Shiitake mushrooms
      ...
    ...
</code></pre><p>Notice that many of the TODO items are actually little research items
to improve the hierarchy. I have to go check whether I have beans,
and once I do, I can edit it. This same “Planned Meals” section can also
be duplicated from the grocery shopping section to a separate section
that tells me to actually make the meal.</p>
<p>If I do my shopping through a <a href="https://www.freshdirect.com/">delivery system</a>,
the shopping list can stay in this format as I enter it into their app.
But if I need to go shopping in person, I can restructure this shopping
list to be by section of the grocery store, rather than by meal. The
act of restructuring the list also helps me solidify the list in my
memory, and gives me opportunities to realize I’ve missed things:</p>
<pre tabindex="0"><code>* Grocery shopping, buy:
  * Produce
    * Green onions
    * Peppers
      * Traffic light (red, yellow, green)
    * Mushrooms
      * Shiitake
      * Cremini
  * Canned food
    * Beans
    * Diced tomatoes
    * Hot salsa
    * Mild salsa
</code></pre><p>My TODO lists can be very long, as I can see by running the <code>wc -l</code>
command to show how many lines they have:</p>
<pre tabindex="0"><code>[jim@palatinate:~]$ wc -l Log/TASKS Log/CALENDAR Log/WRITING.md Log/TECH_WRITING
353 Log/TASKS
95 Log/CALENDAR
1022 Log/WRITING.md
719 Log/TECH_WRITING
2189 total
</code></pre><p>If I didn’t use some level of hierarchicalization, they would be completely
impossible to read.</p>
<p>I am, of course, not the first person to use hierarchical plain text
files as an organizational system. I use <code>vim</code> as my text editor,
but in the <a href="https://www.emacswiki.org/emacs/ChurchOfEmacs">Church of <code>emacs</code></a>,
where <code>emacs</code> is considered the one true text editor, there
is a long tradition of using <a href="https://orgmode.org/">Org Mode</a>,
a special file format that <code>emacs</code> specifically supports for
such hierarchical organizational text. I have had colleagues use
Org Mode in the past, and it was definitely an inspiration for my
current system.</p>
<p>My files are not valid org mode files – they use the markdown style
of hierarchical bullet points – but I sometimes consider making them
org mode files, as that format is supported by
<a href="https://plainorg.com/">multiple</a>
<a href="https://apps.apple.com/us/app/mobileorg/id634225528">iPhone</a>
<a href="https://beorgapp.com/">apps</a> and then I could use my phone to access
and update my organization files.</p>
<p>I also use <a href="https://www.dropbox.com/">Dropbox</a> to keep these files
synced between computers, because I am unfortunately no stranger to
losing my laptop – because I forget to take my bag back with me when
I’ve left it in a place.</p>
<h1 id="task-specific-file-formats">Task-Specific File Formats</h1>
<p>But the fact that I use hierarchical plain text files is not enough to
constitute a system. It’s more information than just “write it down”;
it’s “write it down in plain text in a specific style of bullet points
in a known location on my laptop and in Dropbox.” We’re closer to a
system, but we’re not the whole way there yet.</p>
<p>This reminds me of the <a href="https://en.wikipedia.org/wiki/OSI_model">OSI model of networking
layers</a>. Indulge my nerdiness
for a second: computer networking is systematized in multiple layers. The
difference between Ethernet, Wi-Fi, and cable is one layer. At another
more abstract layer, it’s all “the Internet,” using a family of protocols
called TCP/IP. At a still higher layer, there are different “protocols”
for browsing the web, your WhatsApp messages, and your Zoom call;
each of them is layered on top of “the Internet” or TCP/IP, which in turn
is layered on top of either Wi-Fi, Ethernet, cable, 5G, or <a href="https://en.wikipedia.org/wiki/IP_over_Avian_Carriers">carrier
pigeon</a>.</p>
<p>A simpler analogy: On your phone, you have different apps. It’s the
same phone with the same hardware, but the apps are different.</p>
<p>Similarly, we have defined a few layers of my system:</p>
<ul>
<li>English</li>
<li>Writing</li>
<li>On my laptop and in Dropbox</li>
<li>Plain text files
<ul>
<li>With hierarchies of bullet points
<ul>
<li>To organize information
<ul>
<li>Inspired by “<code>org</code>” mode</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
<p>But in order to actually use the system to organize my life, I am now
at the point where I need different details for different parts of my
life – where I have to do different things in my different apps.
What works for one part of my life might not work in others.</p>
<h2 id="calendar-what-to-do-each-day">Calendar: What to do each day</h2>
<p>I’ll talk about my calendar first. I use it absolutely every day, and
it is the part of my organizational system that I absolutely couldn’t
function without. My use of this organizational system waxes and wanes,
as do my general organizational skills, but the presence of this system
definitely elevates the waxing and waning so that my lowest lows are still
more functional than my normal days before I had it. The calendar provides
this baseline level, and is one of the parts that still functions when I’m
too busy or too exhausted or simply too frazzled to use the other parts.</p>
<p>I won’t post screenshots of my calendar because it’s, ahem, obviously
<em>very private</em>, but my main calendar is not Google Calendar on my phone
(which I do not use), nor a calendar on the wall (which my parents always
used to great effect), but it is actually one of these hierarchical
text files.</p>
<p>If you send me a Google calendar invite, it won’t actually happen
unless I integrate it into my personal calendar.</p>
<p>Fine, here’s a screenshot, but it’s scrolled down because I plan <em>very</em>
far in advance (wink):</p>
<p><img src="https://www.thecodedmessage.com/calendar-screenshot.png" alt="Calendar Screenshot"></p>
<p>Each day has its own little entry. They can be as short or as long as
they need, though if they get too long, I take this as a sign that
I’m expecting too much of myself for that day. The actual items are
a bit of a hodgepodge: It includes absolutely mandatory appointments
(with times), things I must get done that day or else suffer greatly, or
just tasks I know I need to do at some point and figure this is a good
day for them. Some of it’s super time sensitive, but a lot of it is a
near-term TODO list straddled along a few days of calendar, or moved from
day to day as I literally procrastinate (<em>pro cras</em> being Latin for “for
tomorrow”). For some items, the day it’s listed under doesn’t really mean
much of anything at all, besides a promise to myself to think about that
task on that day, and also an anxiety-relieving reassurance that I don’t
<em>have</em> to think about it before then, because it’s written on a later day,
which is the designated time for that obligation to resurface in my life,
and no sooner.</p>
<p>(As a result, if an event needs preparation, I often need to write the
preparation separately from the event, as when I’m in my less-functional
modes I might not look <em>ahead</em> on the calendar.)</p>
<p>I don’t really distinguish these things in the calendar file; I keep
track of what’s urgent and what’s not in my head. In spite of my sometimes
over-enthusiastic over-wrought hierarchical notes, I actually only need
a little reminder to not forget the thing at all. Once I’m reminded,
my more normal-functioning retrospective memory kicks in, and I know
all the details of why I have to do the thing and how
<a href="https://www.youtube.com/watch?v=jxiCV0vwVT8">urgent and/or important</a>
it actually is.</p>
<p>Like all parts of this organizational system, the calendar comes with
some habits, which form the core of any organizational system. For the
calendar, the habit is that every day, before I actually go about the
tasks of my day, I look at the calendar to see what all I have to do
that day. Often, I have leftover material from the previous day or days,
from what I did (or at least, what I was supposed to do) yesterday.</p>
<p>I take those bullet points from previous days and respond appropriately.
If it’s something I did, I’ll remove it. If it’s something I didn’t do,
and should do, I move it to another day. And if it’s something I didn’t
do, and it’s too late to do now, I can schedule apologies and other damage
control for today or another day.</p>
<p>Perhaps just as important but less likely to actually happen is looking,
the previous night, at what I have to do the next day. This ensures
I actually get up on time to do things that are in the morning. Fortunately,
such things are normally social in nature and my excitement for a social
opportunity helps glue it in my memory – sometimes.</p>
<p>The other habit that comes with the calendar file is for when things get
added to it. I don’t commit to anything until I’ve written it into
the calendar. If I know I’m free, I might commit
after having written myself an e-mail to add something to my calendar,
but even then, I write the e-mail while I’m still having the conversation,
and don’t actually agree to do the thing until I’ve sent it. This goes
for work events and doctor’s appointments as well as personal events.</p>
<h2 id="the-work-file">The Work File</h2>
<p>Though the calendar will contain some amount of day-to-day life TODOs,
it generally doesn’t include work. If it’s a weekday, I start work,
but I don’t write “get a day’s worth of job work done” on every entry
(which is probably how I’d phrase it if I were to). I’ll put specific work
meetings on it, because otherwise I might miss them, but my work is not
heavy on concrete deadlines, and to the extent that it is, that goes in
a separate work organizational system, in its own hierarchical text file.</p>
<p>I keep my organizational files open in GVim windows – panes, really,
since I use XMonad – on virtual desktop #5 on my laptop. I generally
have at least one organizational file open when I’m using the computer,
depending on what I’m doing, and sometimes more than one. If I’m doing
something else on another screen, writing code or reading Wikipedia,
I always have my organizational files available at a moment’s tap
of the buttons.</p>
<p>My work organizational file is open if and only if I’m working. Its
state of being open defines the concept of “being at work,” and
reifies it in my mind. One hierarchical file contains all my
personal work organization.</p>
<p>“All my <em>personal</em> work organization” doesn’t necessarily mean all my work
organization overall: Ticketing systems and documentation and project
plans that might have to be shared with others live in separate places,
in formats preferred by the teams and companies I work for. But I always
keep track of where they are (so long as they are relevant and so long as
I have to make sure not to forget about them entirely) in the work file
itself. If it doesn’t exist in the work file, I might forget about it,
and likely will. This implies that if I shouldn’t forget about something,
it should be referenced in the work file.</p>
<p>Therefore, one thing that is in my work file, and goes near the top,
is a series of links to shared organizational pages. What tickets am I
currently working on and responsible for updating, even if the actual work
is done and I just have to answer questions or follow up on QA (quality
assurance) at this point? What code have I written in the form of a <a href="https://en.wikipedia.org/wiki/Distributed_version_control#Pull_requests">merge
request</a>
that I need to make sure my colleagues review and actually
integrate with the code, even if I’ve written all the code I have
to and I’ve theoretically handed it off? What issues have I filed on
<a href="https://www.github.com/">GitHub</a> with open source projects that I need
to follow up on to see if their maintainers have gotten back to me?
Programming requires a surprising amount of following up with other
people and reminding them of things, and that requires a list of things
that are pending so I don’t forget a whole thing to follow up on.</p>
<p>Also at the top of the file I include my current TODO list. Oftentimes,
I get a torrent of tasks at once, like when I think I have something
simple to do (like fill out a JIRA ticket) but it turns out to have
several side-quests (like researching which versions are supposed to
support which features and who maintains them, and creating a secondary
JIRA ticket to make sure the code eventually gets QA’d). Instead of trying
to rely on my prospective memory, I spill them into the “immediate TODO”
location. If they are not in fact the things I should do <em>next</em>, or
if I find them overwhelming, I then sort them properly into projects.</p>
<p>Projects are the majority of the file, each in its own little cluster,
analogous to dates in the calendar file. Usually, at any given point
for my job, I have 1-3 projects that I’m actively working on. I prefer
having at least one major project and one minor project: I work well when
<a href="https://www.thecodedmessage.com/posts/crank-em-out/">switching between multiple projects</a> and I can
use the minor project to take a break from the major project.</p>
<p>But my work file always has more than 1-3 projects. Often, I know what
projects I will have to do after I’m done with the current project.
Sometimes, I have started a project, and it might or might not come
up again – and I have my notes for it in case it does. Sometimes,
I have an idea for a future project, but it’s not really relevant now,
and I write up how to do it. In any case, they’re all ready to consider
when I’ve finished my current project, or to thaw out if someone tells
me they’ve increased in priority or urgency.</p>
<p>The TODO/project dichotomy might seem a little complicated, but it’s
necessary. Sometimes, the TODO items involve writing project summaries
for a project. Sometimes, they include communicating with other people
about the project. Usually, the project summary itself contains the
actual steps needed to get the project done as a programming project,
whereas the TODO list includes things like responding to other people’s
day-to-day messages about it.</p>
<p>And a programming job can easily gear up to that level of complexity.
I get tasks on a 1-2 week time scale that I then have to break down into
subtasks myself, sometimes even on a timescale of months. Only a small
part of the job is actually changing the code. Most of it is figuring
out how the existing code worked, figuring out why that doesn’t meet
the needs, and then making sure the modified code actually gets into
the finished product. People who ask for help often are stuck because
they’ve made a false assumption, or misunderstood what the problem even
is. And often problems straddle many unrelated systems – code testing
and deployment almost always does.</p>
<p>So here are the habits that come with the work file: Whenever
I find myself at work and not sure what I’m doing, I go read the
top TODO item. If there is none, I go grab some items from the current
project. If I do know what to do, but it isn’t written down in either of
those two places, well, I write it down before I do it, in case I get
distracted and forget. Meanwhile, meetings and time-based TODO items,
as I said before, go in the same calendar along with everything else.</p>
<h2 id="other-files">Other Files</h2>
<p>I have other hierarchical files in my system. Besides my work file
for my job, I also have files in a similar structure for programming
as a hobby (including technical blog posts), writing as a hobby
(including non-technical blog posts), finances, and my social life.
But for the most part, they just follow the same structure:
a section for each project, in hierarchical format, with an immediate
TODO list at the top.</p>
<h2 id="conclusion">Conclusion</h2>
<p>This system overall works well for me, but I’m always ready
for improvements. Please share in the comments things that
work well for you, and maybe I can learn something!</p>
Write Everything Down (Part 2): Failed Organizational Systems
https://www.thecodedmessage.com/posts/org2-failed-system/
2022-10-05T00:00:00+00:00
<p>In my <a href="https://www.thecodedmessage.com/posts/write-everything-down/">previous post</a> on
<a href="https://www.thecodedmessage.com/tags/organization/">organization</a>, I concluded with this statement:</p>
<blockquote>
<p>As everyone’s brain works differently (whether ADHD or not), people differ
tremendously in what their ideal organizational systems are. For me,
I am much less productive if I have a less than ideal system – the
stakes are very high. But even for people who can be productive on any
system, I think that tailoring their system to their brain, their lifestyle,
their job and schedule and hobbies, can have amazing results.</p>
</blockquote>
<p>In this post, I want to go more into detail about that. Specifically,
I’d like to demonstrate the point by looking at organizational systems
and techniques that have <em>not</em> worked for me personally, in approximate
chronological order of my life.</p>
<h2 id="handwriting">Handwriting</h2>
<p>The first one is the one I noticed the earliest, before anyone expected
any organization out of me: I am not very good at handwriting. I don’t
have the best coordination, it makes my hand hurt, and I can’t really
get myself to do it in a sustained way.</p>
<p>I have a number of reasons or excuses for this, but the biggest one
is probably how slow it is. I get impatient. I get distracted by the
effort, and I forget what I was going to write next. Even if I don’t,
I get frustrated with the lack of speed: I can type at 110 WPM or so,
whereas my handwriting is probably more like 15.</p>
<p>I basically don’t use handwriting at all in my present life – which
means I’m unpracticed and makes it even less of an appealing option.
Many people recommend handwritten TODO lists and reminders as a way
of organizing, or handwritten journals as a way of meditating and
logging life, and I’ve come to realize that as appealing as it may
sound, and as relaxing as it can be to not be at a screen, these
techniques are not for me – but can be for me if modified to
be on a computer (or phone).</p>
<h2 id="homework-and-note-taking">Homework and Note-Taking</h2>
<p>My handwriting problems led soon to note-taking problems. Note-taking was
a bust for me in school, and even in university.</p>
<p>I have heard that writing notes is supposed to help you pay
attention. Maybe this works for people who don’t struggle as much with
handwriting as I do – I have had a smattering of better note-taking
experiences with a computer. However, for whatever reason, in
school I simply could not pay attention to the teacher and write notes
at the same time, or honestly get myself to write any notes even if
I was not simultaneously trying to pay attention to the teacher. When
the teacher was finished teaching, or took a break, I never took that
opportunity to write anything down either, because that would get me
maybe 10% of the actual notes that I could theoretically get out of the
class, and that wasn’t enough to actually be useful.</p>
<p>This, perhaps surprisingly, didn’t really cause me problems for tests. I
was usually actually paying really good attention to the teacher (whether
the teacher could tell from my body language or not), and as I said
before, I had a really good retrospective memory. Generally, as long
as I remembered and understood everything the teacher had explained,
I would do well on the tests, which almost always, in practice, tested
retrospective memory.</p>
<p>See, tests always remind you what the question is about, and very
rarely contain a gotcha where you have to remember <em>to</em> do something.
Generally, even in math tests, you can’t leave out a step, because
otherwise you simply don’t arrive at the answer. When leaving out a
step was possible (e.g. doing a separate “units check” after answering
the question), I often would forget to do it. But in such situations,
there was at least usually a reminder in the text of the exam.
All in all, school tests privilege retrospective memory over
prospective memory, biasing them towards people who think like me,
and (especially due to my deficits in prospective memory) giving
an inflated view of my skill-sets.</p>
<p>But even though this didn’t affect my test-taking, this complete lack of
note-taking had some other negative effects. I remember in middle school
you used to have to write your homework down in a little planner they
gave you called an “agenda book,” which also contained hall passes in
the back for teachers to initial. I never remembered to write down the
homework. Basically every day I had to call a very tedious hotline (from
the landline phone in my parents’ bedroom), a hotline on which teachers
usually (but not completely reliably) used to make voice recordings of
their daily homework. It was called “info connect,” and it always played
ads for the local bank. The service was offered to all students and was very helpful
for my ADHD – a good example of a universal accommodation, and one that
I desperately needed, because that homework was not getting written down,
however tedious the hotline was.</p>
<p>By the time I got to university, homework was usually listed as part
of a syllabus you got at the beginning of class, or else posted on the
course website, perhaps both. It’s sort of implied university will be
harder than middle school, but in this particular way, it wasn’t. The
syllabus is great organizational technology – it was way easier than
calling a hotline and listening for your homework. All you had to do
was look at the syllabus, which you would have saved from your first day
of class in a known location in your room… or who am I
kidding? It was available for regular consultation on the course website.</p>
<p>There have been a few other occasions over the years where note-taking
was essential, and they made me nervous. In our high school debate team,
note-taking was necessary because you were judged on whether you’d replied
to <em>everything</em> your opponent said. Leaving something out rendered you
vulnerable to your opponent reiterating it and claiming that you dropped
it because you had no counterargument. Therefore, it actually called more on
prospective than on retrospective memory, and my natural memory,
which had covered for my lack of note-taking ability in so many classes,
was not able to help me as much.</p>
<p>However, though I was nervous about it, I was ultimately able to do fine
in Debate, because note-taking actually was the primary activity. I did
not have to pay full attention to the speaker, as I often automatically
did in class, because I already understood most of the relevant concepts
to the topic and didn’t need to pay full attention to the concepts, just
which ones they invoked in what structure. Furthermore, the requirements
of the note-taking were minimal: It was more an outline of my rebuttal
than an actual record of what was said. All in all, debating was like
being a stereotypical bad listener: You’re barely paying attention,
focusing the entire time on what you’re going to say next.</p>
<p>But mostly, note-taking was problematic for me when prospective memory
was called for, and I was expected to amplify it with writing. I
would more often fail than succeed in that situation. When retrospective
memory was called for and other students amplified it with writing, I
simply defaulted to my non-amplified skills, and that was enough.</p>
<p>Interestingly enough, the same problem doesn’t really apply to me often
today. Nowadays, it’s actually easier to extract TODO items out of a
meeting than it was when I was in high school or in college. For
one thing, I basically always get to have my computer with me now
to type up notes – a non-starter in high school, and commonly forbidden
in college classes for disrupting the class. For another thing, though,
if a meeting creates TODO items for me, I can write them down towards
the end, while simultaneously clarifying – out loud! – what exactly the
tasks are.</p>
<p>Often, these tasks are the result of a convoluted discussion, and so
people appreciate taking the time to hear me summarize my take-aways
out loud, and it gives everybody an opportunity to sanity-check whatever
plan we’ve come up with. Meanwhile, I can talk while I write, and write
it directly into my work TODO list (to be re-triaged afterwards into
more finely-grained tasks, as I’ll discuss later).</p>
<h2 id="work-ticketing-systems">Work Ticketing Systems</h2>
<p>Sometimes, the programming jobs I’ve had have required me to use
ticketing systems as a job requirement. These ticketing systems
are often both organizational systems and collaboration systems,
and while I have to use them as collaboration systems, they don’t
do much for me as organizational systems – in fact, often, they
increase my organizational burden rather than decreasing it.</p>
<p>Take JIRA, for example. JIRA is a system for tracking work tasks.
I’ve used it at several different jobs, and when I worked at a
software consultancy, I used it while interacting with several
different clients.</p>
<p>With JIRA, your work is structured into tickets, which move across
a board based on their level of completion. And when those tickets
are well-specified and manageable in size, if you’re consuming
these tickets, it can actually be quite nice. And because you often need to
ask other people to do things, or because there’s often a lot of work
to do on a team, but where anyone on the team can do it, some sort
of collaborative system is absolutely necessary.</p>
<p>But creating a JIRA ticket takes a lot of work. There’s no way to
jot down a quick note and postpone, to a later meeting or a later time,
the work of turning it into a fully-fledged ticket. Often, creating a ticket
requires answering a lot of questions, mandatory questions, like
which version it applies to, or similar things – which, if
you’re making a ticket for another team to work on, or for
a new project, or just for a new kind of problem, you might
not even know the answers to without asking your colleagues.
If you have to create multiple tickets, or side-track yourself
from your work to create a ticket, it’s basically impossible to
keep track of it all without writing a list of tickets to create,
as creating a ticket can take a very long time.</p>
<p>Not to mention the fact that to write a JIRA ticket, or look
at JIRA tickets, I need to de-immerse myself from the land of command
lines and text editors, and return to my web browser screen – which is
intrinsically more distracting, even if I’m only doing work things with
my web browser like following up on Slack messages or work e-mails.</p>
<p>So even though it’s tempting to use JIRA directly as an organizational
system, especially as that’s how it’s seemingly designed to be used,
I can’t. I have to keep my own TODO lists, and when collaboration
renders it necessary to make a ticket so someone else on the team
can work on it, or so managers have insight into my work, my own
TODO list has to contain the tickets.</p>
<p>Furthermore, as I’ve learned, and will go into detail about later,
doing my job well requires breaking down tickets far more than JIRA
will normally encourage you to do.</p>
<p>And so, in both directions, I look at JIRA more as a communication tool
than as a tool for organizing my own work, and from my perspective,
it’s actually one more burden, one more thing to organize.</p>
<p>Now, I think there are other ticketing systems that work better,
whether in being able to be used directly (in some ways) as an organizational
system, or at least in being a more efficient and effective communication
system that’s not as much of a burden. But that, I think, is a topic
for a potential future post specifically on programming ticketing
systems.</p>
<p>In any case, my personal organizational needs are unique enough that
I would always have my own system running parallel to it, even if a
ticketing system were better than JIRA. I’m sure, however, there’s
someone out there using JIRA for their personal life, and I wish them
all the best.</p>
<h2 id="todo-apps">TODO Apps</h2>
<p>For a while, I used <a href="https://www.rememberthemilk.com/app/#list/46966204">Remember The
Milk</a> as an app. I
ultimately ended up not continuing because it felt too inflexible to
reorganize. My lists simply got too long, and ended up being intimidating,
and I ended up not looking at them again.</p>
<p>To be clear, this happens to <em>all</em> my TODO lists: They get longer, I spend
some time not removing things from them, I have bursts of many ideas for
what to put on them, and eventually they become too long to even dare to
look at, as anxiety expands and explodes in my brain. The difference is,
if I can take a few items off of the TODO list sometimes, and put them on
another list, from which I can’t see the larger list, of things to do on
a per-day basis, then I’m much calmer and happier. This is a very specific
set of requirements, and most TODO apps don’t work exactly that way.</p>
<p>Even if they did, I am always changing up how my TODO lists are
structured, and how they relate to each other. Apps are by
nature opinionated about such things. Most of them don’t have
support for hierarchies – whereas my TODO lists often
are bullet points within bullet points within bullet points,
a tree-shaped outline of the task rather than a literal flat list.</p>
<p>And even if it’s possible to move things around between lists freely
enough to impose new structures, and to protect myself from the long lists
I don’t always want to see, TODO apps are still not the most natural
interface for me. Moving things around has to be easy for me, and in an app
normally there are simply too many steps to it, especially too many clicks
of the mouse. I’m used to doing things in a more keyboard-driven fashion
rather than a mouse-driven fashion, and I’m used to using a traditional
computer interface over a phone or web interface. This makes me a weirdo,
but it also makes most TODO apps a poor match for me.</p>
<h2 id="my-answer-developing-my-own-system">My Answer: Developing my own system</h2>
<p>So I’ve developed my very own, very bespoke, very complicated system.
I’m extremely happy with it, but it’s for me, not for you, so I’m not
going to share it.</p>
<p>Just kidding, I’m going to explain it in the next post! But I’ll
warn you ahead of time, it might not work for you. It might work
just as poorly for you as keeping a hand-written planner does for me,
and a hand-written planner might work perfectly for you. But hopefully
my experience will give you insight into how brains work and how
they differ, and help you understand the diversity of what makes
different people tick.</p>
A Strong Typing Example
https://www.thecodedmessage.com/posts/strong-typing/
2022-09-15T00:00:00+00:00
<p>I’m a Rust programmer and in general a fan of strong typing over dynamic
or duck typing. But a lot of advocacy for strong typing doesn’t actually
give examples of the bugs it can prevent, or it gives overly simplistic
examples that don’t really ring true to actual experience.</p>
<p>Today, I have a longer-form example of where static typing can help
prevent bugs before they happen.</p>
<h1 id="the-problem">The Problem</h1>
<p>Imagine you have a process that receives messages and must respond
to them. In fact, imagine you have potentially many such processes,
and want to write a framework to handle it.</p>
<p>The incoming messages are expected to be in JSON, and the responses
are also supposed to be in JSON. So your framework parses the incoming
messages from JSON before passing them to the application’s callback
function, and then serializes the results.</p>
<p>In Rust, the interface for the callback would look something like this
(<a href="https://docs.rs/serde_json/latest/serde_json/enum.Value.html"><code>Value</code></a>
is a parsed JSON type from
<a href="https://docs.rs/serde_json/latest/serde_json/"><code>serde_json</code></a>):</p>
<pre tabindex="0"><code>trait MessageHandler {
    fn handle_message(&self, input: Value) -> Value;
}
</code></pre><p>In a dynamically-typed language like Python, the callback function would
look more like this:</p>
<pre tabindex="0"><code>def handle_message(self, input):
    ...  # application-specific handling goes here
</code></pre><p>The code in the callback would then (hopefully) validate the JSON to
make sure it meets the expected schema, and if it doesn’t, return some error
in the reply message. In a programming language like Python (I make
no promises that my Python is idiomatic or accurate; it’s meant as an
example of a duck-typing language), it perhaps could be written like this:</p>
<pre tabindex="0"><code>if not self.is_valid_input(input):
    return {"error": "Invalid input", "input": input}
</code></pre><p>If the JSON is in a valid format, it would do some processing and
return a non-error result.</p>
<p>The framework code, in order to do this, runs code that looks
something like this (in pseudo-Python):</p>
<pre tabindex="0"><code>input = conn.recv_message()
input = json_parse(input)
output = handler.handle_message(input)
output = json_serialize(output)
conn.send_response(output)
</code></pre><p>And all of this will work just fine.</p>
<p>Except… what if the input isn’t valid JSON? And what if none
of our test cases considered this possibility, but it nevertheless
arises in production? What if we didn’t even write test cases?</p>
<h1 id="some-attempts-to-solve">Some Attempts to Solve</h1>
<h2 id="making-sure-we-catch-the-error-at-all">Making sure we catch the error at all</h2>
<p>In Rust, we would already have a hint that there’s something wrong. JSON
parsing in Rust is a function that can fail, and that is reflected
in the type of the function to parse JSON, which looks something like
this:</p>
<pre tabindex="0"><code>pub fn from_slice(v: &[u8]) -> Result<Value>
</code></pre><p>The <code>Result</code> means that this function can fail. We have to handle that
failure in some way before we can get the resultant type. We can
crash the whole program:</p>
<pre tabindex="0"><code>let input = from_slice(&input).expect("Invalid JSON");
</code></pre><p>NB: Reusing the name <code>input</code> like this with a different type is allowed
in Rust; this declares a new variable that shadows the old one. This is
idiomatic when the value is being transformed and we don’t need the old
form anymore.</p>
<p>Or we can do what Python will likely do by default, and bubble the
error up to the caller of the current function:</p>
<pre tabindex="0"><code>let input = from_slice(&input)?;
</code></pre><p>Or we can handle the error. And in this case, we should handle the error
in some way, as we need to reply to the message whether it’s in JSON
or not, and so we don’t want to skip over the code that does the reply.</p>
<p>Already, Rust’s typing discipline is helping us. In order to do what
Python does by default, we need to at least opt in with a <code>?</code>. Admittedly,
the programmer may do that on autopilot, but it at least gives the
programmer a hint that there might be an issue worth spending a second
or two considering before moving onwards.</p>
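<p>As a rough sketch of the third option, handling the error so that a reply
is always produced, here is a std-only version. The <code>parse</code> and
<code>handle_message</code> functions are hypothetical stand-ins (for a fallible
parser like <code>from_slice</code> and for the application callback); they are
assumptions for illustration, not part of <code>serde_json</code>:</p>

```rust
// Hypothetical stand-in for a fallible parser such as `serde_json::from_slice`.
// To stay std-only, it accepts nothing but a decimal integer.
fn parse(raw: &[u8]) -> Result<i64, String> {
    let text = std::str::from_utf8(raw).map_err(|err| err.to_string())?;
    text.trim().parse::<i64>().map_err(|err| err.to_string())
}

// Hypothetical application callback: it only ever sees parsed input.
fn handle_message(input: i64) -> String {
    format!("{{\"doubled\": {}}}", input * 2)
}

// The framework step. A parse failure becomes a reply rather than a crash
// (`expect`) or an early return (`?`), so the response is never skipped.
fn respond(raw: &[u8]) -> String {
    match parse(raw) {
        Ok(input) => handle_message(input),
        Err(_) => "{\"error\": \"Invalid JSON\"}".to_string(),
    }
}
```

<p>Because <code>parse</code> returns a <code>Result</code>, the compiler will not let
<code>respond</code> reach <code>handle_message</code> without saying what happens in the
<code>Err</code> case.</p>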
<h2 id="what-to-do-with-the-error">What to do with the error?</h2>
<p>But let’s assume that the programmer did, in fact, realize that
these errors need to be handled. What should we do in case of
an error?</p>
<p>One possibility is to handle it completely in the framework. If we
know all inputs must be valid JSON, we can take this burden off of
the application code:</p>
<pre tabindex="0"><code>try:
    input = json_parse(input)
    output = handler.handle_message(input)
except JsonError:
    output = {"error": "Invalid JSON"}
</code></pre><p>But what if we want to give the application-writer more flexibility?
What if we envision a situation where the application-writer wants to
accept either JSON or non-JSON data?</p>
<p>In a duck-typed programming language like Python, if the parsing fails,
we can simply pass the original input to the handler. This is really easy to do.</p>
<pre tabindex="0"><code>try:
    input = json_parse(input)
except JsonError:
    pass
</code></pre><p>Now, the handler function just needs to ensure that the passed-in
value is a dictionary in our validation:</p>
<pre tabindex="0"><code>def is_valid_input(self, input):
    if type(input) is not dict:
        return False
    if 'requiredField' not in input:
        return False
    return True
</code></pre><p>Of course, we might forget to do that, and if we do, the <code>not in</code>
test can now misbehave when <code>input</code> is not in fact a dictionary: it
raises a <code>TypeError</code> for raw bytes or a number, and silently becomes a
substring or membership test for a string or a list. This would be bad, as not even
all JSON parses to dictionaries, but it’s a mistake someone could make
if they’re not thinking about error handling.</p>
<p>In Rust, we can’t pass the initial input directly to the handler, as
it would be a different type. So if we try to do the direct equivalent to the
Python, it gives us an error:</p>
<pre tabindex="0"><code>let input = match from_slice(&input) {
    Ok(parsed_value) => parsed_value, // This is the parsed value, type `Value`
    Err(_) => input, // This is the raw `Vec<u8>` data... TYPE MISMATCH!
};
</code></pre><p>We are then forced to brainstorm another solution, which might
raise ideas we didn’t otherwise consider, and force us to backtrack
in our design a little, which is actually a good thing because
this solution, while simple in Python, has some flaws.</p>
<p>Here are some solutions we might brainstorm:</p>
<ul>
<li>Call a different callback in the handler for unparsed data
<ul>
<li>Application specifies whether data should be parsed</li>
<li>Framework chooses which callback to call dynamically</li>
</ul>
</li>
<li>Use an <code>enum</code></li>
</ul>
<p>That last one is interesting. If we do want to create a value that
can contain either <code>Value</code> or <code>Vec<u8></code>, we still can in Rust. We just
have to create a new type that tells the compiler we want that:</p>
<pre tabindex="0"><code>enum IncomingMessage {
    Parsed(Value),
    Unparsed(Vec<u8>),
}
</code></pre><p>Then, before we can do any work on the wrapped <code>Value</code>, we have to say
what happens if it’s actually a <code>Vec<u8></code>:</p>
<pre tabindex="0"><code>let input = match input {
    IncomingMessage::Parsed(value) => value,
    IncomingMessage::Unparsed(_) => {
        // return an error JSON blob
    }
};
</code></pre><p>In fact, this even helps with the fact that not all parsed JSON is a
dictionary, as <code>serde_json::Value</code> is itself an <code>enum</code>!</p>
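<p>To make that last point concrete, here is a std-only sketch in which one
<code>match</code> covers all three cases: a parsed object, parsed JSON that is
not an object, and unparsed bytes. The <code>Value</code> type below is a
much-reduced, hypothetical stand-in for <code>serde_json::Value</code>, and the
error strings are made up for illustration:</p>

```rust
use std::collections::BTreeMap;

// Much-reduced, hypothetical stand-in for `serde_json::Value`.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    String(String),
    Object(BTreeMap<String, Value>),
}

enum IncomingMessage {
    Parsed(Value),
    Unparsed(Vec<u8>),
}

// Returns the required field, or a description of why the input was rejected.
fn handle_message(input: IncomingMessage) -> Result<Value, String> {
    match input {
        // Parsed, and actually an object (a "dictionary").
        IncomingMessage::Parsed(Value::Object(fields)) => fields
            .get("requiredField")
            .cloned()
            .ok_or_else(|| "missing requiredField".to_string()),
        // Parsed, but some other kind of JSON value.
        IncomingMessage::Parsed(_) => Err("expected a JSON object".to_string()),
        // Never parsed at all.
        IncomingMessage::Unparsed(raw) => {
            Err(format!("{} bytes of non-JSON input", raw.len()))
        }
    }
}
```

<p>Removing either of the last two arms makes the <code>match</code>
non-exhaustive, which the compiler rejects, so neither case can be forgotten
the way the <code>dict</code> check can be in the Python version.</p>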
<h1 id="further-problem">Further Problem</h1>
<p>But even if we do correctly validate that we have a dictionary, and we
output an error in our message response if we don’t, I want to point
back to our original pseudo-Python for what error to output:</p>
<pre tabindex="0"><code>if not self.is_valid_input(input):
return {"error": "Invalid input", "input": input}
</code></pre><p>If <code>input</code> is JSON parsed into a dictionary, it will definitely serialize
back into JSON, and this line makes sense. But now that <code>input</code> might
not be parsed JSON, but instead might be in some sort of raw format,
this dictionary might fail to serialize back into JSON.</p>
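<p>In a statically typed version, this problem surfaces as a question the
code has to answer: what should the error reply echo when the input was never
parsed? One possible answer, sketched below with std only, is a lossy text
rendering of the raw bytes. The <code>Parsed(String)</code> variant is a
simplifying assumption standing in for a parsed <code>Value</code>, and
<code>echo_for_error_reply</code> is a hypothetical helper:</p>

```rust
enum IncomingMessage {
    // Stand-in for a parsed `Value`; assumed to be already-valid text.
    Parsed(String),
    Unparsed(Vec<u8>),
}

// Produces something that is always safe to embed in a reply as a string.
fn echo_for_error_reply(input: &IncomingMessage) -> String {
    match input {
        IncomingMessage::Parsed(text) => text.clone(),
        // Raw bytes may not be valid UTF-8; invalid sequences are replaced
        // with U+FFFD rather than failing later during serialization.
        IncomingMessage::Unparsed(raw) => String::from_utf8_lossy(raw).into_owned(),
    }
}
```

<p>Whether lossy conversion is the right policy is a design decision; the
point is only that the types force the decision to be made somewhere.</p>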
<h1 id="conclusion">Conclusion</h1>
<p>A lot of programming is converting data from one format to another
and validating it. Strong static typing systems like Rust’s can help
prevent mistakes before they happen, and force people to come up
with more rigorous designs rather than shoe-horning different
values into the same variable, which dynamic typing makes easy – too
easy. I hope this example was relatable!</p>
Exploring Traits with Erased 'serde'
https://www.thecodedmessage.com/posts/erased-serde/
2022-08-13T00:00:00+00:00
<p>I came across a programming problem recently where I wanted to use
dynamic polymorphism with <code>serde</code>. This turned out to be much easier
than I expected, and I thought it was an interesting enough case
study to share, especially for people who are learning Rust.</p>
<h1 id="a-brief-discussion-of-polymorphism-in-rust">A Brief Discussion of Polymorphism in Rust</h1>
<p>As most of you will know, Rust’s system for polymorphism – <code>trait</code>s
– supports both static and dynamic polymorphism, with a bias towards
static polymorphism.</p>
<p>For static polymorphism, it uses the <code>impl</code> keyword, or
alternatively, a syntax called “trait bounds” reminiscent of
C++. It is implemented through “monomorphization,” which
involves making on-demand copies of any polymorphic functions
at compile-time. And it is the default way to use polymorphism
in idiomatic Rust, as evidenced by the fact that it comes <a href="https://doc.rust-lang.org/book/ch10-02-traits.html">earlier
in the Rust book</a>.</p>
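<p>As a small illustration of the two spellings, here is a sketch with a
made-up <code>Priced</code> trait (the trait, the types, and the prices are all
assumptions for the example). Both functions are statically dispatched, and
the compiler generates a separate copy for each concrete type they are called
with:</p>

```rust
// Hypothetical trait, invented for this example.
trait Priced {
    fn price_in_cents(&self) -> u64;
}

struct Apple;
struct Cheese {
    grams: u64,
}

impl Priced for Apple {
    fn price_in_cents(&self) -> u64 {
        50
    }
}

impl Priced for Cheese {
    fn price_in_cents(&self) -> u64 {
        self.grams * 3
    }
}

// `impl Trait` in argument position...
fn with_tax(item: &impl Priced) -> u64 {
    item.price_in_cents() * 110 / 100
}

// ...and an explicit trait bound on a type parameter. Every element of the
// slice must have the same concrete type `T`.
fn total<T: Priced>(items: &[T]) -> u64 {
    items.iter().map(|item| item.price_in_cents()).sum()
}
```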
<p>Dynamic polymorphism, in contrast, uses the <code>dyn</code> keyword to create
“trait objects.” This is implemented through <em>vtables</em>, which are also
how C++ implements OOP-style polymorphism. Even though it is more
of an OOP-style feature, and therefore more familiar to programmers
with an OOP background, in Rust it is less commonly used. This
is evidenced by the fact that it is introduced <a href="https://doc.rust-lang.org/book/ch17-02-trait-objects.html">later in the Rust
book</a>
with a much narrower use case in
a <a href="https://doc.rust-lang.org/book/ch17-00-oop.html">chapter</a> that
encourages a programmer to “implement[] a solution using some of Rust’s
strengths instead.”</p>
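<p>For contrast, here is a sketch of the dynamic version, using the same kind
of made-up <code>Priced</code> trait (again, every name here is an assumption
for illustration). A <code>Box&lt;dyn Priced&gt;</code> carries a pointer to a vtable, so one
collection can hold different concrete types and the method is looked up at
run-time:</p>

```rust
// Hypothetical trait, invented for this example.
trait Priced {
    fn price_in_cents(&self) -> u64;
}

struct Apple;
struct Cheese {
    grams: u64,
}

impl Priced for Apple {
    fn price_in_cents(&self) -> u64 {
        50
    }
}

impl Priced for Cheese {
    fn price_in_cents(&self) -> u64 {
        self.grams * 3
    }
}

// One compiled function handles any mix of `Priced` types; each call to
// `price_in_cents` goes through the trait object's vtable.
fn total(items: &[Box<dyn Priced>]) -> u64 {
    items.iter().map(|item| item.price_in_cents()).sum()
}
```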
<p>The biggest reason dynamic polymorphism is not one of “Rust’s
strengths” is that only <em>object-safe</em> traits can be used with dynamic
polymorphism, due to the technical limitations of <em>vtables</em>. Whether a
trait is “object-safe” is defined by whether it meets a <a href="https://doc.rust-lang.org/reference/items/traits.html#object-safety">long list of
criteria</a>,
which generally get more liberal over time as people agree on how
to address technical limitations, but fundamentally only some traits
<em>can</em> be used with vtables. Additionally, dynamic polymorphism
adds a performance cost, due to indirect calls and fewer optimization
opportunities.</p>
<p>The biggest reason to use dynamic polymorphism in spite of these issues
is when an “object” needs to take on a range of possible values at
run-time that can’t be expressed in an <code>enum</code>, because other code has
to be able to expand the list. As the Rust book points out, this comes
up especially often in GUI programming, where the GUI framework has no
way to enumerate every possible widget and know how to <code>draw</code> it or how
it should handle events.</p>
<h1 id="my-situation">My Situation</h1>
<p>I’m not currently a GUI programmer and I rarely use dynamic
polymorphism. My recent experience before Rust was with Haskell and C++
template programming, and both of those are more similar in style to
Rust’s static polymorphism.</p>
<p>But it still occasionally comes up.</p>
<h2 id="step-0-a-normal-serde-use-case">Step 0: A Normal <code>serde</code> Use Case</h2>
<p>So here was the situation: I had a data structure that I was
serializing into JSON so I could send the JSON over TCP. For the sake
of the blog post, let’s pretend I was sending reports on groceries
as an extremely contrived example:</p>
<pre tabindex="0"><code>pub enum MeatStatus {
    Veg,
    Fish,
    Meat,
}

pub struct CustomerId(pub u64);

pub struct GroceryItem {
    pub description: String,
    pub customer_id: CustomerId,
    pub price_in_cents: u64,
    pub calories: f64,
    pub grams_protein: f64,
    pub grams_carbs: f64,
    pub grams_fat: f64,
    pub grams_alcohol: f64,
    pub meat_status: MeatStatus,
    pub halal: bool,
    pub kosher: bool,
}
</code></pre><p>Now, I not only wanted to send this data out on the wire, but I also
wanted to aggregate it. How many calories was each customer buying, total?
How many customers were vegetarian, pescetarian, or religiously observant?</p>
<p>So I needed to pass this data structure around once I got it from
the cash register (thank you for bearing with this silly example),
and then after extracting some data from it, send it over the wire.</p>
<p>Well, Rust makes this sort of thing easy: “There’s a crate for that.”
In this case, it’s <a href="https://serde.rs/"><code>serde</code></a>, which lets you annotate
data structures for serialization into JSON and other formats. A simple
call to a <code>derive</code> macro makes it implement the <code>serde</code> <a href="https://docs.rs/serde/latest/serde/ser/trait.Serialize.html"><code>Serialize</code>
trait</a>:</p>
<pre tabindex="0"><code>#[derive(Serialize)]
pub enum MeatStatus
...
#[derive(Serialize)]
pub struct CustomerId(pub u64);
#[derive(Serialize)]
pub struct GroceryItem {
...
</code></pre><p>So far, very easy and boring (though we should probably take more time
to appreciate just how amazing <code>serde</code> is, which I will someday write
more about in a dedicated blog post).</p>
<p>I then collect the data from the cash register with a function that looks
like this, as the cash register has a completely different trait-dependent
notion of the food, which is still a static trait because … each cash
register is only for one general category of food, because … it’s
actually a farmer’s market (I’m good at examples!):</p>
<pre tabindex="0"><code>fn extract_grocery_data<T: FarmersMarketStand>(
    customer_id: CustomerId,
    item: &T::Item,
) -> Result<GroceryItem> {
    Ok(GroceryItem {
        description: item.read_description()?,
        customer_id,
        calories: item.calculate_calories()?,
        ...
    })
}
</code></pre><p>Each farmer’s market stand has its own <code>Item</code> type, and the data
from each is extracted and put into this generic structure, so that
I can both process it and send it over the wire. Easy enough!</p>
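<p>The trait definitions behind that function are not shown, so the following
is only a guess at their shape: a self-contained sketch in which
<code>StandItem</code>, <code>Summary</code>, <code>AppleStand</code>, and <code>Apples</code> are
hypothetical names, the error type is simplified to <code>String</code>, and the
calorie figure is invented:</p>

```rust
// Hypothetical trait for the things a stand sells. The method names follow
// the calls in `extract_grocery_data`; the signatures are assumptions.
trait StandItem {
    fn read_description(&self) -> Result<String, String>;
    fn calculate_calories(&self) -> Result<f64, String>;
}

// Each stand names its own item type through an associated type.
trait FarmersMarketStand {
    type Item: StandItem;
}

// Cut-down stand-in for `GroceryItem`.
#[derive(Debug, PartialEq)]
struct Summary {
    description: String,
    calories: f64,
}

// Statically polymorphic: monomorphized once per stand type `T`.
fn extract_summary<T: FarmersMarketStand>(item: &T::Item) -> Result<Summary, String> {
    Ok(Summary {
        description: item.read_description()?,
        calories: item.calculate_calories()?,
    })
}

struct AppleStand;
struct Apples {
    count: u32,
}

impl StandItem for Apples {
    fn read_description(&self) -> Result<String, String> {
        Ok(format!("{} apples", self.count))
    }

    fn calculate_calories(&self) -> Result<f64, String> {
        // Assumed figure of 95 calories per apple, purely for illustration.
        Ok(95.0 * f64::from(self.count))
    }
}

impl FarmersMarketStand for AppleStand {
    type Item = Apples;
}
```

<p>Since <code>T</code> only appears through <code>T::Item</code>, the caller has to name
the stand explicitly, as in
<code>extract_summary::&lt;AppleStand&gt;(&amp;Apples { count: 2 })</code>.</p>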
<h2 id="step-1-a-new-requirement">Step 1: A New Requirement</h2>
<p>I thought this code was well-structured and well-architected, and patted
myself on the back for it! But, as any experienced programmer knows,
the true test of a software architecture is when you get a new
requirement.</p>
<p>It’s when you get a new requirement (including “fix this bug we found”)
that you actually learn if you did a good job with the architecture. It’s
the only objective measure. If you built flexibility in, did it have
anything to do with the new set of requirements? If not, it might
have been over-engineered. Did you make any decisions that made it
unnecessarily inflexible? If so, it might have been poorly engineered. Can
you still even read the code so you can change it? Do you know exactly
where the change fits? Are you tempted to throw the code out and rewrite
it from scratch? Can you even still run it on your machine?</p>
<p>I digress.</p>
<p>The new requirement was quite simple: the farmers wanted us to pipe some
data back to them from the grocery items. They were already connecting
to the TCP stream, but the data we were using to aggregate wasn’t enough.
We had to convey more information in the JSON, and unfortunately, this
information was <code>FarmersMarketStand</code> specific.</p>
<p>Now, we had to add an additional field to our data structure. But
what type should it be? I don’t need to use it for analytics, unlike
the other fields. I just need to get it to the TCP connection so the
farmers can get it right back:</p>
<pre tabindex="0"><code>pub struct GroceryItem {
    ...
    pub halal: bool,
    pub kosher: bool,
    pub market_specific_data: ???
}
</code></pre><p>Now, if I want to use static polymorphism, I have to add a type parameter
to <code>GroceryItem</code>:</p>
<pre tabindex="0"><code>pub struct GroceryItem<T: Serialize> {
    ...
    pub market_specific_data: T,
}
</code></pre><p>But if I do this, I have to keep on parameterizing all my functions
after this on this new type parameter. Besides, this would mean that I can’t
send all the <code>GroceryItem</code>s through a single channel; I have to have
a separate channel per <code>FarmersMarketStand</code>. Maybe I could figure it out,
but I don’t feel like I should have to, and besides, I’m trying not to
have to rearchitect half the program.</p>
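<p>Here is a std-only sketch of how that type parameter spreads. The payload
types <code>AppleData</code> and <code>CheeseData</code> and the helper functions are
hypothetical, and <code>std::sync::mpsc</code> stands in for whatever channel the
real program uses:</p>

```rust
use std::sync::mpsc::{channel, Receiver};

struct GroceryItem<T> {
    description: String,
    market_specific_data: T,
}

// Hypothetical per-stand payloads.
struct AppleData {
    orchard_row: u32,
}
struct CheeseData {
    aged_months: u32,
}

// Every function that touches an item now carries the parameter too.
fn describe<T>(item: &GroceryItem<T>) -> &str {
    &item.description
}

fn drain<T>(rx: Receiver<GroceryItem<T>>) -> Vec<String> {
    rx.iter().map(|item| describe(&item).to_string()).collect()
}

fn run() -> (Vec<String>, Vec<String>) {
    // `GroceryItem<AppleData>` and `GroceryItem<CheeseData>` are different
    // types, so each needs its own channel; sending a cheese item on
    // `apple_tx` would be a compile-time type error.
    let (apple_tx, apple_rx) = channel::<GroceryItem<AppleData>>();
    let (cheese_tx, cheese_rx) = channel::<GroceryItem<CheeseData>>();

    apple_tx
        .send(GroceryItem {
            description: "Honeycrisp".to_string(),
            market_specific_data: AppleData { orchard_row: 7 },
        })
        .unwrap();
    cheese_tx
        .send(GroceryItem {
            description: "Gouda".to_string(),
            market_specific_data: CheeseData { aged_months: 18 },
        })
        .unwrap();

    // Dropping the senders closes the channels so `drain` can finish.
    drop(apple_tx);
    drop(cheese_tx);
    (drain(apple_rx), drain(cheese_rx))
}
```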
<p>An alternative prospect is serializing the data
first, since the only thing I’m going to do with it
is serialize it. Then, I can store it in a serialized
form. <a href="https://docs.rs/serde_json/latest/serde_json/"><code>serde_json</code></a>,
which implements <code>serde</code> support for JSON,
has a type and a function just for this purpose:
<a href="https://docs.rs/serde_json/latest/serde_json/enum.Value.html"><code>serde_json::Value</code></a>
and
<a href="https://docs.rs/serde_json/latest/serde_json/fn.to_value.html"><code>serde_json::to_value</code></a>.</p>
<p>That gives us something like this:</p>
<pre tabindex="0"><code>pub struct GroceryItem {
...
pub market_specific_data: serde_json::Value,
}
fn extract_grocery_data<T: FarmersMarketStand>(
customer_id: CustomerId,
item: &T::Item,
)-> Result<GroceryItem> {
let market_specific_data = item.read_market_specific_data()?;
let market_specific_data = serde_json::to_value(&market_specific_data)?;
...
</code></pre><p>The problem here is, the farmers only connect to the TCP connection
maybe 10% of the time, and the rest of the time, I don’t want to pay
the extra cost of serialization. Plus, I don’t want to pay the cost of
serializing to this intermediate format, and then to JSON, rather
than serializing directly to JSON.</p>
<h2 id="step-2-dynamic-polymorphism">Step 2: Dynamic Polymorphism</h2>
<p>Now, you might be having an idea right now: Why not use dynamic polymorphism?
This way we can have a little blob that means “I know how to serialize
myself,” but we only have to do the serialization if it actually comes up.
We don’t have to know anything else about the blob, nor do we have to
pass the type all over the place at compile-time with all the baggage
that comes with that.</p>
<p>So you write something like this:</p>
<pre tabindex="0"><code>pub struct GroceryItem {
...
pub market_specific_data: Box<dyn Serialize>,
}
</code></pre><p>… and you find out that <code>Serialize</code> is not
object-safe. You look up the <a href="https://docs.rs/serde/latest/serde/ser/trait.Serialize.html">docs for the <code>Serialize</code>
trait</a>,
and lo and behold! It’s got one method:</p>
<pre tabindex="0"><code>fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
    S: Serializer;
</code></pre><p>Well, why isn’t this object-safe? Well, at a Rust level it’s one
method, but it’s a method that uses static polymorphism. At a Rust level,
we might think we just need to store what method to call at run-time,
but actually, by the time we get to run-time, this isn’t a single
method anymore. It will have been <em>monomorphized</em> into a method
per every possible value of <code>S</code>, every possible serializer.</p>
<p>Now, we’re only using the JSON serializer, but there’s no way for
the method to know that. To make a vtable for this method, Rust
would have to write down an implementation of this method for
every possible serializer, which is too many and not a well-defined
set.</p>
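<p>Here is a small standard-library-only sketch of the same situation. The
<code>Sink</code> and <code>Emit</code> traits are hypothetical stand-ins for
<code>Serializer</code> and <code>Serialize</code>, much simpler than the real ones,
but <code>Emit</code> has the same shape: one method in the source, generic over
the sink.</p>

```rust
/// Hypothetical stand-in for serde's `Serializer`: somewhere output can go.
trait Sink {
    fn put(&mut self, text: &str);
}

struct Plain(String);
impl Sink for Plain {
    fn put(&mut self, text: &str) {
        self.0.push_str(text);
    }
}

struct Shouting(String);
impl Sink for Shouting {
    fn put(&mut self, text: &str) {
        self.0.push_str(&text.to_uppercase());
    }
}

/// Hypothetical stand-in for serde's `Serialize`: a single generic method.
trait Emit {
    fn emit<S: Sink>(&self, sink: &mut S);
}

struct Apple;
impl Emit for Apple {
    fn emit<S: Sink>(&self, sink: &mut S) {
        sink.put("Gala");
    }
}

/// Each call below is compiled as its own copy of `Apple::emit`.
fn emit_both_ways() -> (String, String) {
    let mut plain = Plain(String::new());
    let mut loud = Shouting(String::new());
    Apple.emit(&mut plain); // instantiates `emit::<Plain>`
    Apple.emit(&mut loud); // instantiates `emit::<Shouting>`
    (plain.0, loud.0)
}

// The compiler rejects the following (error E0038), because a vtable cannot
// hold an open-ended family of `emit::<S>` functions:
// fn boxed() -> Box<dyn Emit> { Box::new(Apple) }
```

<p>Two sinks, two compiled copies of the method. Any downstream crate could
add a third sink, so there is no finite list to put in a vtable.</p>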
<p>OK, well, you might think, why not take advantage of the fact
that we’re just using the JSON serializer? Why not write this:</p>
<pre tabindex="0"><code>trait JsonSerialize {
    // `serde_json::Serializer` is generic over its writer, so pick one
    // concrete writer (`Vec<u8>` here) to keep the method non-generic.
    fn json_serialize(
        &self,
        serializer: &mut serde_json::Serializer<Vec<u8>>,
    ) -> Result<(), serde_json::Error>;
}
</code></pre><p>This trait is like <code>Serialize</code>, but because it no longer uses
static polymorphism, it’s now object-safe. Only one run-time method
is needed per implementing type.</p>
<p>Well, how do we implement this trait? <code>Serialize</code> has a <code>derive</code>
macro, but <code>JsonSerialize</code> does not. However, a type’s
<code>JsonSerialize</code> implementation could just call the <code>Serialize</code>
implementation. And rather than making every farmer at the market do
this for their own type, we can use a blanket implementation
that says if a value is <code>Serialize</code>, it’s also <code>JsonSerialize</code>:</p>
<pre tabindex="0"><code>impl<T> JsonSerialize for T where T: Serialize {
    fn json_serialize(
        &self,
        serializer: &mut serde_json::Serializer<Vec<u8>>,
    ) -> Result<(), serde_json::Error> {
        // No trailing semicolon: the `Result` is the return value.
        self.serialize(serializer)
    }
}
</code></pre><p>So we can have all the trait implementations for the object-safe
trait be implemented using static polymorphism based on the non-object-safe
trait. This is a common pattern and it’s known as <em>type erasure</em>,
because you’ve erased all the <code><T: Serialize></code> you would otherwise
need everywhere you mentioned the <code>GroceryItem</code> type.</p>
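<p>The same pattern can be sketched without <code>serde</code> at all. Everything
below is hypothetical and standard-library-only: <code>Emit</code> plays the role
of the generic, non-object-safe trait, and <code>TextEmit</code> is its object-safe
twin, pinned to one concrete sink and implemented for everything at once by a
blanket implementation.</p>

```rust
/// Hypothetical stand-in for serde's `Serializer`.
trait Sink {
    fn put(&mut self, text: &str);
}

/// The one concrete sink we actually care about in this sketch.
struct TextSink(String);
impl Sink for TextSink {
    fn put(&mut self, text: &str) {
        self.0.push_str(text);
    }
}

/// Generic trait: fine for static polymorphism, unusable as `dyn Emit`.
trait Emit {
    fn emit<S: Sink>(&self, sink: &mut S);
}

impl Emit for u32 {
    fn emit<S: Sink>(&self, sink: &mut S) {
        sink.put(&self.to_string());
    }
}

impl Emit for String {
    fn emit<S: Sink>(&self, sink: &mut S) {
        sink.put(self);
    }
}

/// Object-safe twin: no type parameters, one vtable entry per type.
trait TextEmit {
    fn text_emit(&self, sink: &mut TextSink);
}

/// Blanket implementation: anything that is `Emit` is also `TextEmit`.
impl<T: Emit> TextEmit for T {
    fn text_emit(&self, sink: &mut TextSink) {
        self.emit(sink)
    }
}

/// Stores values of different types behind one erased type and emits them.
fn emit_mixed_values() -> String {
    let values: Vec<Box<dyn TextEmit>> =
        vec![Box::new(30u32), Box::new(String::from("Gala"))];
    let mut sink = TextSink(String::new());
    for value in &values {
        value.text_emit(&mut sink);
        sink.put(";");
    }
    sink.0
}
```

<p>The <code>Vec</code> holds a <code>u32</code> and a <code>String</code> side by
side, and nothing that handles it needs a type parameter.</p>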
<p>However, this isn’t very good, because we want to use this as
part of a serializable structure:</p>
<pre tabindex="0"><code>#[derive(Serialize)]
pub struct GroceryItem {
...
pub market_specific_data: Box<dyn JsonSerialize>,
}
</code></pre><p>See, when the <code>Serialize</code> derive macro gets to the <code>market_specific_data</code>
field, it finds that the field’s type doesn’t implement <code>Serialize</code>. The type just implements <code>JsonSerialize</code>,
since that’s how we made it object-safe. However, the macro is trying to implement
<code>Serialize</code> on <code>GroceryItem</code> – for all serializers – and it’s never
heard of <code>JsonSerialize</code>.</p>
<h2 id="step-3-theres-a-crate-for-that">Step 3: There’s a crate for that!</h2>
<p>At this point, I thought: There’s got to be a way to entirely
type-erase <code>Serialize</code>. The problem with the method in
<code>Serialize</code> is that it’s passed in a statically polymorphic
<code>Serializer</code> – but what if we type-erased <code>Serializer</code>? The
problem with that is <code>Serializer</code> has like a <a href="https://docs.rs/serde/latest/serde/ser/trait.Serializer.html">bajillion
methods</a>,
so we’d have to deal with all of them in our type-erased
version.</p>
<p>My conclusion? It’s possible, but it’d be a <em>lot</em> of work, so much
that it might well be its own crate. And when you have that thought,
well, one possibility is that crate may already exist.</p>
<p>And lo and behold, it does! Allow me to introduce the excellent
<a href="https://github.com/dtolnay/erased-serde"><code>erased-serde</code></a>
by <a href="https://github.com/dtolnay">David Tolnay</a>. It does all of the
work of type erasure for all of <code>serde</code>, and if you’re new to
type erasure, the code is worth a read. It even uses macros!</p>
<p>It called its type-erased trait <code>Serialize</code>, which layered on top of
the non-type erased trait, called <code>Serialize</code>. If your type implemented
<code>Serialize</code>, it automatically implemented <code>Serialize</code> due to a blanket
implementation, which was great, because then you could write <code>Box<dyn Serialize></code>, and would you know that <code>dyn Serialize</code> also had an
implementation for <code>Serialize</code> already done?</p>
<pre tabindex="0"><code>use erased_serde::Serialize as ErasedSerialize;
</code></pre><p>I mean to say: If your type implemented <code>Serialize</code>, it automatically
implemented <code>ErasedSerialize</code> due to a blanket implementation, which
was great, because then you could write <code>Box<dyn ErasedSerialize></code>, and
would you know that <code>dyn ErasedSerialize</code> also had an implementation for
<code>Serialize</code> already done?</p>
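<p>To show the shape of that round trip, here is a toy version using only the
standard library. It is a sketch of the idea under simplifying assumptions, not
how <code>erased-serde</code> is actually written: <code>Sink</code>,
<code>Emit</code>, and <code>ErasedEmit</code> are hypothetical stand-ins with one
method each, where the real <code>Serializer</code> has dozens.</p>

```rust
/// Hypothetical stand-in for serde's `Serializer`.
trait Sink {
    fn put(&mut self, text: &str);
}

/// Forwarding impl, so that `&mut dyn Sink` is itself a `Sink`.
impl<S: Sink + ?Sized> Sink for &mut S {
    fn put(&mut self, text: &str) {
        (**self).put(text)
    }
}

struct Plain(String);
impl Sink for Plain {
    fn put(&mut self, text: &str) {
        self.0.push_str(text);
    }
}

/// Hypothetical stand-in for serde's `Serialize` (generic, not object-safe).
trait Emit {
    fn emit<S: Sink>(&self, sink: &mut S);
}

impl Emit for u32 {
    fn emit<S: Sink>(&self, sink: &mut S) {
        sink.put(&self.to_string());
    }
}

impl Emit for &'static str {
    fn emit<S: Sink>(&self, sink: &mut S) {
        sink.put(self);
    }
}

/// Erased twin: the sink is a trait object, so no type parameter remains.
trait ErasedEmit {
    fn erased_emit(&self, sink: &mut dyn Sink);
}

/// Direction 1: everything that is `Emit` is automatically `ErasedEmit`.
impl<T: Emit> ErasedEmit for T {
    fn erased_emit(&self, mut sink: &mut dyn Sink) {
        self.emit(&mut sink)
    }
}

/// Direction 2: the trait object implements the generic trait again.
impl Emit for dyn ErasedEmit {
    fn emit<S: Sink>(&self, sink: &mut S) {
        self.erased_emit(sink)
    }
}

/// Boxes forward to their contents, as serde does for `Box<T>`.
impl<T: Emit + ?Sized> Emit for Box<T> {
    fn emit<S: Sink>(&self, sink: &mut S) {
        (**self).emit(sink)
    }
}

struct Item {
    name: &'static str,
    extra: Box<dyn ErasedEmit>,
}

/// Roughly what a derive macro would generate: it only knows about `Emit`.
impl Emit for Item {
    fn emit<S: Sink>(&self, sink: &mut S) {
        sink.put(self.name);
        sink.put(": ");
        self.extra.emit(sink);
    }
}

fn render<E: Emit>(value: &E) -> String {
    let mut sink = Plain(String::new());
    value.emit(&mut sink);
    sink.0
}

fn render_items() -> Vec<String> {
    let items = vec![
        Item { name: "Apples", extra: Box::new(30u32) },
        Item { name: "Bacon", extra: Box::new("Stolzfus and Sons") },
    ];
    items.iter().map(|item| render(item)).collect()
}
```

<p>The generic code in <code>render</code> and in <code>Item</code>’s
implementation never mentions the erased trait; it just sees one more type that
implements <code>Emit</code>.</p>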
<p>This meant, all in all, that I could write this:</p>
<pre tabindex="0"><code>#[derive(Serialize)]
pub struct GroceryItem {
...
pub market_specific_data: Box<dyn ErasedSerialize>,
}
fn extract_grocery_data<T: FarmersMarketStand>(
customer_id: CustomerId,
item: &T::Item,
)-> Result<GroceryItem> {
Ok(GroceryItem {
description: item.read_description()?,
customer_id,
calories: item.calculate_calories()?,
...
market_specific_data: Box::new(item.read_market_specific_data()?),
})
}
</code></pre><p>The cast from <code>Box<impl Serialize></code> to <code>Box<dyn ErasedSerialize></code> is
implicit, and <code>Box<dyn ErasedSerialize></code> implements <code>Serialize</code>, so
the <code>derive</code> macro is happy!</p>
<p>Voilà!</p>
<p>The code is available in a <a href="https://github.com/jhartzell42/groceries-contrived-example">GitHub
repo</a> and
the output shows the power of Rust polymorphism:</p>
<pre tabindex="0"><code>[jim@palatinate:~/hobby/groceries]$ cargo run | jq .
Finished dev [unoptimized + debuginfo] target(s) in 0.00s
Running `target/debug/groceries`
[
{
"description": "Apples",
"customer_id": 0,
"price_in_cents": 3,
"calories": 10,
"grams_protein": 10,
"grams_carbs": 10,
"grams_fat": 10,
"grams_alcohol": 10,
"meat_status": "Veg",
"halal": true,
"kosher": true,
"market_specific_data": {
"variety": "Gala",
"doctors_kept_away": 30
}
},
{
"description": "Bacon",
"customer_id": 1,
"price_in_cents": 3000,
"calories": 10,
"grams_protein": 10,
"grams_carbs": 10,
"grams_fat": 10,
"grams_alcohol": 10,
"meat_status": "Meat",
"halal": false,
"kosher": false,
"market_specific_data": {
"farm_of_origin": "Stolzfus and Sons",
"breakfasts_served": 15
}
}
]
</code></pre><h2 id="step-4-bonus-round-another-requirement">Step 4: Bonus Round: Another Requirement</h2>
<p>Does this work well? Let’s see how a new requirement can be dealt with!</p>
<p>So next I learn that I have to implement <code>Clone</code> on <code>GroceryItem</code>,
for some of the processing code where we do the data metrics.</p>
<p>I might think, well, this should be easy! I have a <code>Box</code>, and
I never write to the inner value, so I just need a cloneable <code>Box</code>,
an <code>Arc</code>. Then, I can <code>#[derive(Clone)]</code>, and the <code>market_specific_data</code>
field will just be multiply-owned.</p>
<p>But, alas, no! This error appears:</p>
<pre tabindex="0"><code>error[E0277]: the trait bound `Arc<dyn erased_serde::Serialize>: _::_serde::Serialize` is not satisfied
</code></pre><p>Why does this work for <code>Box<dyn ErasedSerialize></code> and not
<code>Arc<dyn ErasedSerialize></code>? Well, this is actually quite
straightforward: There is an <a href="https://docs.rs/serde/latest/serde/ser/trait.Serialize.html#impl-Serialize-for-Box%3CT%3E">implementation of <code>Serialize</code> for
<code>Box<T></code></a>
when <code>T</code> implements <code>Serialize</code>, part of the <code>serde</code> crate. It does
not exist for <code>Arc</code>.</p>
<p>I know that I can’t do the same in my own crate, but for <code>Arc</code> instead
of <code>Box</code>:</p>
<pre tabindex="0"><code>impl<T> Serialize for Arc<T>
where
T: Serialize,
{
#[inline]
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
{
(**self).serialize(serializer)
}
}
</code></pre><p>– because that would violate the dreaded “orphan rule”:</p>
<pre tabindex="0"><code>error[E0117]: only traits defined in the current crate can be implemented for types defined outside of the crate
--> src/main.rs:191:1
|
191 | impl<T> Serialize for Arc<T>
| ^ ------ `Arc` is not defined in the current crate
| _|
| |
192 | | where
193 | | T: Serialize,
194 | | {
... |
201 | | }
202 | | }
| |_^ impl doesn't use only types from inside the current crate
|
= note: define and implement a trait or new type instead
</code></pre><p>But if we know the orphan rule well, or just read the note in the error
message, we know that we can get around it with… you guessed it,
a <em>newtype</em>!</p>
<p><em>Newtype</em>s are named after the Haskell keyword <code>newtype</code>, though in Rust
they don’t use that keyword, so we refer to the “newtype pattern.” In both
Haskell and Rust, they’re the standard way to get around the orphan rule.
The premise is simple: We define a new type that is distinct to the
compiler (so we can’t use <code>type</code>) but not practically distinct. It’s
generally implemented in Rust as a tuple-<code>struct</code> with one field.</p>
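<p>As a generic illustration of the pattern, separate from the
<code>serde</code> case, here is a sketch with the standard library only.
<code>Prices</code> is a hypothetical newtype; <code>Display</code> and
<code>Arc<Vec<u32>></code> are both foreign to my crate, so I could not implement
one for the other directly, but I can for the wrapper.</p>

```rust
use std::fmt;
use std::sync::Arc;

// Not allowed in my crate, since neither the trait nor the type is local:
// impl fmt::Display for Arc<Vec<u32>> { ... } // error[E0117]

/// Newtype: a tuple struct with one field. It is local to this crate,
/// so foreign traits may be implemented for it.
#[derive(Clone)]
struct Prices(Arc<Vec<u32>>);

impl fmt::Display for Prices {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let parts: Vec<String> = self
            .0
            .iter()
            .map(|cents| format!("{}c", cents))
            .collect();
        write!(f, "{}", parts.join(", "))
    }
}

fn show_prices_twice() -> String {
    let prices = Prices(Arc::new(vec![3, 3000]));
    // Cloning the newtype only bumps the `Arc` reference count.
    let shared = prices.clone();
    format!("{} / {}", prices, shared)
}
```

<p>At run-time the wrapper costs nothing; it exists purely so the compiler
sees a type that belongs to my crate.</p>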
<p>There’s two ways to go with this, as this <a href="https://blog.eizinger.io/8593/generic-newtypes-a-way-to-work-around-the-orphan-rule">blog
post</a>
indicates (which surprisingly enough is also about <code>serde</code>!). We can
try and fix the <code>Arc<T></code> problem for everybody with a generic newtype,
or just for ourselves with a regular ol’ newtype.</p>
<p>Here’s how the regular newtype solution looks:</p>
<pre tabindex="0"><code>#[derive(Clone)]
pub struct MarketSpecificData(Arc<dyn ErasedSerialize>);
impl Serialize for MarketSpecificData {
#[inline]
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
{
(*self.0).serialize(serializer)
}
}
#[derive(Serialize, Clone)]
pub struct GroceryItem {
pub description: String,
pub customer_id: CustomerId,
...
pub kosher: bool,
pub market_specific_data: MarketSpecificData,
}
</code></pre><p>If your takeaway at this point is that writing <code>trait</code>-heavy code
involves a lot of functions that call other functions with the same
name and almost the same arguments, you’re not wrong.</p>
<p>However, in this case it was all unnecessary, as it turns out that
we <em>can</em> get support for <code>Arc<T></code> from <code>serde</code> itself if we enable
the <code>rc</code> feature:</p>
<pre tabindex="0"><code>[jim@palatinate:~/hobby/groceries-contrived-example]$ cargo add --features rc,derive serde
Updating crates.io index
Adding serde v1.0.143 to dependencies.
Features:
+ derive
+ rc
+ serde_derive
+ std
- alloc
- unstable
</code></pre><p>Both versions are available in the example repo’s <a href="https://github.com/jhartzell42/groceries-contrived-example/pull/1"><code>clone</code> branch</a>.</p>
Write Everything Down (Part 1)
https://www.thecodedmessage.com/posts/write-everything-down/
2022-08-10T00:00:00+00:00
<h1 id="memory-leak">Memory Leak</h1>
<p>I have an excellent memory. I have a terrible memory.</p>
<p>Well, which one is it?</p>
<p>This is a confusing state to be in. It can be frustrating to people
around me. How is it – my father used to ask me when I was in high
school – that I could remember all the lessons and readings for my tests
in school, and get all the good grades, but couldn’t ever remember to do
the simplest task or household chore, or to bring with me the simplest
item? And of course the fact that I remember these conversations from
so long ago is a bit of a case in point.</p>
<p>So I’d like to introduce a distinction between different kinds of memory,
a technical distinction made by mnemologists, which is what I wish people
who studied memory were called:</p>
<ul>
<li><strong>Retrospective memory</strong> is the memory we normally think of when we
hear the word “memory”: it is the memory of <em>what was</em>. It is the
type of memory that you use to look through the past. It allows you
to recollect when you first met someone, or to explain the Fundamental
Theorem of Calculus after a decade. Think of it like a database, where
you can make queries against it.</li>
<li><strong>Prospective memory</strong> is less famous but equally important and equally
deserving of the word “memory”: it is the memory of <em>what to do</em>. It is
the type of memory that you use to make sure all your responsibilities
and goals are handled. It allows you to remember to pick up milk for
your partner, or to copy-edit before publishing your blog post. Think
of it like the notifications from all the annoying apps on your phone;
rather than you querying it, this type of memory makes requests of you.</li>
</ul>
<p>My retrospective memory is solid, even legendary. I remember obscure
conversations from years, even decades, ago – and I still feel some
type of way about them! More privately but more troublingly, I remember
embarrassing things I’ve done all the way back to Kindergarten, and
still feel some type of way about those too. Whatever part of my mind
is responsible for this type of memory and its concomitant long-term
annoyance and shame is in perfect working order – in fact, I think I’d
rather like to get hit in the exact right spot in the head to take that
part of my brain down a peg! (Although, to be fair to that portion of my
brain, it did help quite a bit with exams when that was part of my life,
and it helps a lot now with remembering essential programming facts for
my job, not to mention essential linguistics facts to tell my friends.)</p>
<p>My prospective memory, on the other hand… Well, I never remember to
bring with me the things I need, prompting my German professor
in college to ask me, “Ist es der Fall, dass du <em>niemals</em> deinen
Kuli zur Klasse bringst?” (Is it the case that you <em>never</em> bring your
pen to class?) It was indeed; I <em>never</em> did.</p>
<p>I will even regularly forget to brush my teeth in the morning, only to
remember when I smell my own breath. I could put up a post-it note
to remind myself, but I would forget to look at the post-it note.</p>
<p>I always used to forget to take my lunchbox from one class to another
in school, so my mother ended up buying extra lunchboxes, and my lost
lunchboxes wandering the school became such a trope that one group of
people started calling me “Lunchbox” after how many times I’d abruptly
run to a previous class to get one, or claim an old one that someone
found with stinking spoiled food.</p>
<p>And those “senior moments” my parents and grandparents would joke about
when I was a kid, where you’d walk into a room and forget what you
were going to do? I felt like I was having those as often as an actual
senior at 16, or even at 12… Imagine how I’ll be when they actually
are age appropriate!</p>
<p>But more importantly, for me, this gets me in all the little things
of life: the chores and the errands, and all the baby steps towards
achieving my goals. I may have to check my mail, because a check is coming –
or even if there is no check, if your mailbox is full they eventually
stop bringing the mail.</p>
<p>I may have to make sure that when I next travel to my hometown, I bring along
a book that I have to give back to a friend there, because people don’t
understand how likely I am to fail at that task, insist it’s easy and
that I’ll be able to do it, and then get really frustrated – and I feel
really bad – when their book that they intended to have me deliver ends
up disappearing to whatever afterlife objects go to when I no longer am
exactly sure where they are. It’s bad enough when my own stuff goes there.</p>
<h1 id="write-everything-down">Write Everything Down</h1>
<p>I can whine all day about my particularly – even clinically –
bad (prospective) memory. But I also know that even people with
prospective memory much better than mine can benefit from organizational
techniques. That’s why civilizations throughout history have created
technology to augment our memory – both prospective and retrospective.</p>
<p>The biggest, grandest, most impactful such technology – the
one that not only changed the course of history, but
allowed history as we know it to even be possible – is of course
writing. Whether implemented by using reeds to stamp <a href="https://en.wikipedia.org/wiki/Cuneiform">symbols into
clay</a> or using pens and
ink to draw <a href="https://en.wikipedia.org/wiki/Alphabet">letters</a>
on parchment or vellum or paper, or making
<a href="https://en.wikipedia.org/wiki/Printing_press">stamps</a>
for those same letters, or storing the symbols in a <a href="https://en.wikipedia.org/wiki/Unicode">binary
encoding</a> in computer memory,
writing has spanned many technological and civilizational eras to augment
our brains, to help us record bits of language, whether retrospective or
prospective, whether it’s about things that <em>have</em> happened, or things
that have <em>to</em> happen, beyond our natural brain capacity.</p>
<p>There are deep social consequences to this technology we call “writing.”
Since it augments both prospective and retrospective memory, and literacy
is now commonplace and expected, standards have risen in society. A
more and more complex society – with its chores and bureaucratic
paperwork – requires more and more prospective memory to handle it.
The advances of writing have been almost entirely consumed by increases
in societal demand for organization. More and more jobs have become
abstract instead of concrete, and also have required more and more
prospective memory. Since writing is available to everyone, the ability
to use it well has transitioned from being an edge and an advantage to
being a necessity. For many jobs and many lives, even normal or good
prospective memory is no longer enough without the aid of writing.</p>
<p>As someone with poor prospective memory, I have personally found writing
to be invaluable. But it hasn’t always come naturally to me to use
it as intensely as I have to. Most organizational practices assume a
certain baseline of prospective memory and focus, often higher than I
naturally have. Therefore, the way I use it can be peculiar, sometimes
even intrusive: It is only recently that I’ve become comfortable using
my system fully in my social life, with the confidence to pause my
friend if they casually say something that generates a TODO item, so
I can send myself an e-mail and make sure that that TODO item actually
happens. Waiting till the end of the conversation like a normal person
won’t work for me, and certainly neither will “just remembering” to do
the thing.</p>
<p>See, I need to write <em>everything</em> down, every little obligation I incur,
and a lot of things that I feel the need to write TODO e-mails for are
things that other people would naturally remember. But that’s the thing:
I won’t. Or at least, I don’t trust myself to. And I trust my judgment
about my memory better than my friends’. So please excuse me as I whip
out my phone to send myself an e-mail – I’ve gotten very fast at it.</p>
<p>And because text messages cannot be marked unread, often just asking
for a text message with a recommendation (of something to read or listen to)
isn’t enough. I often need to write a corresponding e-mail to myself to
remember to go find that text message and actually do the thing.
So it’s not just “text me the article,” it’s “text me the article while
I make a note of it on my phone through e-mail.”</p>
<h1 id="you-need-a-system">You Need a System</h1>
<p>Just “use writing” (or its digital equivalents) isn’t a complete answer.
And it wasn’t just a lack of confidence about interrupting conversations
to write things down that was holding me back before. There are a lot
of other questions that need to be answered:</p>
<ul>
<li>Where should I write things?</li>
<li>What apps should I use, if any?</li>
<li>How should these notes be organized?</li>
<li>What tasks are required to keep them organized that way?</li>
</ul>
<p>As I said before, if I were to just write everything on post-it notes,
I would forget to look at the post-it notes. There’s a reason that “buy
a planner” is widely panned as bad ADHD advice – ADHD makes using the
planner hard, and the details of how you use that planner are equally,
if not more, important. There needs to be a <em>system</em>, and that system
comes with additional <em>chores</em>, and since this is the system that tells
you what other chores to do, those chores need to become a <em>habit</em>.</p>
<p>As everyone’s brain works differently (whether ADHD or not), people differ
tremendously in what their ideal organizational systems are. For me,
I am much less productive if I have a less than ideal system – the
stakes are very high. But even for people who can be productive on any
system, I think that tailoring their system to their brain, their lifestyle,
their job and schedule and hobbies, can have amazing results.</p>
<h1 id="future-posts">Future Posts</h1>
<p>In my next <a href="https://www.thecodedmessage.com/tags/organization">organization</a> post, I shall go over some
systems that <em>don’t</em> work for me, and why they don’t. Finally, I shall
lay out my system, some of which I have already discussed in a
<a href="https://www.thecodedmessage.com/posts/crank-em-out/">previous post</a>.</p>
Blocking Sockets and Async
https://www.thecodedmessage.com/posts/blocking-sockets/
2022-08-08T00:00:00+00:00
<p>Using <code>async</code> in Rust can lead to bad surprises. I recently came across
a particularly gnarly one, and I thought it was interesting enough to
share a little discussion. I think that we are too used to the burden
of separating <code>async</code> from blocking being on the programmer, and
Rust can and should do better, and so can operating system APIs,
especially in subtle situations like the one I describe here.</p>
<p>Every <code>async</code> programmer learns early on not to call a blocking function
from an <code>async</code> function. If you do, it is a hidden color violation,
as I discuss in a <a href="https://www.thecodedmessage.com/posts/async-colors">previous post</a>. By “hidden,” I
mean that unlike other color violations, Rust gives you no compile-time
help. You just have to use discipline. You just have to “make sure not
to do it.” You just have to increase your cognitive load. It is a rule
that the computer is no help with – which means that you’ll definitely
mess it up at some point, possibly at many points.</p>
<p>Unfortunately, it’s also a gnarly problem to debug. The actual blocking
function call will quite possibly work just fine. It’ll return when
the resource is ready, and block until then – probably exactly what
you wanted. It’s the rest of the system that falls apart – other
tasks on the same thread starve, tasks that are depending on them
for progress also starve, but meanwhile other tasks might proceed
without a problem. Worse, there’s no guarantee that the
bug will manifest every time, so the bug isn’t readily
<a href="https://www.thecodedmessage.com/posts/reproducibility">reproducible</a>.</p>
<p>You might think this is an easy problem to address, either through
improvements in the programming language or better programming
discipline.</p>
<p>At a programming language level, you could imagine Rust having
some sort of generalization of <code>unsafe</code>, or maybe an effects system.
Functions that block would have <code>blocking</code> as part of their
signature. Calling a <code>blocking</code> function from an <code>async</code>
function would then be an error, with a way out for functions like
<a href="https://docs.rs/tokio/latest/tokio/task/fn.spawn_blocking.html"><code>spawn_blocking</code></a>.</p>
<p>Unfortunately, Rust doesn’t have this feature, so we have to rely on
programmer discipline. The discipline seems easy enough: If you’re in
an async function, and you call a function that’s going to take some
time or do I/O, make sure you’re doing an async call, which in most
cases means using the <code>.await</code> keyword.</p>
<p>Unfortunately, this doesn’t work 100% of the time, because the operating
system isn’t on board. There are system calls that block sometimes, based
on dynamic configuration. Does the <code>recv</code> system call block? Well, that
depends on whether the socket is a blocking socket, or a non-blocking
socket. Fundamentally, <code>recv</code> is run-time polymorphic on socket type,
in a way that makes it a different <a href="https://www.thecodedmessage.com/posts/async-colors"><em>color</em></a> based
on run-time information.</p>
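<p>You can see this with nothing but the standard library. The sketch below
(the function name is my own, and it assumes a loopback interface is available)
calls the very same <code>recv_from</code> method that would normally wait for
data, but on a socket that has been flipped to non-blocking mode first.</p>

```rust
use std::io;
use std::net::UdpSocket;

/// Calls `recv_from` on an empty socket after switching it to non-blocking
/// mode, and reports which kind of error comes back.
fn recv_on_nonblocking_socket() -> io::Result<io::ErrorKind> {
    let socket = UdpSocket::bind("127.0.0.1:0")?;

    // This run-time flag, not the method name or the Rust type, decides
    // whether the call below waits for data.
    socket.set_nonblocking(true)?;

    let mut buf = [0u8; 16];
    match socket.recv_from(&mut buf) {
        // Nobody knows this port, so data arriving would be a surprise.
        Ok((len, _)) => Err(io::Error::new(
            io::ErrorKind::Other,
            format!("unexpectedly received {} bytes", len),
        )),
        Err(err) => Ok(err.kind()),
    }
}
```

<p>Delete the <code>set_nonblocking</code> line and the same call would sit
there waiting for a datagram. Nothing in the signature of <code>recv_from</code>
tells you which behavior you will get.</p>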
<p>This is bad design: BSD should have split <code>recv</code> into two system calls,
<code>recv</code> and <code>recv_nonblock</code>. <code>recv</code> could error if given a non-blocking
socket, and <code>recv_nonblock</code> could error if given a blocking one.
Linux at least has a flag <code>MSG_DONTWAIT</code> that makes an individual
<code>recv</code> call unconditionally non-blocking, but it’s non-standard. It’s
not supported on macOS and <code>tokio</code>/<code>mio</code> understandably doesn’t use it.</p>
<p>Most of the time, this isn’t an issue. Sockets controlled through <code>tokio</code>
or other async runtimes are always configured with the operating system to
be non-blocking, as an invariant on those socket types. Sockets controlled
through <code>std</code> or other libraries will be blocking, and will be contained
in completely different Rust types. The Rust type system is used to keep
track of the distinction even if the operating system won’t.</p>
<p>But this becomes an issue where these boundaries are
broken, namely in conversion functions between them. These
methods then have whether or not a socket is blocking
as part of their contract. For example, the documentation for
<a href="https://docs.rs/tokio/latest/tokio/net/struct.TcpStream.html#method.from_std"><code>TcpStream::from_std</code></a>
says:</p>
<blockquote>
<p>This function is intended to be used to wrap a TCP stream from the
standard library in the Tokio equivalent. The conversion assumes nothing
about the underlying stream; it is left up to the user to set it in
non-blocking mode.</p>
</blockquote>
<p>Thus, as a precondition of calling the <code>from_std</code> function, you
must pass a “non-blocking” socket. If you instead did not set the
socket as non-blocking – perhaps because you were making it with some
extra options you needed, but assumed that <code>tokio</code> would handle
the non-blocking part – bad things happen.</p>
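<p>Meeting the precondition is one line, once you know it exists. Here is a
sketch that uses only the standard library, so the actual hand-off to
<code>tokio</code> appears only as a comment; the function names are hypothetical,
and the loopback connection is there just to have a real stream to configure.</p>

```rust
use std::io::{self, Read};
use std::net::{TcpListener, TcpStream};

/// Builds a connected std stream and puts it in the state that
/// `from_std` leaves up to the caller: non-blocking mode.
fn make_nonblocking_stream() -> io::Result<(TcpStream, TcpStream)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let client = TcpStream::connect(listener.local_addr()?)?;
    let (peer, _addr) = listener.accept()?;

    // The step that is easy to forget when building the socket by hand.
    client.set_nonblocking(true)?;

    // With tokio as a dependency, the hand-off would follow here:
    // let client = tokio::net::TcpStream::from_std(client)?;
    Ok((client, peer))
}

/// Checks that a read with no data pending returns instead of waiting.
fn read_returns_immediately() -> io::Result<bool> {
    // Keep `_peer` alive so the connection stays open during the read.
    let (mut client, _peer) = make_nonblocking_stream()?;
    let mut buf = [0u8; 8];
    match client.read(&mut buf) {
        Err(err) if err.kind() == io::ErrorKind::WouldBlock => Ok(true),
        _ => Ok(false),
    }
}
```

<p>Without the <code>set_nonblocking(true)</code> call, that read would wait
for the peer to send something, which is exactly what an async runtime’s
worker thread must never do.</p>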
<p>If blocking were considered a safety issue, this function would
be marked <code>unsafe</code>. But it’s not, and so it’s simply an unchecked
precondition – and we’re not used to those in Rust. Most
safe functions check their preconditions, either returning
a special value (like an <code>Err</code>) or panicking if something is wrong.
The ones that don’t are typically marked <code>unsafe</code>. Unchecked preconditions
still exist – they cause rogue behavior but not behavior deemed
“unsafe” under Rust’s definition – but they are rare, and
therefore surprising to a Rust programmer.</p>
<p>Why is it not a checked precondition? That’s easy to answer: Checking
it would take an extra system call, as would unconditionally setting the
socket to non-blocking inside <code>from_std</code> itself. System calls are slow, and that
would be an unacceptable performance penalty for many applications.</p>
<p>This leads to a disappointing end result, though. It’s not enough
to simply make sure you don’t call I/O methods unless they come
with an <code>async</code> version. To be disciplined enough to be an <code>async</code>
Rust programmer, you also have to watch out for these extra unchecked
preconditions.</p>
<p>Otherwise, you get a hidden color bug that’s even harder to track down
because the blocking functions you’re calling don’t look blocking.
<code>tokio</code> calls <code>recv</code>, thinking it’s not blocking, but it is. You
expect <code>tokio</code> to be correct, but because of this broken invariant, it
isn’t. These sorts of issues can be very hard and time-consuming to debug.</p>
Why Rust should only have provided `expect` for turning errors into panics, and not also provided `unwrap`
https://www.thecodedmessage.com/posts/2022-07-14-programming-unwrap/
2022-07-14T00:00:00+00:00
<p><strong>UPDATE 2</strong>: I have made the title longer because people seem to be
insisting on misunderstanding me, giving examples where the only
reasonable thing to do is to escalate an <code>Err</code> into a panic. Indeed,
such situations exist. I am not advocating for panic-free code.
I am advocating that <code>expect</code> should be used for those functions,
and if a function is particularly prone to being called like that
(e.g. <code>Mutex::lock</code> or regex compilation), there should be a panicking
version.</p>
<p><strong>UPDATE</strong>: <a href="https://blog.burntsushi.net/unwrap/">This post</a>
by Andrew Gallant, author of the excellent
<a href="https://github.com/BurntSushi/ripgrep"><code>ripgrep</code></a>, is a good
overall discussion of the topic I am trying to address here. I
basically entirely agree with it and recommend it as very
educational; specifically, I disagree only in that I think
that linting for <code>unwrap</code> is a good thing, for the reasons
he acknowledges but ultimately does not find compelling in <a href="https://blog.burntsushi.net/unwrap/#should-we-lint-against-uses-of-unwrap">that
section</a>. In his own terms, I just think that the juice <em>is</em> worth the squeeze.</p>
<p>I see the <code>unwrap</code> function called a lot, especially in example code,
quick-and-dirty prototype code, and code written by beginner Rustaceans.
Most of the time I see it, <code>?</code> would be better and could be used instead
with minimal hassle, and the remainder of the time, I would have used
<code>expect</code> instead. In fact, I personally never use <code>unwrap</code>, and I even
wish it hadn’t been included in the standard library.</p>
<p>The simple reason is that something like <code>expect</code> is necessary and
sometimes the best tool for the job, but it’s necessary rarely and should
be used in the strictest moderation, just like panicking should be used
in strictest moderation, and only where it is appropriate (e.g. array
indexing, for reasons I elaborate on later). <code>unwrap</code> is too easy and
indiscriminate, and using it at all encourages immoderate use.</p>
<p>This has turned out, much to my surprise, to be a somewhat controversial
stance, and so I’d like to take some time to explain why I feel that way.</p>
<p>I’ll begin by reviewing what <code>Result</code> is and what options we have
for dealing with its recoverable errors.</p>
<h1 id="results-and-what-to-do-with-them"><code>Result</code>s and what to do with them</h1>
<p>Rust is widely and rightly praised for its use of
<a href="https://doc.rust-lang.org/std/result/enum.Result.html"><code>Result</code></a> for
recoverable error handling. Instead of using exceptions like C++, which
propagate invisibly and surprisingly, or using sentinel values like <code>NULL</code>
and <code>-1</code>, Rust has <a href="https://en.wikipedia.org/wiki/Tagged_union">sum types</a>
and thus, a function can return a value that is either an error (of a
specified, potentially narrow type) or the value we want:</p>
<pre tabindex="0"><code>#[derive(Copy, PartialEq, PartialOrd, Eq, Ord, Debug, Hash)]
#[must_use = "this `Result` may be an `Err` variant, which should be handled"]
pub enum Result<T, E> {
    Ok(T),
    Err(E),
}
</code></pre><p>If we have a function call <code>foo()</code> that can fail and therefore returns
a <code>Result</code>, we have a few different tactics we can use to handle it:</p>
<ul>
<li><strong>Ignore</strong>: We can ignore the return value, and therefore also ignore
whether it errors. This is almost never what we actually want, and so the
<code>#[must_use]</code> annotation on <code>Result</code> causes a warning to be issued:</li>
</ul>
<pre tabindex="0"><code>foo(); // WARNING
</code></pre><ul>
<li><strong>Manual</strong>: We can manually <code>match</code> on the return value and do different
things:</li>
</ul>
<pre tabindex="0"><code>match foo() {
    Ok(value) => do_something(value),
    Err(err) => handle_error(err),
}
</code></pre><ul>
<li><strong>Propagate</strong>: We can propagate the error ergonomically using the <code>?</code> operator.
This makes <code>Result</code> work like exceptions in many of the good ways,
while cluing in the reader to the additional control flow, which
is good:</li>
</ul>
<pre tabindex="0"><code>foo()?;
</code></pre><ul>
<li><strong>Panic with custom message</strong>: We
can transform the error into an “unrecoverable”
<a href="https://doc.rust-lang.org/book/ch09-01-unrecoverable-errors-with-panic.html"><code>panic</code></a>
using <code>expect</code>, which takes a string argument which is used to customize
the error message:</li>
</ul>
<pre tabindex="0"><code>foo().expect("foo error");
</code></pre><ul>
<li><strong>Panic without custom message</strong>: We can transform the error into a
panic with <code>unwrap</code>, which does not take a string argument and therefore
leads to more generic error messages:</li>
</ul>
<pre tabindex="0"><code>foo().unwrap();
</code></pre><p>Most of the time, in production code, we will want to go with the
<strong>propagate</strong> option, especially in library code where the application
will likely have a better notion of what to do with the error. This
option makes the flow control clearer, tends to result
in better error messages when the error messages are ultimately outputted,
and gives the calling functions more options.</p>
<p>The <strong>manual</strong> option is useful even in a library for when an error
is in fact recoverable at that particular point (e.g. by retrying). In
an application, we’re often at a point where it makes sense to report
the error (via a log message or console output or user-facing error
message).</p>
<p>Sometimes, errors are in fact no big deal, and should be suppressed
completely, but this is better expressed through the <strong>manual</strong> option
at the application layer, with a comment explaining why the right thing
is to do nothing.</p>
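<p>As a minimal sketch of that last point, here is what a deliberately
suppressed error might look like. The <code>clear_cache</code> helper and its
cache-file scenario are hypothetical, invented for illustration; only
<code>std::fs::remove_file</code> and <code>io::ErrorKind::NotFound</code> are real
standard library names:</p>

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: delete a cache file, treating "already gone" as success.
pub fn clear_cache(path: &Path) -> io::Result<()> {
    match fs::remove_file(path) {
        Ok(()) => Ok(()),
        // Deliberately suppressed: if the file is not there, the cache is
        // already clear, which is exactly the state we wanted.
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(()),
        // Anything else (permissions, I/O failure) is still the caller's problem.
        Err(err) => Err(err),
    }
}
```

<p>The comment next to the suppressed arm is the important part: it records
why doing nothing is correct, and every other error kind still propagates.</p>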
<p>But sometimes (and <em>only</em> sometimes), <strong>panic</strong> is appropriate, and
for that, there are two options, <code>unwrap</code> and <code>expect</code>. I always prefer
<code>expect</code> for this, and pretend that <code>unwrap</code> doesn’t exist, because it
makes panicking too easy. To explain, I’d like to discuss in what
situations I think panics are appropriate.</p>
<h1 id="when-to-panic">When to panic?</h1>
<p>Escalating an <code>Err</code> result to a panic should be done
in similar situations to when <code>panic</code> is appropriate
in general, which the Rust book offers some <a href="https://doc.rust-lang.org/book/ch09-03-to-panic-or-not-to-panic.html">guidance
on</a>.</p>
<p>The most clear-cut case is when a code path is a <em>logic error</em>, when
the error is only possible if the programmer has made a mistake and an
invariant has been broken.</p>
<p>A typical example is array indexing. We often find ourselves with
an array index that we (think we) know is valid and we want to use it to
index an array, because we got it from looping or otherwise operating on
the array bounds. We’re not so confident that we want to use <code>unsafe</code>
and do an unchecked array access – that could result in a security
vulnerability if we’re wrong – but it would also be nonsensical to try
to recover from such an invalid access.</p>
<p>For array indexing, this is actually the most
common scenario, and so the index operator in Rust
actually panics for us if we specify an out-of-bounds
index. An <code>unsafe</code> unchecked array indexing method <a href="https://doc.rust-lang.org/std/primitive.slice.html#method.get_unchecked">is
available</a>,
as is one that <a href="https://doc.rust-lang.org/std/primitive.slice.html#method.get">never panics and instead returns an
<code>Option</code></a>,
but most of the time we want the panicking checked operation
and so that is the version that gets the <a href="https://doc.rust-lang.org/std/primitive.slice.html#impl-Index%3CI%3E">syntactic
sugar</a>:
<code>arr[index]</code> will neither corrupt memory nor return a recoverable
error on an invalid index, but instead panic.</p>
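<p>To make the three options concrete, here is a small sketch that puts them
side by side. The function name <code>indexing_demo</code> is made up for this
example; <code>get</code>, <code>get_unchecked</code>, and the index operator are
the real slice APIs linked above:</p>

```rust
/// Sketch: three ways to read element `i` of a slice.
pub fn indexing_demo(arr: &[i32], i: usize) -> (Option<i32>, i32) {
    // 1. Checked, returns an Option: the caller decides what a bad index means.
    let maybe = arr.get(i).copied();

    // 2. Checked, panics on a bad index: this is the `arr[i]` sugar.
    let value = arr[i];

    // 3. Unchecked: no bounds check, so a bad index is undefined behavior.
    // SAFETY: `arr[i]` above would already have panicked if `i` were out of bounds.
    let unchecked = unsafe { *arr.get_unchecked(i) };
    debug_assert_eq!(value, unchecked);

    (maybe, value)
}
```

<p>With a valid index all three agree. With an invalid one, <code>get</code>
yields <code>None</code>, the index operator panics, and the
<code>unsafe</code> version must simply never be reached.</p>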
<p>This is definitely the best default for array indexing. But sometimes,
logic errors result in recoverable errors, in <code>Err</code> (or its equivalent
in the <code>Option</code> world, <code>None</code>). For example, maybe you have an array-like
data structure in which only a <code>get</code> method is available, which
returns an <code>Option</code>. If you were confident in your indexing, you
would want to panic on <code>None</code>, and you can call <code>expect</code> or <code>unwrap</code>
to make that happen.</p>
<p>It is certainly more ergonomic to write <code>expect</code> than to do the
<code>match</code> manually, and less likely to lead to mistakes:</p>
<pre tabindex="0"><code>// With `expect`:
let val = arr.get(i).expect("i should be valid index");

// The same thing with a manual `match`:
let val = match arr.get(i) {
    Some(val) => val,
    None => panic!("i should be valid index"),
};
</code></pre><p>Besides logic errors, panics are also relevant in test cases, where
they are used to indicate test failure.</p>
<h1 id="no-need-to-panic-propagation-made-easy">No need to panic: Propagation Made Easy</h1>
<p>However, <code>expect</code> and <code>unwrap</code> – especially <code>unwrap</code> – are also
amenable to overuse and misuse.</p>
<p>Perhaps you’re doing prototyping and just need something that works most
of the time, or you’re writing a simple app with limited error-handling
needs. Some people use <code>unwrap</code> and <code>expect</code> for this situation, but
I don’t. I use <code>?</code> even in that situation, because I never know when
prototype code might have to escalate to production code – either
so suddenly there’s no time for me to intervene and improve the
error handling or so gradually there’s no occasion for it and it
never gets prioritized. Fixing crappy usage of <code>?</code> in such a situation
is way easier and more likely to happen than fixing a bunch of
<code>expect</code>s or <code>unwrap</code>s.</p>
<p>How can I prototype with <code>?</code>? Doesn’t it require a lot of extra work,
compared with <code>unwrap</code>? Honestly, not really. Writing <code>Result<Foo></code>
is not substantially harder than writing <code>Foo</code> for functions which
can error. As for converting between error types, libraries like
<a href="https://crates.io/crates/eyre"><code>eyre</code></a> and
<a href="https://docs.rs/anyhow/latest/anyhow/"><code>anyhow</code></a> exist so that
all errors can be included.</p>
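<p>Here is a rough sketch of what prototyping with <code>?</code> can look
like. To keep it dependency-free, it uses a boxed <code>dyn Error</code> from the
standard library as a stand-in for what <code>eyre</code> or <code>anyhow</code>
would provide; the <code>ProtoResult</code> alias and the <code>sum_file</code>
function are hypothetical names chosen for illustration:</p>

```rust
use std::error::Error;
use std::fs;
use std::path::Path;

/// Catch-all result type for prototype code (a std-only stand-in for
/// the result aliases that `anyhow` and `eyre` offer).
pub type ProtoResult<T> = Result<T, Box<dyn Error>>;

/// Hypothetical prototype function: sum a file holding one integer per line.
pub fn sum_file(path: &Path) -> ProtoResult<i64> {
    // An `io::Error` is converted into the boxed error by `?`.
    let text = fs::read_to_string(path)?;
    let mut total = 0;
    for line in text.lines() {
        // A `ParseIntError` is converted the same way, with no extra code.
        total += line.trim().parse::<i64>()?;
    }
    Ok(total)
}
```

<p>Two unrelated error types flow through the same <code>?</code> with no
conversion code, and the signature already has the shape that production
code will want later.</p>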
<p>Example code similarly can be written with <code>?</code>. This is important because
Rust is rapidly growing and has a lot of new programmers using it. They
see that a function returns a thing, and want to get to the thing and
don’t know how to, and they see <code>unwrap</code> in the example code and they
cargo cult it. Even if they have learned a thing or two about Rust,
it does have the perfect type signature for their problem, and so they
jump on it, and end up using it in prototype code and then trying to use
it in production code. Perhaps they know about <code>?</code>, but it has a higher
barrier to entry, and so they’ll procrastinate learning about it.</p>
<p>In these situations, <code>unwrap</code> provides an easy, ergonomic
way of calling a function that might error, and so it’s very tempting,
like <a href="https://en.wikipedia.org/wiki/Desire_path">walking through the grass when there’s a paved path
available</a>. However,
<code>?</code> is generally preferable to <code>unwrap</code> or <code>expect</code>, and so the
relative ease of each is misaligned with the order of preference.</p>
<p>And unfortunately, once code has been written using <code>unwrap</code>
or <code>expect</code> heavily, it’s hard to adapt it to use <code>?</code> and propagation,
especially if those interfaces have come to be relied upon.</p>
<h1 id="why-i-prefer-expect-to-unwrap">Why I prefer <code>expect</code> to <code>unwrap</code></h1>
<p>There are definitely legitimate use cases for turning an error
into a panic, but they are relatively rare, especially if the
code is well-factored. Turning an error into a panic is also
extremely tempting to abuse. The second situation is more
common than the first, so in many codebases, the bad <code>unwrap</code>s
and <code>expect</code>s, the sloppy “OK for now” ones or the “it’s just
an example” ones outnumber the legitimate use cases.</p>
<p>Raising the barrier to entry seems like a good solution, and <code>expect</code>
seems like the perfect balance. The error string can also serve
as documentation of why this decision was made, like comments for
unsafety. The fact that <code>expect</code> is a little less ergonomic is a
feature, as it discourages casual use. <code>expect</code> has enough convenience
to encapsulate the concept of escalation from a “recoverable” error
to an “unrecoverable” error, but not so much that it competes with <code>?</code>
in ergonomics.</p>
<p><code>expect</code>’s error message can serve as a comment as to why the
panic is justified. Comments are a good thing, and for as
questionable an operation as escalating an <code>Err</code> to a panic,
it’s useful to explain why we think it will never happen even if
we think it’s obvious. Like the comments recommended for <code>unsafe</code>
blocks, I think that <code>expect</code> is a situation that deserves some
indication to the reader as to why the author thinks this is OK.</p>
<p>Why have this in the error message rather than just a comment?
<code>expect</code>’s error message is also helpful in debugging. <code>unwrap</code> can give
good error messages, printing the error value and providing a backtrace,
but in other configurations and deployments you might not see a backtrace
and the error value might not be useful. Some <code>unwrap</code> calls might
provide good enough error messages sometimes, but it doesn’t work 100%
of the time, so it can’t be relied upon – especially when <code>expect</code> is
readily available. Especially in the case of a logic error,
when the condition was thought impossible, debugging will already
be hard, and the person doing the debugging needs all the help they
can get.</p>
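<p>To see the difference in the messages themselves, here is a small sketch
that captures both panics and returns their text. The
<code>panic_message</code> and <code>messages</code> helpers, and the
hard-coded port scenario, are hypothetical; <code>std::panic::catch_unwind</code>
is used here only to inspect the messages, not as a recommended way to
handle errors:</p>

```rust
use std::panic;

/// Runs `f` and, if it panics, returns the panic message.
pub fn panic_message<F>(f: F) -> Option<String>
where
    F: FnOnce() + panic::UnwindSafe,
{
    let payload = panic::catch_unwind(f).err()?;
    // `unwrap` and `expect` build their message with formatting, so the
    // payload is a `String`; a plain `panic!("literal")` carries a `&str`.
    if let Some(s) = payload.downcast_ref::<String>() {
        Some(s.clone())
    } else {
        payload.downcast_ref::<&str>().map(|s| s.to_string())
    }
}

/// Hypothetical scenario: a port number that is supposed to be a valid literal.
pub fn messages() -> (Option<String>, Option<String>) {
    // called `Result::unwrap()` on an `Err` value: ParseIntError { kind: InvalidDigit }
    let from_unwrap = panic_message(|| {
        "eighty".parse::<u16>().unwrap();
    });
    // DEFAULT_PORT literal should be a valid u16: ParseIntError { kind: InvalidDigit }
    let from_expect = panic_message(|| {
        "eighty"
            .parse::<u16>()
            .expect("DEFAULT_PORT literal should be a valid u16");
    });
    (from_unwrap, from_expect)
}
```

<p>Both messages include the error value, but only the <code>expect</code>
one says which assumption was violated. Without a backtrace, that sentence
may be the only clue as to where in the program the panic came from.</p>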
<h1 id="objections">Objections</h1>
<p>When I’ve expressed my opinions about <code>unwrap</code> before, one objection
stands out in my mind as particularly interesting and particularly
valid. I say above that legitimate use cases for turning an <code>Err</code> into
a panic are rare, which is generally true, but sometimes
can seem false. There are certain APIs where it comes up a lot,
APIs where <code>Err</code>s frequently are actually logic errors.</p>
<p>For example, regular expressions. The
<a href="https://crates.io/crates/regex"><code>regex</code></a> crate uses a method called
<a href="https://docs.rs/regex/latest/regex/struct.Regex.html#method.new"><code>new</code></a>
that is used to prepare regular expressions. It is practically always
called on a constant string, making any failure a logic error, which
should result in a panic, as discussed above. However, this same <code>new</code>
method returns a <code>Result</code>, necessitating an <code>unwrap</code> or an <code>expect</code>
to make the logic error into a panic. Am I seriously suggesting that
the poor user write <code>.expect("bad regular expression")</code> instead of
<code>.unwrap()</code> every time?</p>
<p>Well, that puts regex compilation in the same category as array
indexing in my mind, and means that the <em>default</em> regex compilation function
should panic on the user’s behalf (of course, the <code>Result</code> version
should still be possible, just as <code>get</code> is a possible function for
slices).</p>
<p>Similarly, when I’ve expressed my opinions about <code>unwrap</code>, some have
assumed I’m opposed to panics altogether, and asked me if I used array
indexing, implying that if I accept the possibility of panics in array
indexing, I should accept the possibility of panics in <code>unwrap</code> as well.</p>
<p>For both of these objections, I want to clarify something: I’m not
opposed to panicking in logic error situations. But that does not imply
that <code>unwrap</code> is a good idea. Most <code>Err</code>s are not logic errors, and so
converting one to a panic should be a little inconvenient, and should
require the user to think enough to write an error message.</p>
<p>For those situations where an error is actually likely to be a logic
error, such as array indexing or regex compilation, returning <code>Result</code>
need not be the function’s default behavior. Perhaps the author of
<code>regex</code> can make <code>new</code> panic on compilation error, and another
function can be written for when the regex in question was user
inputted, or where a regex compilation error would not be a logic error.</p>
<p>In general, when you find yourself using <code>expect</code> or <code>unwrap</code> over and
over again in the same way, and you’re sure it’s legitimate each time,
do what you do with all smelly-seeming code if you know it’s actually
the right thing in spite of the smell: Wrap it in an abstraction. Put
it in a function that calls <code>expect</code> to panic on error.</p>
<p>This is not cheating. This new, panicking function would instead serve
as documentation of the fact that in this context, an <code>Err</code> is in
fact likely to be a logic error, a tangible paper trail that someone
made a conscious call that, as a policy, panicking is appropriate in
this instance. The decision to panic instead of returning an <code>Err</code> in
this situation is made in one place instead of many, where it can be
explained in a detailed comment if desired, and where it certainly won’t
be too much of a burden to use <code>expect</code> instead of <code>unwrap</code>. Even the
fact of the function existing and having a panic-based interface is a
signal from the library author that they have thought about this issue,
and deemed the situation to be more analogous to array indexing than,
say, a file-not-found.</p>
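<p>As a sketch of what such a pair of functions might look like, here is the
same idea applied to IP address parsing from the standard library rather
than to <code>regex</code>. The names <code>ip_literal</code> and
<code>ip_from_input</code> are hypothetical, and this illustrates the
pattern only; it is not a proposal for any particular library’s API:</p>

```rust
use std::net::{AddrParseError, IpAddr};

/// For addresses written as literals in the source code.
///
/// A parse failure here means the programmer typed a bad literal, which is
/// a logic error, so the policy decision to panic is made once, right here.
pub fn ip_literal(literal: &'static str) -> IpAddr {
    literal
        .parse()
        .expect("IP address literal in source code should be well-formed")
}

/// For addresses that come from users or configuration files.
///
/// Failure here is an ordinary, recoverable error, so it stays a `Result`.
pub fn ip_from_input(input: &str) -> Result<IpAddr, AddrParseError> {
    input.trim().parse()
}
```

<p>Call sites then read <code>ip_literal("127.0.0.1")</code> with no
<code>unwrap</code> or <code>expect</code> in sight, while code handling
user input is steered toward the <code>Result</code>-returning version.</p>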
<h1 id="tendencies-and-statistics">Tendencies and Statistics</h1>
<p>In any case, array indexing and regex compilation are the exceptions,
not the rule.
Almost all bounds check failures are logic errors.
Almost all regex compilation errors are logic errors. Making these
functions panic would indeed do little damage, as panicking is almost
always the right move.</p>
<p>But – and this is a big “but” – most functions, when they return
<code>Err</code>, genuinely are signalling recoverable errors, and <code>unwrap</code>
doesn’t discriminate – it works equally well on all of them, in
the inappropriate situations as well as the appropriate situations.
With array indexing and regex compiling, the nature of the function being
called gives some indication of why it’s a logic error; with <code>unwrap</code>,
there is no indication.</p>
<p>Generally, this argument is in terms of statistics and human nature, not
in terms of absolutes. Turning an <code>Err</code> into a panic should be rare, not
necessarily in terms of how often it happens, but in how often it shows up
in code. If it is common, either the programmer is using bad practices,
and should be using better practices, or the API has a design flaw, and
that needs to be fixed. In either case, <code>expect</code> is better than <code>unwrap</code>.</p>
<p>Ideally, we don’t get used to seeing <code>expect</code> and <code>unwrap</code> being used
all the time. We don’t get used to casually panicking on <code>Err</code>, but instead
treat panicking like an operation that should be considered carefully,
whether once for all instances of a specific call (as in array indexing
or regex construction), or on a case-by-case basis (for other uses of
<code>expect</code>).</p>
<p>Humans are creatures of habit and lazy by nature. <code>unwrap</code> is a powerful
tool, a way to get around the type system, and as such, we might find
ourselves addicted to it. We should treat even <code>expect</code> as mildly
suspicious, something only to be used with consideration, something
to be wrapped behind an abstraction (as in the regex case). <code>unwrap</code>
is even more dangerous, because it is easier, and given that legitimate
usage of <code>expect</code> should be rare (again in terms of lines of code, not
frequency of invocation) and hidden behind an abstraction when it is
common in frequency of invocation, I see no need for <code>unwrap</code> to exist.</p>
<h1 id="context">Context</h1>
<p>I am aware that removing <code>unwrap</code> from Rust is not a viable option at
this point, which is why I said that I wish it had never been put in Rust to
begin with. I am aware that <code>unwrap</code> is used in the Rust compiler,
and that there is no consensus to avoid <code>unwrap</code> to the level that I
avoid it.</p>
<p>I will however note that the documentation
of <code>unwrap</code> comes with <a href="https://doc.rust-lang.org/std/result/enum.Result.html#method.unwrap">a warning not to use
it</a>.
The warning is framed in terms of the fact
that <code>unwrap</code> may panic, but the <a href="https://doc.rust-lang.org/std/result/enum.Result.html#method.expect">documentation of
<code>expect</code></a>,
where this is equally true, does not come with such a warning.</p>
<h1 id="conclusion">Conclusion</h1>
<p>Escalating an <code>Err</code> to a panic is sometimes appropriate. But it should
be a considered choice, either on a function-by-function basis (through
a wrapper function calling <code>expect</code> or a different choice of interface), or
on a case-by-case basis. In either case, <code>unwrap</code> makes it too easy.</p>
<p>Including an error message, and documenting why a panic is appropriate
(either through the error message or separately) should not be too much
to ask. If it is, that’s a code smell. The fact that <code>expect</code> is more
difficult is a feature.</p>
<p>In this article I have mentioned only briefly the other motivation for
using <code>expect</code> – better error messages for debugging. I thought the
code smell argument was more important. But debuggability can be very
important as well, so I’ll discuss it briefly here. I don’t think it’s
safe to assume backtraces will always be available. I don’t think it’s
safe to assume every use of <code>unwrap</code> will print a useful error message,
even if it sometimes can. Maybe an individual use of <code>unwrap</code> in one
context does not cause this problem, but once <code>unwrap</code> is established
as acceptable, it opens the door for it to be abused.</p>
<p>I personally do not use <code>unwrap</code>, nor do I sign off on code that does.
I even prefer <code>expect("foo")</code> to <code>unwrap</code>, because it signals that it’s
off-the-cuff example code and shows that the person writing it knows
that more consideration would be needed to put it into production.
Please consider joining me in this approach.</p>
<p>If you do not want to implement so strict a policy, and you think I’m too
extreme in this way, hopefully this article at least makes my argument
clearer, and explains why I do not call <code>unwrap</code> but still feel
comfortable indexing my arrays. Hopefully also this has given food
for thought about <code>Result</code>s, errors, and panics.</p>
<h1 id="edits">Edits</h1>
<p>This post has been edited to clarify certain things, including a
clarification in the opening to the post to make sure my overall position
is easily comprehensible.</p>
Fiction Review: Plain Truth
https://www.thecodedmessage.com/posts/review-plain-truth/
2022-07-06T00:00:00+00:00
<p>I enjoyed <a href="https://www.jodipicoult.com/plain-truth.html">Plain Truth</a>
by Jodi Picoult. I finished it a couple of months ago, when I was
feeling very <a href="https://www.thecodedmessage.com/posts/patience/">restless and impatient</a> about
everything going on in my life. At the time, I desperately needed
fun books to read, but I was simultaneously having a lot of trouble
finishing books.</p>
<p>This book pulled me the whole way through when other books were
failing to: It was in a setting, the Amish communities, that had always
interested me. It was competent enough dealing with that community to
not drive me away. It made nuanced and smart enough points to keep me
engaged, without being so subtle or so sophisticated as to be too heavy
or dry or otherwise difficult to get through. All in all, the perfect
balance for where I was just then.</p>
<p>This book juxtaposes two concepts that people wouldn’t normally associate
with each other: the pacifistic, quaint, and well-respected Amish
community; and the trend of young unwed mothers murdering
their newborns, which was commonly discussed in the news at the time
the book was written and which the author has discussed as inspiration.</p>
<p>The general theme of the book was that Amish people are just people.
They’re not a monolith, and their culture, while it values conformity,
doesn’t erase individual differences or interpersonal tension. The book
managed to avoid the twin temptations of glorifying and fetishizing Amish
culture on the one hand, and degrading it as cultish or criticizing it
on the other. The differences are impactful but also nuanced and
they’re morally complex.</p>
<p>There were a couple of minor details that got me, nerd as I am.
Some of the Pennsylvania Dutch was misspelled, especially the
name of the language which is <em>Deitsch</em> not <em>Dietsch</em> (pronounced with
an “eye”-vowel). This made me laugh but didn’t detract from my
enjoyment too much – though it did make me more aware that the Amish
culture as depicted in the book was to a certain extent a fictional
culture inspired by actual Amish culture rather than a documentation
of it.</p>
<p>Another minor quibble: There was a scene where a judge wanted an Amish
witness to swear an oath and they had to negotiate the accommodation of
“affirmation” on the spot. “Affirmation” is a well-established accommodation
for people who don’t swear oaths for religious reasons; it used to be
much more common, is in the US Constitution, and is talked about in
law school. The judge wouldn’t have had to have it explained to them
and the lawyer wouldn’t have had to come up with it on the spot.
I do concede, however, that how the book did it was more interesting.</p>
<p>Like all Jodi Picoult books, it came with a twist at the end. I shan’t
spoil it, but I will say that it was interesting and emotionally challenging,
and that it resonated well with and contributed to the previously-established themes.</p>
<p>All in all, a read that I enjoyed and needed at the time!</p>
Another Confusing Haskell Error Message
https://www.thecodedmessage.com/posts/haskell-error-message-2/
2022-06-17T00:00:00+00:00
<h1 id="the-error-message">The Error Message</h1>
<p>I’ve written before about just how <a href="https://www.thecodedmessage.com/posts/haskell-gripe/">befuddling</a>
Haskell error messages can be, especially for beginners. And now, even
though I have some professional Haskell development under my belt, I
ran across a Haskell error message that confused me for a bit, where I
had to get help. It’s clear to me now when I look at the error message
what it’s trying to say, but I legitimately was stumped by it, and so,
even though it’s embarrassing for me now, I feel the need to write about
how this error message could have been easier to understand:</p>
<pre tabindex="0"><code>frontend/src/Frontend/WordTiles.hs:87:25-45: error:
    • Could not deduce (HasDomEvent t () 'ClickTag)
        arising from a use of ‘domEvent’
      from the context: (DomBuilder t m, PostBuild t m, MonadHold t m,
                         MonadFix m)
        bound by the type signature for:
                   app :: forall t (m :: * -> *).
                          (DomBuilder t m, PostBuild t m, MonadHold t m, MonadFix m) =>
                          m ()
        at frontend/src/Frontend/WordTiles.hs:(70,1)-(76,9)
    • In the expression: domEvent Click submit
      In an equation for ‘click’: click = domEvent Click submit
      In the second argument of ‘($)’, namely
        ‘do inputText <- fmap value $ inputElement $ def
            submit <- el "button" $ text "Submit"
            let click = domEvent Click submit
            pure $ current inputText <@ click’
   |
87 |             let click = domEvent Click submit
   |                         ^^^^^^^^^^^^^^^^^^^^^
</code></pre><p>The code in question was in the Reflex FRP’s “widget” monad, defined
as usual by a number of monad typeclasses:</p>
<pre tabindex="0"><code>app
  :: ( DomBuilder t m
     , PostBuild t m
     , MonadHold t m
     , MonadFix m
     )
  => m ()
app = do
    let
        start = Game [] wordSet "PIETY"
        moveAll word (gm, _) = move word gm
    rec
        game <- foldDyn moveAll (start, []) newWord
        gameDisplay game
        newWord <- fmap (fmap T.unpack) $ el "div" $ do
            inputText <- fmap value $ inputElement $ def
            submit <- el "button" $ text "Submit"
            let click = domEvent Click submit
            pure $ current inputText <@ click
    pure ()
</code></pre><h1 id="my-confusion">My Confusion</h1>
<p>Some of you might already see the problem, especially those who know
Reflex. But I didn’t see it. My brain saw <code>(HasDomEvent t () 'ClickTag)</code>
and completely misread it. I assumed it meant something like “with <code>t</code>
as the tag, we can get the DOM event as <code>'ClickTag</code>.” I assumed that the
<code>()</code> was irrelevant to understanding the type, indicating some sort of
optional type was not necessary to be provided.</p>
<p>I then tried to address this by adding <code>(HasDomEvent t () 'ClickTag)</code> to
the context of <code>app</code>:</p>
<pre tabindex="0"><code>app
  :: ( DomBuilder t m
     , PostBuild t m
     , MonadHold t m
     , MonadFix m
     , HasDomEvent t () 'ClickTag
     )
  => m ()
</code></pre><p>It wasn’t the issue.</p>
<p>I had hoped this wasn’t the issue, but I thought it might be, and I
had no idea what the issue actually was. Maybe we just needed to list
all the DOM events <code>t</code> can handle, I had thought. I should’ve noticed
it was <code>t</code> and not <code>m</code>, and I would expect <code>m</code> to be involved in such a
context. I should have read the thing out loud in my head, and realized
that it wasn’t <code>t</code> that didn’t have the DOM event of <code>'ClickTag</code>, but
<code>()</code>. But I didn’t. My eyes kind of glazed over at the complicated
typeclass expression. I just didn’t think.</p>
<h1 id="the-solution">The Solution</h1>
<p>The problem, a friend had to tell me, was nothing to do with <code>t</code>
and everything to do with <code>()</code>. <code>submit</code> was not, as I had thought,
a representation of the DOM element I had created with a button. To do
that, you need to call <code>el'</code>:</p>
<pre tabindex="0"><code>(submit, _) <- el' "button" $ text "Submit"
let click = domEvent Click submit
pure $ current inputText <@ click
</code></pre><p><code>submit</code>, gotten from <code>el</code>, was actually of type <code>()</code>. And, of course,
you can’t get any DOM event out of <code>()</code>, let alone a <code>Click</code>.</p>
<h1 id="better-error-messages">Better Error Messages</h1>
<p>But while I left this situation with take-aways for myself, to better
read Haskell error messages in the future, I was also frustrated
at the Haskell compiler, especially in comparison to the Rust compiler
I have gotten used to recently through my job.</p>
<h2 id="list-involved-types">List Involved Types</h2>
<p>How on earth did it not indicate at all that <code>(HasDomEvent t () 'ClickTag)</code>
was a problem with the type of <code>submit</code>? Sure, the constraint “arose”
from the type of <code>domEvent</code>, but <code>submit</code> is clearly an important value
involved in making the type not work.</p>
<p>This is easier to implement than a Haskell person might think. I
understand that it’s unclear which type “caused” the problem from a
human perspective. So why not list them all? Just a laundry list of
inferred types would’ve been helpful: I would have seen that <code>submit</code>
was of type <code>()</code>, and that would’ve helped me through the situation. Is
that too much to ask? Something like this:</p>
<pre tabindex="0"><code>Related types:
domEvent :: HasDomEvent t a => EventName en -> a -> Event t (EventResultType en)
Click :: EventName ClickTag
submit :: ()
</code></pre><p>Any two of those types would have given me the hint I needed. Really,
either <code>domEvent</code> or <code>submit</code> would have enabled me to figure it out.</p>
<h2 id="warn-about--bindings">Warn About <code>()</code> Bindings</h2>
<p>Similarly, how on earth was I allowed to write this line without a warning:</p>
<pre tabindex="0"><code>submit <- el "button" $ text "Submit"
</code></pre><p><code>submit</code> is invariably <code>()</code>. Shouldn’t binding a <code>()</code> value be at least
a warning? In what possible situation would you want to do that? I know
that situations exist, especially situations where a type is sometimes
<code>()</code>, but this type is invariably <code>()</code>, and I have <code>-Wall</code> turned on in
this project. I want warnings for things that there are occasionally
legitimate use cases for. Binding a name to <code>()</code>, especially when it’s
from a function call and not literally <code>let unit = ()</code>, has got to
be a mistake 99 times out of 100.</p>
<p>This is apparently not a warning in Rust either, and I am confused by that,
because Rust is normally better about its warnings:</p>
<pre tabindex="0"><code>fn foo() {
}
fn main() {
    let x = foo(); // Compiles without warning!
    drop(x);
}
</code></pre><p>I think it would be a reasonable and useful warning in both programming
languages. The opposite situation already provokes a warning in Haskell,
where you have an action in a <code>do</code>-block that returns a value and you
implicitly ignore it:</p>
<pre tabindex="0"><code>[jim@palatinate:~/Writing/TheCodedMessage/content/posts]$ ghci -Wall
GHCi, version 8.8.4: https://www.haskell.org/ghc/ :? for help
Prelude> do { pure 'x'; pure () }
<interactive>:1:6: warning: [-Wunused-do-bind]
A do-notation statement discarded a result of type ‘Char’
Suppress this warning by saying ‘_ <- pure 'x'’
Prelude>
</code></pre><p>It only makes sense that the converse mistake, which is even more likely
to be a mistake, also have a warning.</p>
<h1 id="conclusion">Conclusion</h1>
<p>Error messages are an extremely important part of a programming
language, both for adoption and for programmer efficiency. Part of
the point of working in a strongly-typed language with a sophisticated
type system, like Rust or Haskell, is supposed to be that we discover
most of our problems through compiler error messages, rather than
through runtime bugs. So most of our troubleshooting will happen at
compile time, grappling with these error messages. This makes error
messages in Haskell more important than in the average programming
language, and makes the standard for good error messages even higher.
We can do better than the status quo, and we should.</p>
Command Line Interface UXes Need Love, Toohttps://www.thecodedmessage.com/posts/2022-06-16-programming---cli/2022-06-16T00:00:00+00:00It took me a long time to admit to myself that the venerable Unix command line interface is stuck in the past and in need of a refresh, but it was a formative moment in my development as a programmer when I finally did. Coming from that perspective, I am very glad that there is a new wave of enthusiasm (coming especially from the Rust community) to build new tools that are fixing some of the problems with this very old and established user-interface.<p>It took me a long time to admit to myself that the venerable Unix
command line interface is stuck in the past and in need of a refresh,
but it was a formative moment in my development as a programmer when I
finally did. Coming from that perspective, I am very glad that there is a new
wave of enthusiasm (coming especially from the Rust community) to build
new tools that are fixing some of the problems with this very old and
established user-interface.</p>
<h1 id="the-role-of-the-unix-cli-interface">The Role of the Unix CLI Interface</h1>
<p>To describe the Unix command line interface, “venerable” is definitely
the right word: many programmers (including myself at some points of
my life) have an awe of Unix and its role in computing history that has
sometimes bordered on veneration.</p>
<p>Since the <a href="https://en.wikipedia.org/wiki/Unix">Unix</a>
operating system began development at Bell Labs in 1969, it has gone
viral. That’s probably an understatement: Most modern operating
systems descend from this original Unix, either directly through
gradual code change (macOS and iOS are descended from it through
<a href="https://en.wikipedia.org/wiki/Berkeley_Software_Distribution">BSD</a>),
or through <a href="https://en.wikipedia.org/wiki/Linux">Linux</a> (the
kernel behind most servers and behind Android and ChromeOS) and
its accompanying usermode software (much of which was part of the
<a href="https://www.gnu.org/">GNU</a> project), which were designed to work like
Unix due to its familiarity for users and programmers.</p>
<p>Unix was and is billed not just as an operating system, but a
<a href="https://en.wikipedia.org/wiki/Unix_philosophy">philosophy</a>. Among other
things, its command line interface has been held up time and time again
as an example of good design practices and an ideal realization of this
philosophy, with its developer- and administrator-friendly orientation
towards plain text files and with its modularity, especially as embodied
in the concept of pipelining.</p>
<p>And as a result, when people say they know “the command line,”
it’s almost certainly the Unix command-line interface that
they’re talking about. And what’s more, many of us were taught
it from texts that gushed about how great it is. But even the
Unix command line interface, though part of a <a href="https://pubs.opengroup.org/onlinepubs/9699919799.2018edition/">well-established
standard</a>,
the topic of many books, and used by and intimately familiar to millions
of programmers and admins across generations, is, in the end, just
another computer interface for users and developers. And it has its flaws.</p>
<h1 id="a-disappointing-ambiguity">A Disappointing Ambiguity</h1>
<p>As I alluded to before, when I was a much younger programmer, I had an
awe-struck veneration for Unix. One of my colleagues at an early job in my
career referred to me as our company’s “Unix philosopher.” While I wasn’t
sure whether he meant it as a compliment, at the time, I took it as one.</p>
<p>The first flaw that really got my attention in the Unix command line had
to do with the <code>mv</code> command. I’m going to take some time explaining this
flaw in detail, as it’s somewhat subtle, and as discovering it was
a formative moment for me in my development as a programmer.</p>
<p><code>mv</code>, as many of you know, is short for “move.” And while its job indeed
includes moving files from one place to another, due to idiosyncrasies
of the Unix file system (if they can be called idiosyncrasies when most
file systems followed Unix’s lead on this), moving files and renaming
files are closely related operations under the hood, causing the <code>mv</code>
command to be both the “move” command and the “rename” command:</p>
<pre tabindex="0"><code># Assume a file called 'draft-file'
# Assume a directory called 'final-docs'
# Rename 'draft-file' to 'final-file' and put it in 'final-docs'
mv draft-file final-file # rename 'draft-file' to 'final-file'
mv final-file final-docs # move 'final-file' into 'final-docs' directory
# Alternatively, one step:
mv draft-file final-docs/final-file
</code></pre><p>As you can see, there is no distinction between these operations.
There is no option that you must enable to get the “moving” feature as
opposed to the “renaming” feature. And this can result in surprises, which
are <a href="https://en.wikipedia.org/wiki/Principle_of_least_astonishment">bad</a>
in software development.</p>
<p>Consider this command again:</p>
<pre tabindex="0"><code>mv draft-file final-file
</code></pre><p>What does it do? It changes the name of the file from <code>draft-file</code>
to <code>final-file</code>, keeping it in the same directory, right? Well,
probably, and that’s almost certainly what the user intended, but
what if someone, accidentally or intentionally, had created a
directory called <code>final-file</code>? That command would be interpreted
instead as moving <code>draft-file</code> into the <code>final-file</code> directory:</p>
<pre tabindex="0"><code>$ # Rename operation
$ touch draft-file
$ ls
draft-file
$ mv draft-file final-file
$ ls
final-file
$ ls final-file
final-file
$ rm final-file
$
$ # Move operation
$ mkdir final-file # Imagine someone else did this, or it was done by accident
$ touch draft-file
$ mv draft-file final-file
$ ls
final-file
$ ls final-file
draft-file
$ rm final-file
rm: cannot remove 'final-file': Is a directory
$ rm -rf final-file
</code></pre><p>Notice that if there is no color-coding enabled, a simple <code>ls</code> command
doesn’t even distinguish the two situations, so you can’t tell which
one happened without issuing a more specific command. This is because <code>ls</code> also has
a dual role: it can either show you the names of the files you specify,
if they are present, or it can show you the files in a directory you
specify. The <code>-d</code> option disambiguates that you want the names and not
the contents, but the default is still ambiguous.</p>
<p>In the case of the <code>mv</code> command, this potentially could even be a security
vulnerability in a shell script (which is admittedly not a very secure
platform). It is in any case an unnecessary complication.</p>
<p>The GNU version of <code>mv</code> has a <code>-T</code> option to indicate that the destination
is not to be interpreted as a directory to put things in, and a <code>-t</code> option
to show unambiguous intent for a target directory to be used. But these
are extensions; the POSIX standard manual page for <code>mv</code> doesn’t mention them.</p>
<p>And while these GNU extensions are helpful, especially in scripts that you
know will only be run with the GNU version of <code>mv</code> (that is, not on macOS),
I don’t think they go far enough. Most people don’t know about them,
and the possibility of surprise is still there.</p>
<h1 id="disillusioned">Disillusioned</h1>
<p>When I realized this, it created a huge hole in my previous (admittedly
unreasonable) esteem for the Unix command line interface. I realized
that the ideal solution was something impractical, almost unthinkable
to the younger version of me: <code>mv</code> should be deprecated in favor of two
commands, one to do renaming, and one to do targeted directory-dropping.</p>
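<p>As a rough sketch of what that split could look like, here are the two
operations as separate Rust functions built on <code>std::fs::rename</code>. The
names <code>rename_only</code> and <code>move_into_dir</code> are hypothetical, chosen only
for illustration; they are not existing tools. The directory checks are not
atomic with the rename itself, so this shows the interface idea and is not a
hardened implementation.</p>

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Rename `from` to `to`. Never treats `to` as a directory to drop into.
/// (Hypothetical helper: like `std::fs::rename`, it will still replace
/// an existing regular file at `to`.)
fn rename_only(from: &Path, to: &Path) -> io::Result<()> {
    if to.is_dir() {
        return Err(io::Error::new(
            io::ErrorKind::AlreadyExists,
            format!("{} is a directory; refusing to rename onto it", to.display()),
        ));
    }
    fs::rename(from, to)
}

/// Move `from` into the existing directory `dir`, keeping its file name.
/// Returns the new path. (Hypothetical helper.)
fn move_into_dir(from: &Path, dir: &Path) -> io::Result<PathBuf> {
    if !dir.is_dir() {
        return Err(io::Error::new(
            io::ErrorKind::NotFound,
            format!("{} is not a directory", dir.display()),
        ));
    }
    let name = from.file_name().ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "source has no file name")
    })?;
    let dest = dir.join(name);
    fs::rename(from, &dest)?;
    Ok(dest)
}
```

<p>With this split, a stray <code>final-file</code> directory turns the rename into
an error instead of a silent move, and the caller has to ask for the move
explicitly.</p>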
<p>This glitch in the <code>mv</code> command is just a gotcha to be aware of,
one of many minor flaws to dance around when shell scripting. But I
remember it strongly, because rather than being warned about it in a
book, I discovered it myself, and therefore it was the distinct moment I
realized that the command line interface would need to be improved at some
point. And once the metaphorical levee was broken, I started noticing
many inconveniences and problems in the traditional Unix CLI tools,
often more relevant to my day-to-day workflow than this minor gotcha.</p>
<p>I ultimately came to read more critical sources about Unix, such as the
famous <a href="https://web.mit.edu/~simsong/www/ugh.pdf">UNIX-HATERS Handbook</a>,
and similar sources that emphasized the problems. And I’m very
glad I went through this process, because before this, I was a naive
CLI user and shell-scripter, trusting the system way more than I should,
leaving myself open to serious problems.</p>
<p>Many Unix commands have gotchas and inconveniences, some I knew
about before this revelation and brushed aside, others that I
found out about later. <code>tar</code> has its <a href="https://xkcd.com/1168/">idiosyncratic traditional
syntax</a> that many, many scripts (and people)
still use, and inconsistency between platforms on whether you need
<code>-z</code> to unpack a compressed archive. The way the shell itself worked
also contained gotchas: What happens if you have files whose names
start with a <code>-</code>? (Answer: Their names get misinterpreted as options,
even if you didn’t type them but simply included them accidentally in
a wildcard expansion.)</p>
<p>Among the more practical issues that particularly affect me, I want
to emphasize two in particular: Why is <code>find</code>’s syntax so gnarly, so
that you have to type out <code>-name</code> and explicitly specify the current
directory? Why is it so hard to get <code>grep</code> to not display the pages-long
lines of minified JavaScript or similar files when I want to only
display the shorter lines from actual source files?</p>
<h1 id="the-future">The Future</h1>
<p>Luckily, improvement is on its way. For the last two cherry-picked
examples, there are new re-conceptions of <code>find</code> and <code>grep -r</code>
that fix them (with new names, of course, so they’re not beholden
to interface-compatibility), and I recommend them (dare I say such
blasphemy?) over the traditional equivalents:</p>
<ul>
<li><a href="https://github.com/sharkdp/fd"><code>fd-find</code></a></li>
<li><a href="https://github.com/BurntSushi/ripgrep"><code>ripgrep</code></a></li>
</ul>
<p>Don’t let their long names dissuade you; they are commonly installed
as <code>fd</code> and <code>rg</code>, respectively, and come with such modern features as:</p>
<ul>
<li>Normal command line syntax (<code>fd</code>)</li>
<li>Integration with <code>git</code>, the <em>de facto</em> standard version control system, by ignoring <code>.git</code> and <code>.gitignore</code>’d files by default (both)</li>
<li>Line length maximums (<code>rg</code>)</li>
<li>Modern leveraging of multithreading (both)</li>
<li>Better performance than their traditional counterparts</li>
</ul>
<p>These are the only new Rust-based commands I’ve tried, but they’ve
already vastly improved my workflow, so that I miss having them
(<code>fd</code> especially) when SSH’d into relatively minimalist embedded
devices. And I have reason to hope there are more gems out there
as part of this explosive movement to implement <a href="https://gist.github.com/sts10/daadbc2f403bdffad1b6d33aff016c0a">new Rust-based
commands</a>.</p>
<p>Whether people are doing this to improve their Rust chops, or because
they’ve felt a need for a long time and Rust is just their PL of choice,
it’s good to see some actual evolution in my day-to-day experience
as a Unix CLI user. It hasn’t fixed <code>mv</code> – yet – but it’s good to
see it evolving.</p>
<p>On the implementation side of things, I am also very
happy to see a Rust project to <a href="https://github.com/uutils/coreutils">reimplement the standard
<code>coreutils</code></a>. The C implementations
undoubtedly leave some performance and stability on the table, and a
new implementation is long overdue. A fresh implementation of these
utilities will hopefully also spark improvements to the interfaces.</p>
<h2 id="and-meanwhile-in-git-land">And Meanwhile, in <code>git</code>-land</h2>
<p>On a related positive note, I learned very recently (in 2022)
that <code>git</code> has (in 2019) fixed a problem similar to <code>mv</code>’s:
<code>git checkout</code>, ambiguous in a similar way, has been rendered
unnecessary by the less ambiguous <a href="https://tanzu.vmware.com/developer/blog/git-switch-and-restore-an-improved-user-experience/"><code>git switch</code> and <code>git restore</code></a>.</p>
Why I Won't Correct You're Grammar (unless you ask)https://www.thecodedmessage.com/posts/grammar/2022-06-14T00:00:00+00:00I am an Ivy League-educated professional who regularly has to write for my job, who was always in the top English classes in school. And sometimes, I mix up “your” and “you’re.”
I know how grammar works. I always, if I stop to think about it, can figure out which one to use. I know all the tricks. Most of the time, I don’t have to think about it, and the right one comes out.<p>I am an Ivy League-educated professional who regularly has to write
for my job, who was always in the top English classes in school. And
sometimes, I mix up “your” and “you’re.”</p>
<p>I know how grammar works. I always, if I stop to think about it, can
figure out which one to use. I know all the tricks. Most of the time, I
don’t have to think about it, and the right one comes out. But sometimes,
I’m just thinking in terms of what sounds I would make if I were speaking,
and I’m in a rush or just distracted or just glitching, and the wrong
one comes out.</p>
<p>What’s my point? My point is that written English conventions are hard
and unnatural, that even a very educated native speaker can mess them up,
even one who writes all the time. Not only are there sets of
homophonous grammar words which are super-easy to mess up –
such as “you’re” vs “your,” “they’re” vs “their” vs “there” – we also
have one of the few spelling systems complicated enough that using it
is a <a href="https://spellingbee.com/">competitive national sport</a>. If you
were trying to make a language hard to write in on purpose, I’m not sure
you’d do better than English.</p>
<p>If written English is basically impossible to get right all the time,
even for the most educated native speakers, what about everyone else?
Most people are not Ivy League-educated native English speakers. A lot
of people learn English in adulthood, or at least later in childhood. A
lot of people grow up speaking non-standard dialects. A lot of people
simply don’t get the educational opportunities I have had – or
simply choose to focus on other skills in life.</p>
<p>So, I say, let’s not use “your”/“you’re” as a value judgment or a sign
of stupidity. Obviously if your friend gives you something to copy-edit
and they use the wrong one, fix it, but especially in informal settings
like social media and text, let’s maybe not make it out to be a bigger
deal than it is?</p>
<h1 id="descriptivism-and-prescriptivism">Descriptivism and Prescriptivism</h1>
<p>Is this the dreaded “descriptivism”? Perhaps it is, at least
in the sense in which that term is bandied about in popular culture.</p>
<p>In this dramatized popular conceptualization, descriptivists and
prescriptivists are different camps opposing each other, aligned
with our larger societal culture war:</p>
<ul>
<li>
<p><strong>Descriptivists:</strong> Made up of linguistics professors and self-appointed
activists, the descriptivists align with the culture-war <em>liberals</em>,
upholding <em>diversity</em>. They think every form of speech and grammar is
equally valid, especially those of underprivileged communities.</p>
</li>
<li>
<p><strong>Prescriptivists:</strong> Made up of English teachers and self-appointed
grammarians, the prescriptivists align with culture-war <em>conservatives</em>,
upholding <em>tradition</em>. They believe that there is one true system of
English grammar, that everyone should aspire to, be taught, and be
socially pressured into adhering to.</p>
</li>
</ul>
<p>As you may have guessed from my descriptions, I think this way
of looking at the issue is silly. Rather than dividing people into
camps, I think there is a more <em>descriptive</em> way of using these
terms. I instead would <em>prescribe</em> definitions that focus on attitudes
that any person can adopt:</p>
<ul>
<li>
<p><strong>Descriptiv<em>ism</em></strong>: the attitude of science. If we are acting as
linguists, as <em>Sprachwissenschaftler</em> or “language scientists,” then
we want to study the amazing fact that humans naturally develop and
perpetuate intricate systems for turning sounds into words into
sentences. For this goal, all dialects (and sociolects and idiolects)
are equally valid, because all of them can teach us more about how
language works.</p>
</li>
<li>
<p><strong>Prescriptiv<em>ism</em></strong>: the attitude of conventionality. If we are acting
as professional writers or speakers or copy-editors, then we want
to make sure that we and those we work with can communicate in
a way that is comprehensible by our audience, follows the rules
of grammar and spelling and punctuation that our audience expects,
so that writers and speakers can signal that they take the situation
appropriately seriously and so that grammar doesn’t distract from
communication. For these goals, what is “valid” (or better put, appropriate)
is often the standard, conventionalized, and academic prestige forms
of a language.</p>
</li>
</ul>
<p>With these definitions in mind, it is possible for one person to take
on different stances in different contexts. An English teacher can use
descriptivism when they want to understand why their students speak and
write a certain way, or struggle with conventions in comparison to their
peers: Is it a problem grasping the concepts, or is it because they
speak a different dialect or sociolect from the other students in
their class? In the same class, they can take on a more prescriptivist
stance when they set their goals and standards for how the students
should ultimately learn to write and speak in formal situations.</p>
<p>Linguistics researchers often study more stigmatized dialects or common
non-standardisms in speech, and dispel stereotypes about them that are
not based in fact. They then write the resultant papers in immaculate
academic formal language. This, of course, makes no sense if you think of
prescriptivism and descriptivism as camps, but there is no contradiction
here. It makes perfect sense if you instead think of prescriptivism and
descriptivism as attitudes, appropriate in different situations.</p>
<p>But my point in this article is not about science, teaching, or
professional communication. My point is about what stance to take
in everyday communication. And in everyday communication, neither
prescriptivism nor descriptivism is appropriate. <a href="https://xkcd.com/1576/">Unwanted grammar
corrections</a> are rude, and so is <a href="https://xkcd.com/2390/">unwanted field
linguistics</a>. The appropriate stance to take
in everyday communication is neither descriptivism nor prescriptivism,
but rather politeness.</p>
<p>Sure, descriptivism is sometimes used as an argument here. It can be
scientifically demonstrated that informal forms of English and even
sentences like “It don’t do nothing,” in the dialects where they arise,
have their own internal logic, just as sophisticated from an objective
standpoint as more prestigious forms of English – even though that
has very little to do with the “your”/“you’re” distinction and purely
written distinctions like it. Many long-established shibboleths of
the ostentatiously grammar-conscious can be shown to have little basis
in history or established usage – but “your”/“you’re” is a pretty
well-established distinction. We could even find scientific studies to
show how much more difficult the English writing system is than those
of other languages, but even if it wasn’t, the rules of politeness
wouldn’t change.</p>
<p>Science and research may be useful for answering questions like
“how can we most effectively teach children?” But it doesn’t really
have any bearing on whether we should give people unwanted grammar
corrections or use their grammar to judge them. In this, common
sense and politeness win the day, and they simply say: “No.” Or
perhaps rather: “Please don’t.”</p>
<h1 id="language-decay">Language “Decay”</h1>
<p>There is one objection that I think is worth addressing, that
comes from a place other than raw snobbery. It goes something like this:</p>
<blockquote>
<p>But Jimmy! What if this social pressure serves a good purpose? What if
it prevents language change, allowing our language to stay as it
is for longer, connecting us with writers of the past? What if we
really like the way English is and don’t want it to change, for
aesthetic or cultural reasons, or the belief that the language is in some
way particularly well-suited for use as it is?</p>
</blockquote>
<p>This objection (phrased differently) was raised when I posted an earlier,
shorter version of this essay to Facebook. And I’m not sure what to
do with it. I suspect some people think that this is a worthy goal,
an upside of grammar corrections to be balanced against the politeness
elements. Others, I imagine, see it as a silver lining or a subsidiary
purpose to our prescriptions and our societal elevation of relatively
conservative conventions and grammar norms.</p>
<p>For me, there are two considerations here. One is whether this works at all.
Of course, in the long term, it is futile; English will change eventually,
slowly but surely,
as it
<a href="https://internetshakespeare.uvic.ca/doc/Son_Q1/section/Sonnet%2016~Sonnet%2018/index.html">has</a>
<a href="https://www.poetryfoundation.org/poems/43521/beowulf-old-english-version">before</a>,
whatever we do. But maybe prescriptive grammar slows down language
change.</p>
<p>Does it? Probably, but I think people overestimate the effect. And I
think the effect is almost entirely accomplished in those situations where
prescriptivism is appropriate, namely, copy-editing and education. I
simply don’t think people correcting their acquaintances’ text messages
for them or judging their grammar online does much to keep the language
from changing.</p>
<p>But even if judgmentalism (and fear of judgmentalism) does slow down
language change, why does it matter? English as it is now isn’t that
special. It’s just a language, like any other. Whatever it evolves into
will suit humanity and society’s purposes equally well. Heroic efforts
to try and stop the inevitable are certainly not worth the rudeness
that often comes along with them.</p>
<p>I personally am not attached to the current form of English, but there
is a way in which I can relate. I do remember being upset at some of
the changes that are happening (very slowly) in the German language,
but then I was comforted by something: Unless something drastically
increases my life-span, German as I’ve learned it will remain a valid
and prestigious way to speak German (modulo my mistakes and accent),
for the rest of my days.</p>
<p>Perhaps my children’s children will live in a world where German is
drastically and unrecoverably different – if I even have children who
then have children – but that will be their problem, and I’m sure their
opinions will differ from mine.</p>
Trivia About Rust Types: An (Authorized) Transcription of Jon Gjengset's Twitter Threadhttps://www.thecodedmessage.com/posts/trivia-rust-types/2022-06-06T00:00:00+00:00Preface (by Jimmy Hartzell) I am a huge fan of Jon Gjengset’s Rust for Rustaceans, an excellent book to bridge the gap between beginner Rust programming skills and becoming a fully-functional member of the Rust community. He’s famous for his YouTube channel as well; I’ve heard good things about it (watching video instruction isn’t really my thing personally). I have also greatly enjoyed his Twitter feed, and especially have enjoyed the thread surrounding this tweet:<h1 id="preface-by-jimmy-hartzell">Preface (by Jimmy Hartzell)</h1>
<p>I am a huge fan of Jon Gjengset’s <a href="https://nostarch.com/rust-rustaceans">Rust for
Rustaceans</a>,
an excellent book to bridge the gap between beginner
Rust programming skills and becoming a fully-functional
member of the Rust community. He’s famous for his <a href="https://www.youtube.com/playlist?app=desktop&list=PLqbS7AVVErFiWDOAVrPt7aYmnuuOLYvOa">YouTube
channel</a>
as well; I’ve heard good things about it (watching video
instruction isn’t really my thing personally). I have also
greatly enjoyed his <a href="https://twitter.com/jonhoo">Twitter feed</a>,
and especially have enjoyed the thread surrounding <a href="https://twitter.com/jonhoo/status/1532761983606411264">this
tweet</a>:</p>
<blockquote>
<p>Okay, learning time! Name a @rustlang
type (can be generic), and I’ll (try to) tell you something you didn’t
know about that type!</p>
</blockquote>
<p>What great fun!</p>
<p>I immediately felt that this thread should have a
transcription outside of social media (Jon Gjengset already did a <a href="https://www.reddit.com/r/rust/comments/v4dnaj/twitter_thread_with_trivia_about_rust_types/">Reddit
transcription</a>),
and so I asked him if he had any plans to turn
it into a blog post, and failing that, whether <a href="https://twitter.com/thecodedmessage/status/1533080530513801219">I
could</a>.
Much to my surprise, <a href="https://twitter.com/jonhoo/status/1533139200811290625">he gave me the
go-ahead</a>.</p>
<p>So I have done so, and this is the blog post! It wasn’t even boring,
because I learned so much as I copied the entries! Minor edits have
been made to add formatting and adapt links to how blogs work rather
than how Twitter works. This is taken from the Reddit version. My
<a href="https://www.thecodedmessage.com/2022-06-06-programming---trivia-rust-types.md">markdown source</a>
is also available.</p>
<p>So, without further ado, Jon Gjengset’s “Trivia About Rust Types.”</p>
<h1 id="trivia-about-rust-types-by-jon-gjengset">Trivia About Rust Types (by Jon Gjengset)</h1>
<h2 id="stdfmtdebug"><code>std::fmt::Debug</code></h2>
<p>Did you know that the Formatter argument to <code>Debug::fmt</code> makes it really easy to customize debug representations for structs, enums, lists, and sets? See the <code>debug_*</code> methods on it.</p>
<h2 id="formatter"><code>Formatter</code></h2>
<p>Did you know that <code>std::fmt::Formatter</code> is super easy to use if you want more control over debugging for a custom type? For example, to emit a “list-like” type, just <code>Formatter::debug_list().entries(self.0.iter()).finish()</code>.</p>
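<p>A minimal sketch of the list-like case, not part of the original thread.
The <code>Playlist</code> type is a hypothetical newtype invented for this example;
inside <code>fmt</code>, the call goes through the formatter value <code>f</code>:</p>

```rust
use std::fmt;

// Hypothetical wrapper type, used only to have something to format.
struct Playlist(Vec<&'static str>);

impl fmt::Debug for Playlist {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // debug_list() returns a builder; entries() adds each item using
        // the item's own Debug impl, and finish() closes the list.
        f.debug_list().entries(self.0.iter()).finish()
    }
}
```

<p>Formatting a <code>Playlist</code> with <code>{:?}</code> then looks like a plain list,
and <code>{:#?}</code> pretty-prints it one entry per line without extra work.</p>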
<h2 id="optiont"><code>Option<T></code></h2>
<p>Did you know that <code>Option<T></code> implements <code>IntoIterator</code> yielding 0/1
elements, and you can then call <code>Iterator::flatten</code> to make that be 0/n
elements if T: IntoIterator?</p>
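<p>A small illustration, not part of the original thread; the helper name
<code>flatten_optional</code> is made up for this example:</p>

```rust
// Hypothetical helper: turn an optional vector into a plain vector.
fn flatten_optional(items: Option<Vec<i32>>) -> Vec<i32> {
    // The Option iterates over zero or one Vec; flatten() then iterates
    // over the elements of that Vec, if there is one.
    items.into_iter().flatten().collect()
}
```

<p>Passing <code>None</code> simply produces an empty vector, with no explicit
<code>match</code> needed.</p>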
<h2 id="type-emptytuplelist--vec"><code>type EmptyTupleList = Vec<()></code></h2>
<p>Did you know that since <code>()</code> is a zero-sized type, and the vector never actually has to store any data, the capacity of <code>Vec<()></code> is <code>usize::MAX</code>!</p>
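<p>This can be checked directly. The sketch below is not part of the
original thread, and <code>unit_vec_stats</code> is a hypothetical helper name:</p>

```rust
// Hypothetical helper: push `pushes` unit values and report (len, capacity).
fn unit_vec_stats(pushes: usize) -> (usize, usize) {
    let mut v: Vec<()> = Vec::new();
    for _ in 0..pushes {
        v.push(()); // only the length changes; nothing is allocated
    }
    (v.len(), v.capacity())
}
```

<p>The length tracks the number of pushes, while the reported capacity stays
at <code>usize::MAX</code> throughout.</p>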
<h2 id="t"><code>T</code></h2>
<p>Did you know that <code>T</code> doesn’t imply ownership? When we say a type is
generic over <code>T</code>, that <code>T</code> can just as easily be a reference to something
on the stack, and the type system will still be happy. Even <code>T: 'static</code>
doesn’t imply owned — consider <code>&'static str</code> for example.</p>
<p>[Reminds me of this <a href="https://github.com/pretzelhammer/rust-blog/blob/master/posts/common-rust-lifetime-misconceptions.md">excellent article</a> -Jimmy]</p>
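<p>A short sketch of both points, not part of the original thread. The names
<code>Holder</code>, <code>hold</code>, and <code>require_static</code> are hypothetical and
exist only for this example:</p>

```rust
// A generic wrapper: nothing here says that T must be an owned value.
struct Holder<T>(T);

fn hold<T>(value: T) -> Holder<T> {
    Holder(value)
}

// Accepts any T that contains no non-'static borrows.
fn require_static<T: 'static>(value: T) -> T {
    value
}

fn borrowed_sum() -> i32 {
    let local = 41; // lives on this function's stack
    let holder = hold(&local); // T = &i32: the Holder borrows, it does not own
    *holder.0 + 1
}

fn static_but_borrowed() -> &'static str {
    // T = &'static str satisfies T: 'static but is still a reference.
    require_static("hello")
}
```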
<h2 id="stdsyncmpscchannelsender"><code>std::sync::mpsc::channel::Sender</code></h2>
<p>Did you know that <code>std::sync::mpsc</code> has had a <a href="https://github.com/rust-lang/rust/issues/39364">known bug since
2017</a>, and that the
implementation may actually be replaced entirely with the crossbeam
channel implementation? <a href="https://github.com/rust-lang/rust/pull/93563">https://github.com/rust-lang/rust/pull/93563</a></p>
<h2 id="u128"><code>u128</code></h2>
<p>Did you know that even though we got <code>u128</code> a long time ago now, we
still don’t have <code>repr(u128)</code>? <a href="https://github.com/rust-lang/rust/issues/56071">https://github.com/rust-lang/rust/issues/56071</a></p>
<h2 id="stdffiosstring"><code>std::ffi::OsString</code></h2>
<p>Did you know that there are per-platform extension traits
for <code>OsString</code> that bake in the assumptions you can safely
make on that platform? Such as <a href="https://doc.rust-lang.org/std/os/unix/ffi/trait.OsStringExt.html">strings being <code>[u8]</code> on
Unix</a>
and <a href="https://doc.rust-lang.org/std/os/windows/ffi/trait.OsStringExt.html">UTF-16 on
Windows</a>.</p>
<h2 id="stdptrnonnull"><code>std::ptr::NonNull</code></h2>
<p>Did you know that one of the super neat features of <code>NonNull</code> is that
it enables the same niche optimization that regular references and the
<code>NonZero*</code> types get where <code>Option<NonNull<T>></code> is the same size as <code>*mut T</code>?</p>
<h2 id="cowt"><code>Cow<T></code></h2>
<p>Did you know that there used to be a special
<code>IntoCow</code> trait, but it was deprecated before 1.0 was
released! <a href="https://github.com/rust-lang/rust/issues/27735">https://github.com/rust-lang/rust/issues/27735</a></p>
<h2 id="boxt"><code>Box<T></code></h2>
<p>Did you know that <code>Box<T></code> is a <code>#[fundamental]</code> type, which means that
it’s exempt from the normal rules that don’t allow you to implement
foreign traits for foreign types (assuming T is a local type)?</p>
<h2 id="stdprocesschild"><code>std::process::Child</code></h2>
<p>Did you know that <code>std</code> has
<a href="https://github.com/rust-lang/rust/blob/master/library/std/src/sys/unix/process/process_unix.rs">three different ways</a>
to spawn a
child process on Linux (<code>posix_spawn</code>, <code>clone3</code>/<code>exec</code>, <code>fork</code>/<code>exec</code>)
depending on what capabilities your kernel version has?</p>
<h2 id="pint"><code>Pin<T></code></h2>
<p>Did you know that the name <code>Pin</code> (and the name <code>Unpin</code>) were
both heavily debated? Pin was almost called Pinned, for example. <a href="https://github.com/rust-lang/rust/issues/55766#issuecomment-438789462">The
discussion</a>
is an interesting read now after the fact.</p>
<h2 id="vect"><code>Vec<T></code></h2>
<p>Did you know that <code>Vec::swap_remove</code> is way faster than <code>Vec::remove</code>
if you can tolerate changes to ordering?</p>
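<p>The speed difference comes from <code>swap_remove</code> moving only the last element into the hole, where <code>remove</code> shifts everything after it. A small sketch (the wrapper function is hypothetical) showing the reordering you have to tolerate:</p>

```rust
// Hypothetical wrapper: remove an element in O(1), giving up the ordering.
fn remove_unordered(items: &mut Vec<char>, index: usize) -> char {
    // The last element is swapped into `index`; nothing else moves.
    items.swap_remove(index)
}

fn main() {
    let mut letters = vec!['a', 'b', 'c', 'd'];
    let removed = remove_unordered(&mut letters, 1);
    // 'd' now sits where 'b' used to be.
    println!("removed {removed:?}, left {letters:?}");
}
```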
<p>Did you know that the smallest non-zero
capacity for a <code>Vec<T></code> <a href="https://github.com/rust-lang/rust/blob/9a74608543d499bcc7dd505e195e8bfab9447315/library/alloc/src/raw_vec.rs#L106-L110">depends on the size of
<code>T</code></a>?</p>
<h2 id="cstr"><code>CStr</code></h2>
<p>Did you know that <code>CStr::default</code> creates a <code>CStr</code> that points to a
const string <code>"\0"</code> stored in the binary text segment, which means all
default <code>CStr</code>s point to the same (non-null) string!</p>
<h2 id="fora-sometraita"><code>for<'a> SomeTrait<'a></code></h2>
<p>Did you know that you can use <code>for<'a></code> to say that a bound has to hold
for any lifetime <code>'a</code>, not just a specific lifetime you happen to have
available at the time? For example, <code>where for<'a> &'a T: Read</code> says that
a shared reference to a <code>T</code> with any lifetime must implement <code>Read</code>.</p>
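<p>Here is a sketch of such a bound in use. The function <code>read_all</code> is my own illustration, and the <code>?Sized</code> bound is only there so that <code>[u8]</code> qualifies as <code>T</code>:</p>

```rust
use std::io::{self, Read};

// Hypothetical helper: read everything through a shared reference.
// The bound must hold for every lifetime, including the local one below.
fn read_all<T: ?Sized>(source: &T) -> io::Result<Vec<u8>>
where
    for<'a> &'a T: Read,
{
    // `reader` is a `&T`, and `&T` itself is what implements `Read`.
    let mut reader = source;
    let mut buf = Vec::new();
    reader.read_to_end(&mut buf)?;
    Ok(buf)
}

fn main() {
    // `&[u8]` implements `Read`, so `T = [u8]` satisfies the bound.
    let bytes: &[u8] = b"hello";
    let copy = read_all(bytes).expect("reading from a slice cannot fail");
    println!("{copy:?}");
}
```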
<h2 id="this-monstrous-warp-typehttpsgistgithubusercontentcomfasterthanlimede0955a8b29d0d66110983ebb5fae442raw1827a3afbca01cd42eafd0905cfdc451da805cb7gistfile1txt"><a href="https://gist.githubusercontent.com/fasterthanlime/de0955a8b29d0d66110983ebb5fae442/raw/1827a3afbca01cd42eafd0905cfdc451da805cb7/gistfile1.txt">This monstrous warp type</a></h2>
<p>Did you know that the trailing commas you see in some places in there,
<code>,)</code>, are to <a href="https://doc.rust-lang.org/nightly/reference/expressions/tuple-expr.html">distinguish one-element tuples from regular parenthetical
expressions</a>?</p>
<h2 id="fnonce"><code>FnOnce</code></h2>
<p>Did you know that until Rust 1.35, you couldn’t call a <code>Box<dyn FnOnce></code> and needed a special type (<code>FnBox</code>) for it! This was
because it requires “unsized rvalues” to implement, which are still
unstable today. <a href="https://github.com/rust-lang/rust/issues/28796">https://github.com/rust-lang/rust/issues/28796</a> +
<a href="https://github.com/rust-lang/rust/issues/48055">https://github.com/rust-lang/rust/issues/48055</a></p>
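<p>On Rust 1.35 or later, the call is as plain as you would hope. A minimal sketch with a made-up <code>run_once</code> function:</p>

```rust
// Hypothetical function taking a boxed one-shot closure.
fn run_once(job: Box<dyn FnOnce() -> String>) -> String {
    // Calling consumes the box.
    job()
}

fn main() {
    let greeting = String::from("hello");
    // The closure moves `greeting` out when called, so it is only `FnOnce`.
    println!("{}", run_once(Box::new(move || greeting)));
}
```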
<h2 id="f32"><code>f32</code></h2>
<p>Did you know that in Rust 1.62 we’ll get a deterministic ordering function
for floating point numbers? <a href="https://github.com/rust-lang/rust/pull/95431">https://github.com/rust-lang/rust/pull/95431</a></p>
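<p>Assuming a toolchain where <code>f32::total_cmp</code> is available, sorting floats then looks something like this sketch (the helper name is mine):</p>

```rust
// Hypothetical helper: sort floats with the total ordering.
fn sort_floats(values: &mut [f32]) {
    // `total_cmp` returns an `Ordering` for every pair, unlike `partial_cmp`.
    values.sort_by(|a, b| a.total_cmp(b));
}

fn main() {
    let mut values = [3.0_f32, 0.0, -0.0, -1.0];
    sort_floats(&mut values);
    // Negative zero is ordered before positive zero.
    println!("{values:?}");
}
```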
<h2 id="arct"><code>Arc<T></code></h2>
<p>Did you know that <code>Arc</code> has a <code>make_mut</code> method that effectively gives
you copy-on-write? Given a <code>&mut Arc<T></code>, it will either give you <code>&mut T</code> if there are no other Arcs, or it will clone <code>T</code>, make the <code>Arc<T></code>
point to that new <code>T</code>, and then give you a <code>&mut</code> to it!</p>
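<p>A small sketch of the copy-on-write behavior, with a hypothetical <code>push_shared</code> helper:</p>

```rust
use std::sync::Arc;

// Hypothetical helper: append to a possibly shared list.
fn push_shared(list: &mut Arc<Vec<i32>>, value: i32) {
    // Clones the inner `Vec` only if another `Arc` still points at it.
    Arc::make_mut(list).push(value);
}

fn main() {
    let mut a = Arc::new(vec![1]);
    let b = Arc::clone(&a);
    // `b` still exists, so `a` gets its own copy before the push.
    push_shared(&mut a, 2);
    println!("a = {a:?}, b = {b:?}");
}
```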
<h2 id="heading"><code>!</code></h2>
<p>Did you know that <code>std::convert::Infallible</code> is the “original” <code>!</code>, and that
the plan is to one day replace <code>Infallible</code> with a type alias for <code>!</code>?</p>
<h2 id="fn"><code>fn</code></h2>
<p>Specifically, did you know that the name of a function is not an
<code>fn</code>? It’s a <code>FnDef</code>, which can then be
<a href="https://github.com/rust-lang/rust/issues/86654#issuecomment-869173835">coerced to a <code>FnPtr</code></a>?</p>
<h2 id="phantomdata"><code>PhantomData</code></h2>
<p>Did you know that it’s actually kind of tricky to define <code>PhantomData</code> yourself: <a href="https://github.com/dtolnay/ghost">https://github.com/dtolnay/ghost</a></p>
<h2 id="u32"><code>u32</code></h2>
<p>Did you know that <code>u32</code> now has associated constants for <code>MIN</code> and <code>MAX</code>,
so you no longer need to use <code>std::u32::MIN</code> and can use <code>u32::MIN</code>
directly instead?</p>
<h2 id="bool"><code>bool</code></h2>
<p>Did you know that bool isn’t just “stored as a byte”, the compiler
straight up declares its representation as <a href="https://github.com/rust-lang/rust/blob/master/compiler/rustc_middle/src/ty/layout.rs#L676-L682">the same as that of
u8</a>?</p>
<h2 id="any"><code>Any</code></h2>
<p>Did you know that <code>Any</code> is <em>really</em> non-magical? It just has a blanket
implementation for all <code>T</code> that returns <code>TypeId::of::<T>()</code>, and to
downcast it simply compares the return value of that trait method to
see if it’s safe to downcast to a type! <code>TypeId</code> is magic though.</p>
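<p>From the user's side, downcasting looks like this sketch; <code>describe</code> is a made-up function for illustration:</p>

```rust
use std::any::Any;

// Hypothetical function: report what is behind a `&dyn Any`.
fn describe(value: &dyn Any) -> String {
    // Each `downcast_ref` compares type ids and yields `None` on a mismatch.
    if let Some(n) = value.downcast_ref::<i32>() {
        format!("i32: {}", n)
    } else if let Some(s) = value.downcast_ref::<String>() {
        format!("String: {}", s)
    } else {
        "unknown".to_string()
    }
}

fn main() {
    println!("{}", describe(&5_i32));
    println!("{}", describe(&String::from("five")));
    println!("{}", describe(&5.0_f64));
}
```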
<h2 id="self"><code>Self</code></h2>
<p>Did you know that <code>fn foo(self)</code> is syntactic sugar for
<code>fn foo(self: Self)</code>, and that you can also use some
other types for <code>self</code> that involve <code>Self</code>, like <code>fn foo(self: Arc<Self>)</code>, with fully arbitrary <code>self</code> types planned for the future?
<a href="https://github.com/rust-lang/rust/issues/44874">https://github.com/rust-lang/rust/issues/44874</a></p>
<h2 id="heading-1"><code>()</code></h2>
<p>Did you know that <code>()</code> implements FromIterator, so you can
<code>.collect::<Result<(), E>></code> to just see if anything in an iterator erred?</p>
<p>[Note that this doesn’t say whether or not this is a good idea. -Jimmy]</p>
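<p>For the curious, a sketch of what that looks like, using a hypothetical <code>all_parse</code> check:</p>

```rust
use std::num::ParseIntError;

// Hypothetical check: do all inputs parse as integers?
fn all_parse(inputs: &[&str]) -> Result<(), ParseIntError> {
    inputs
        .iter()
        // Throw away the parsed value and keep only success or failure.
        .map(|s| s.parse::<i32>().map(|_| ()))
        // Collecting stops at the first `Err` and returns it.
        .collect()
}

fn main() {
    println!("{:?}", all_parse(&["1", "2"]));
    println!("{:?}", all_parse(&["1", "x"]));
}
```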
<h2 id="struct-s"><code>struct S</code></h2>
<p>Did you know that <code>struct S</code> implicitly declares a constant called <code>S</code>,
which is why you can make one using just <code>S</code>?</p>
<h2 id="refcell"><code>RefCell</code></h2>
<p>Did you know that RefCell allows you to replace a
value in-place directly (like <code>std::mem::replace</code>)?
<a href="https://doc.rust-lang.org/std/cell/struct.RefCell.html#method.replace">https://doc.rust-lang.org/std/cell/struct.RefCell.html#method.replace</a></p>
<h2 id="corenumwrapping"><code>core::num::Wrapping</code></h2>
<p>Did you know that there used to also be a trait accompanying <code>Wrapping</code>,
<code>WrappingOps</code>, that was removed last minute before
1.0? <a href="https://github.com/rust-lang/rust/pull/23549">https://github.com/rust-lang/rust/pull/23549</a></p>
<h2 id="const-t"><code>*const T</code></h2>
<p>Did you know that, at least for the time being,
<code>*const T</code> and <code>*mut T</code> are more or less
equivalent? <a href="https://github.com/rust-lang/unsafe-code-guidelines/issues/257">https://github.com/rust-lang/unsafe-code-guidelines/issues/257</a></p>
<h2 id="stdosunixnetunixstream"><code>std::os::unix::net::UnixStream</code></h2>
<p>Did you know that (on nightly) you can pass UNIX file
descriptors over UnixStreams too, and thereby <a href="https://doc.rust-lang.org/std/os/unix/net/struct.SocketAncillary.html#method.add_fds">give another process
access</a>
to a file it may not otherwise be able to open?</p>
<h2 id="stdsynccondvarmutex"><code>std::sync::Condvar</code>/<code>Mutex</code></h2>
<p>Did you know that Mara is doing some awesome work on making
<code>Condvar</code> (and <code>Mutex</code> and <code>RwLock</code>) much better on a wide array of
platforms? <a href="https://github.com/rust-lang/rust/issues/93740">https://github.com/rust-lang/rust/issues/93740</a></p>
<h2 id="stdtaskwaker"><code>std::task::Waker</code></h2>
<p>Did you know that <code>Waker</code> is secretly just a <code>dyn std::task::Wake + Clone</code> done in a way that doesn’t require a
wide pointer or support for multi-trait dynamic dispatch? See
<a href="https://doc.rust-lang.org/std/task/struct.RawWakerVTable.html">https://doc.rust-lang.org/std/task/struct.RawWakerVTable.html</a></p>
<h2 id="impl-trait"><code>impl Trait</code></h2>
<p>Did you know that <code>impl Trait</code> in argument position and
<code>impl Trait</code> in return position represent completely
different type constructs, even though they “feel”
related? <a href="https://doc.rust-lang.org/nightly/reference/types/impl-trait.html">https://doc.rust-lang.org/nightly/reference/types/impl-trait.html</a></p>
<h2 id="btreemapk-v"><code>BTreeMap<K, V></code></h2>
<p>Did you know that <code>BTreeMap</code> is one of the few
collections that still doesn’t have a <code>drain</code>
method? <a href="https://github.com/rust-lang/rust/issues/81074">https://github.com/rust-lang/rust/issues/81074</a></p>
<h2 id="struct-invariantlifetimeidphantomdatamut-id-"><code>struct InvariantLifetime<'id>(PhantomData<*mut &'id ()>);</code></h2>
<p>Did you know that <code>PhantomData<T></code> has variance like <code>T</code>, and <code>*mut T</code>
is invariant over <code>T</code>, and so by placing a lifetime inside <code>T</code> you make
the outer type invariant over that lifetime?</p>
<h2 id="rct"><code>Rc<T></code></h2>
<p>Did you know that the <code>Rc</code> type was among the arguments
for why <code>std::mem::forget</code> shouldn’t be marked as
unsafe? <a href="https://github.com/rust-lang/rust/issues/24456">https://github.com/rust-lang/rust/issues/24456</a></p>
<h2 id="stdfutureready"><code>std::future::Ready</code></h2>
<p>Did you know that these days you can just use <code>async move { x }</code> instead
of <code>future::ready(x)</code>. The main reason to still use <code>future::ready(x)</code>
is that you can name the future it returns, which is harder with <code>async</code>
(without <code>type_alias_impl_trait</code> that is).</p>
<h2 id="usize"><code>usize</code></h2>
<p>Did you know that <code>usize</code> isn’t really “the size of a pointer”. Instead,
it’s more like “the size of a pointer address difference”, and the two
can be fairly different! <a href="https://github.com/rust-lang/rust/issues/95228">https://github.com/rust-lang/rust/issues/95228</a></p>
<h2 id="stdthreadthread"><code>std::thread::Thread</code></h2>
<p>Did you know that the <code>ThreadId</code> that’s available for each <code>Thread</code> is
entirely a <code>std</code> construct? Creating a <code>ThreadId</code> simply increments a global
static counter under a lock.</p>
<h2 id="stdopscontrolflow"><code>std::ops::ControlFlow</code></h2>
<p>Did you know that <code>ControlFlow</code> is really a stepping stone towards making
<code>?</code> work for other types than <code>Option</code> and <code>Result</code>? The full design has gone
through a lot of iterations, but the latest and greatest is
<a href="https://github.com/rust-lang/rust/issues/84277">RFC3058</a>.</p>
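<p>Even without <code>?</code>, <code>ControlFlow</code> is already usable with iterator methods like <code>try_for_each</code>. A sketch, with a made-up <code>first_negative</code> search:</p>

```rust
use std::ops::ControlFlow;

// Hypothetical search: stop iterating at the first negative number.
fn first_negative(values: &[i32]) -> Option<i32> {
    let flow = values.iter().try_for_each(|&v| {
        if v < 0 {
            // `Break` ends the iteration early and carries a value out.
            ControlFlow::Break(v)
        } else {
            ControlFlow::Continue(())
        }
    });
    match flow {
        ControlFlow::Break(v) => Some(v),
        ControlFlow::Continue(()) => None,
    }
}

fn main() {
    println!("{:?}", first_negative(&[1, -2, -3]));
    println!("{:?}", first_negative(&[1, 2]));
}
```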
<h2 id="file"><code>File</code></h2>
<p>Did you know that there are implementations of <code>Read</code>, <code>Write</code>, and <code>Seek</code>
for <code>&File</code> as well, so multiple threads can share a single <code>File</code> and call
those concurrently. Whether they should is a different question of course.</p>
<h2 id="resultt-e"><code>Result<T, E></code></h2>
<p>Did you know that Rust originally (pre-1.0) had both <code>Result</code> and an <code>Either</code> type? They decided to remove <code>Either</code> <a href="https://github.com/rust-lang/rust/issues/9157">way back in 2013</a>.</p>
<h2 id="cowstr"><code>Cow<str></code></h2>
<p>Did you know that because <code>Cow<'a, T></code> is covariant in <code>'a</code>, you can always
assign <code>Cow::Borrowed("some string")</code> to one no matter what it originally
held?</p>
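<p>A sketch of that assignment behind a function boundary, where the lifetime <code>'a</code> is whatever the caller picked (the <code>reset</code> function is hypothetical):</p>

```rust
use std::borrow::Cow;

// Hypothetical function: overwrite any `Cow<str>` with a static string.
fn reset<'a>(slot: &mut Cow<'a, str>) {
    // A `&'static str` is usable wherever a `&'a str` is expected.
    *slot = Cow::Borrowed("some string");
}

fn main() {
    let mut name: Cow<str> = Cow::Owned(String::from("temporary"));
    reset(&mut name);
    println!("{name}");
}
```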
<h2 id="panicinfo"><code>PanicInfo</code></h2>
<p>Did you know that since <code>PanicInfo</code> is in core, its <code>Display</code>
implementation cannot access the panic data if it’s a <code>String</code> (since
it can’t name that type), so trying to print the <code>PanicInfo</code> after
a <code>std::panic::panic_any(format!("x y z"))</code> won’t print <code>"x y z"</code>?
<a href="https://github.com/rust-lang/rust/blob/352e621368c31d7b4a6362e081586cdb931ba020/library/core/src/panic/panic_info.rs#L159-L162">Source link.</a></p>
<h2 id="stdffic_void"><code>std::ffi::c_void</code></h2>
<p>Did you know that the whole <code>c_void</code> type is a collection
of hacks to try to work around the lack of extern
types? <a href="https://github.com/rust-lang/rust/issues/43467">https://github.com/rust-lang/rust/issues/43467</a></p>
<h2 id="featureraw_ref_op-raw-const-t"><code>#[feature(raw_ref_op)] &raw const T</code></h2>
<p>Definitely cheating :p But did you know that originally the intention
was to have <code>&const raw</code> variable be just a MIR construct and let
<code>&variable as *const _</code> be automatically changed to <code>&const raw</code>?
<a href="https://github.com/RalfJung/rfcs/blob/fd4b4cd769300cfde5d54865d227990b71b762d1/text/0000-raw-reference-operator.md">https://github.com/RalfJung/rfcs/blob/fd4b4cd769300cfde5d54865d227990b71b762d1/text/0000-raw-reference-operator.md</a></p>
<h2 id="u256"><code>u256</code></h2>
<p>Did you know that because Rust compiles through LLVM,
we’re sort of constrained to the primitive types
LLVM supports, and <a href="https://llvm.org/doxygen/classllvm_1_1Type.html#pub-static-methods">LLVM itself only goes up to
128</a>?</p>
<h2 id="_"><code>_</code></h2>
<p>Did you know that whether or not <code>let _ = x</code> should move <code>x</code> is actually
fairly subtle? <a href="https://github.com/rust-lang/rust/issues/10488">https://github.com/rust-lang/rust/issues/10488</a></p>
<h2 id="maybeuninit"><code>MaybeUninit</code></h2>
<p>Did you know that <code>MaybeUninit</code> arose because the previous mechanism,
<code>std::mem::uninitialized</code>, produced immediate undefined behavior when
invoked with most types (like <code>uninitialized::<bool>()</code>).</p>
<h2 id="struct-tconst-c-usize"><code>struct T<const C: usize></code></h2>
<p>Did you know that with Rust 1.59.0 you can now
<a href="https://blog.rust-lang.org/2022/02/24/Rust-1.59.0.html#const-generics-defaults-and-interleaving">give <code>C</code> a default value</a>?</p>
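<p>A sketch of a defaulted const parameter, assuming Rust 1.59 or later. The <code>Buffer</code> type and its default of 4 are invented for the example:</p>

```rust
// Hypothetical type with a defaulted const generic parameter.
struct Buffer<const N: usize = 4> {
    data: [u8; N],
}

impl<const N: usize> Buffer<N> {
    fn new() -> Self {
        Self { data: [0; N] }
    }

    fn len(&self) -> usize {
        self.data.len()
    }
}

fn main() {
    // Writing plain `Buffer` as a type picks up the default, `N = 4`.
    let small: Buffer = Buffer::new();
    let large: Buffer<16> = Buffer::new();
    println!("{} {}", small.len(), large.len());
}
```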
<h2 id="weakt"><code>Weak<T></code></h2>
<p>Did you know that actual deallocation logic for <code>Arc<T></code> is
implemented in <code>Weak<T></code>, and is invoked by considering all copies of
a particular <code>Arc<T></code> to collectively hold a single <code>Weak<T></code> between them?
<a href="https://github.com/rust-lang/rust/blob/7e9b92cb43a489b34e2bcb8d21f36198e02eedbc/library/alloc/src/sync.rs#L1108-L1109">Source link.</a></p>
<h2 id="t-n"><code>[T; N]</code></h2>
<p>Did you know that while <em>most</em> trait implementations for arrays now use
const generics to impl for any length <code>N</code>, <a href="https://github.com/rust-lang/rust/blob/e40d5e83dc133d093c22c7ff016b10daa4f40dcf/library/core/src/array/mod.rs#L371-L394">we can’t <em>yet</em> do the same for
<code>Default</code></a>.</p>
<h2 id="u8"><code>u8</code></h2>
<p>Did you know that as of Rust 1.60, you can now use <code>u8::escape_ascii</code> to
<a href="https://doc.rust-lang.org/std/ascii/fn.escape_default.html">get an iterator of the bytes needed to escape that byte character in
most contexts</a>.</p>
<h2 id="hashmapk-v"><code>HashMap<K, V></code></h2>
<p>Did you know that the Rust devs are working on a “raw” entry API for
<code>HashMap</code> that allows you to (unsafely) avoid re-hashing a key you’ve
already hashed? <a href="https://github.com/rust-lang/rust/issues/56167">https://github.com/rust-lang/rust/issues/56167</a></p>
<h2 id="mut-t"><code>&mut T</code></h2>
<p>Did you know that while <code>&mut T</code> is defined as meaning “mutable reference” in the Rust reference, you’re often better off thinking of it as “mutually exclusive reference”. <a href="https://docs.rs/dtolnay/latest/dtolnay/macro._02__reference_types.html">Quoth David Tolnay</a>.</p>
<h2 id="stdopsrange"><code>std::ops::Range</code></h2>
<p>Did you know that there’s been a lot of debate around whether or not the
<code>Range</code> types should be <code>Copy</code>? <a href="https://github.com/rust-lang/rust/pull/21846">https://github.com/rust-lang/rust/pull/21846</a></p>
<h2 id="atomicu32"><code>AtomicU32</code></h2>
<p>Did you know that you’ll often want <code>compare_exchange_weak</code>
over <code>compare_exchange</code> to get
<a href="https://devblogs.microsoft.com/oldnewthing/20180329-00/?p=98375">more efficient code on ARM cores</a>.</p>
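<p>The weak variant may fail spuriously, so it belongs in a retry loop, which is where you usually are anyway. A sketch with a hypothetical capped counter:</p>

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Hypothetical operation: increment a counter, but never beyond `cap`.
// Returns the value of the counter after the call.
fn increment_capped(counter: &AtomicU32, cap: u32) -> u32 {
    let mut current = counter.load(Ordering::Relaxed);
    loop {
        if current >= cap {
            return current;
        }
        match counter.compare_exchange_weak(
            current,
            current + 1,
            Ordering::AcqRel,
            Ordering::Relaxed,
        ) {
            Ok(_) => return current + 1,
            // Either another thread won the race or the failure was
            // spurious; retry with the value that was actually observed.
            Err(actual) => current = actual,
        }
    }
}

fn main() {
    let counter = AtomicU32::new(0);
    for _ in 0..3 {
        println!("{}", increment_capped(&counter, 2));
    }
}
```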
<h2 id="stdopshash"><code>std::hash::Hash</code></h2>
<p>Did you know that Hash is responsible for not just
<a href="https://github.com/rust-lang/rust/issues/29263">one</a>
, but
<a href="https://github.com/rust-lang/rust/issues/65744">two</a>
of the issues on the “rust 2 breakage wishlist”?</p>
<h2 id="integer"><code>{integer}</code></h2>
<p>Did you know that fasterthanlime’s <a href="https://fasterthanli.me/articles/the-curse-of-strong-typing#different-kinds-of-numbers">most recent
article</a>
does a great job at explaining <code>{integer}</code>?</p>
<h2 id="fn-1"><code>Fn</code></h2>
<p>Did you know that until Rust 1.35.0, <code>Box<T> where T: Fn</code>
did not <code>impl Fn</code>, so you couldn’t (easily) call boxed
closures! <a href="https://github.com/rust-lang/rust/pull/55431">https://github.com/rust-lang/rust/pull/55431</a></p>
<h2 id="-"><code>((), ())</code></h2>
<p>Did you know that <code>((), ())</code> and <code>()</code> have the same hash?
<a href="https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=894b78e8ee2721440aa8dea5e35f9dc3">Playground link.</a></p>
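<p>If the playground link ever rots, this sketch shows the same thing. The <code>hash_of</code> helper is mine, and it uses <code>DefaultHasher::new()</code> so both values are hashed with the same keys:</p>

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical helper: hash one value with a freshly created hasher.
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // Hashing `()` feeds nothing to the hasher, and a tuple just hashes
    // its fields in order, so both leave the hasher in its initial state.
    println!("{}", hash_of(&()) == hash_of(&((), ())));
}
```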
<h2 id="t-1"><code>[T]</code></h2>
<p>Did you know that <code>&[u8]</code> implements <code>Read</code> (and <code>&mut [u8]</code> implements <code>Write</code>)? So for anything
that takes <code>impl Read</code>, you can provide a <code>&mut</code> reference to a slice instead! Comes in
handy for testing. Note that the slice itself is shortened for each read,
hence <code>&mut &[u8]</code>.</p>
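<p>A sketch of the testing use case, with a made-up <code>take_prefix</code> function standing in for whatever takes <code>impl Read</code>:</p>

```rust
use std::io::{self, Read};

// Hypothetical function under test: read exactly `n` bytes from any reader.
fn take_prefix(mut reader: impl Read, n: usize) -> io::Result<Vec<u8>> {
    let mut buf = vec![0u8; n];
    reader.read_exact(&mut buf)?;
    Ok(buf)
}

fn main() {
    let mut data: &[u8] = b"hello world";
    // Passing `&mut data` lets us see afterwards how much was consumed.
    let prefix = take_prefix(&mut data, 5).expect("slice is long enough");
    println!("{prefix:?} then {data:?}");
}
```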
<h2 id="heading-2"><code>*</code></h2>
<p>Did you know that <code>*</code> is (mostly) just syntax sugar for the std::ops::Mul
trait?</p>
<h2 id="unsafecellt"><code>UnsafeCell<T></code></h2>
<p>Did you know that <code>UnsafeCell</code> is one of those types that the compiler
needs “special magic” for because it has to instruct LLVM to not assume
Rust’s normal aliasing rules hold once code traverses the boundary of any
<code>UnsafeCell</code>?</p>
Function Overloading in Rust https://www.thecodedmessage.com/posts/function-overloading-in-rust/ 2022-06-04T00:00:00+00:00
<p>I just made a <a href="https://github.com/seanmonstar/reqwest/pull/1553">pull request</a>
to <a href="https://github.com/seanmonstar/reqwest/">reqwest</a>. I thought this
particular one was interesting enough to be worth blogging about, so I am.</p>
<p>We know that many C++ family languages have a feature known as function
overloading, where two functions or methods can exist with the same name
but different argument types. It looks something like this:</p>
<pre tabindex="0"><code>void use_connector(ConnectorA conn) {
    // IMPL
}
void use_connector(ConnectorB conn) {
    // IMPL
}
</code></pre><p>The compiler then chooses which method to call, at compile-time, based
on the static type of the argument. In C++, this is part of compile-time
polymorphism, an easy “<code>if</code> statement” in the template meta-language. In
Java and many other languages, it’s merely a convenience, for when an
ad-hoc group of types are possible for what an outsider sees as the
same operation, but which from the perspective of the library requires
different implementations.</p>
<p>Rust does not support this, at least not in this form. This is a mildly
controversial decision; I’ve seen many people complain about it,
because it is a commonly-used feature in the languages they’ve come
from. Ultimately, I think Rust made the right call. There are too
many advantages of having a one-to-one correspondence between method or
function names and implementations, and ultimately I think the feature is
more confusing than helpful. <code>trait</code>s cover a lot of the same ability,
but in a more structured fashion, acting like C++’s compile-time
“<code>if</code>-statements.” But of course, there is always a learning curve giving
up a feature you’re used to using.</p>
<p>But just because Rust doesn’t officially support function
overloading as a feature doesn’t mean, surprisingly, that it’s
completely impossible. Recently, I was looking into the
depths of <a href="https://crates.io/crates/reqwest"><code>reqwest</code></a>,
trying to troubleshoot an issue, and I came across <a href="https://github.com/seanmonstar/reqwest/blob/d536ce261c6e92d9956cc9a3d28c4288046f454b/src/async_impl/client.rs#L1248-L1271">this
code</a>:</p>
<pre tabindex="0"><code>#[cfg(any(feature = "native-tls", feature = "__rustls",))]
#[cfg_attr(docsrs, doc(cfg(any(feature = "native-tls", feature = "rustls-tls"))))]
pub fn use_preconfigured_tls(mut self, tls: impl Any) -> ClientBuilder {
    let mut tls = Some(tls);
    #[cfg(feature = "native-tls")]
    {
        if let Some(conn) =
            (&mut tls as &mut dyn Any).downcast_mut::<Option<native_tls_crate::TlsConnector>>()
        {
            let tls = conn.take().expect("is definitely Some");
            let tls = crate::tls::TlsBackend::BuiltNativeTls(tls);
            self.config.tls = tls;
            return self;
        }
    }
    #[cfg(feature = "__rustls")]
    {
        if let Some(conn) =
            (&mut tls as &mut dyn Any).downcast_mut::<Option<rustls::ClientConfig>>()
        {
            let tls = conn.take().expect("is definitely Some");
            let tls = crate::tls::TlsBackend::BuiltRustls(tls);
            self.config.tls = tls;
            return self;
        }
    }
    // Otherwise, we don't recognize the TLS backend!
    self.config.tls = crate::tls::TlsBackend::UnknownPreconfigured;
    self
}
</code></pre><p>I was shocked to see this! I felt like I was reading Java.
My first thought was that this was the Java <code>instanceof</code> (anti-)pattern,
but after a little more thought, I realized that this in practice
would work out to function overloading.</p>
<p>Since this uses <code>impl Any</code> instead of <code>&mut dyn Any</code>, this function
will be monomorphized at compile-time, and I would expect that
the relevant branching would be collapsed, resulting in these
monomorphizations, written in an imaginary version of Rust where
function overloading is supported:</p>
<pre tabindex="0"><code>#[cfg(feature = "native-tls")]
pub fn use_preconfigured_tls(mut self, tls: native_tls_crate::TlsConnector) -> ClientBuilder {
    let tls = crate::tls::TlsBackend::BuiltNativeTls(tls);
    self.config.tls = tls;
    self
}
#[cfg(feature = "__rustls")]
pub fn use_preconfigured_tls(mut self, tls: rustls::ClientConfig) -> ClientBuilder {
    let tls = crate::tls::TlsBackend::BuiltRustls(tls);
    self.config.tls = tls;
    self
}
</code></pre><p>There is a wrinkle though. Unlike the Java or pseudo-Rust equivalent, the Rust
code in <code>reqwest</code> will still allow calls to compile if they pass
a type that is not one of the two supported. So you can call this
function with anything, even an <code>i32</code>, and the compiler won’t signal
an error or even a warning:</p>
<pre tabindex="0"><code>client_builder.use_preconfigured_tls(42); // COMPILES!
</code></pre><p>In this implementation, it eventually causes a run-time error instead (a
separate function produces it in the case of <code>UnknownPreconfigured</code>). But
this odd type-safety work-around still can’t be removed without breaking
API-compatibility. Code could theoretically be relying on this function
producing a run-time error in certain situations, or it could rely on
that other function not being called. Luckily, <code>reqwest</code> is not 1.0,
and I have reason to hope they won’t consider this problematic.</p>
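<p>To see the trick and its loophole in isolation, here is a self-contained sketch of the same pattern. The types <code>ConnectorA</code>, <code>ConnectorB</code>, and <code>Backend</code> and the function <code>pick_backend</code> are stand-ins I made up, not anything from <code>reqwest</code>:</p>

```rust
use std::any::Any;

// Stand-in types for this sketch; none of these exist in `reqwest`.
#[derive(Debug, PartialEq)]
struct ConnectorA(u32);
#[derive(Debug, PartialEq)]
struct ConnectorB(String);

#[derive(Debug, PartialEq)]
enum Backend {
    A(ConnectorA),
    B(ConnectorB),
    Unknown,
}

fn pick_backend(conn: impl Any) -> Backend {
    // Wrapping in `Option` lets us move the value out through `&mut dyn Any`.
    let mut conn = Some(conn);
    if let Some(a) = (&mut conn as &mut dyn Any).downcast_mut::<Option<ConnectorA>>() {
        return Backend::A(a.take().expect("is definitely Some"));
    }
    if let Some(b) = (&mut conn as &mut dyn Any).downcast_mut::<Option<ConnectorB>>() {
        return Backend::B(b.take().expect("is definitely Some"));
    }
    // Any other type is accepted by the compiler and only noticed here.
    Backend::Unknown
}

fn main() {
    println!("{:?}", pick_backend(ConnectorA(1)));
    println!("{:?}", pick_backend(ConnectorB("b".to_string())));
    println!("{:?}", pick_backend(42)); // compiles, yields `Unknown`
}
```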
<p>There are other ways to accomplish the same goal. Instead of an ad-hoc
list of supported types, this code could’ve used a <code>trait</code>. Such code
would look something like this:</p>
<pre tabindex="0"><code>pub trait TlsConfig {
    fn to_tls_backend(self) -> crate::tls::TlsBackend;
}
#[cfg(feature = "native-tls")]
impl TlsConfig for native_tls_crate::TlsConnector {
    fn to_tls_backend(self) -> crate::tls::TlsBackend {
        crate::tls::TlsBackend::BuiltNativeTls(self)
    }
}
#[cfg(feature = "__rustls")]
impl TlsConfig for rustls::ClientConfig {
    fn to_tls_backend(self) -> crate::tls::TlsBackend {
        crate::tls::TlsBackend::BuiltRustls(self)
    }
}
pub fn use_preconfigured_tls(mut self, tls: impl TlsConfig) -> ClientBuilder {
    self.config.tls = tls.to_tls_backend();
    self
}
</code></pre><p>This would allow the library to be used in the exact same way for valid
uses, but would still allow the compiler to catch invalid types. To be
sure, the <code>trait</code> and its <code>impl</code>s would have to be separated in the
code from the <code>use_preconfigured_tls</code> method, as you can’t put
a <code>trait</code> inside an <code>impl</code> block. But I think such an inconvenience
is worth the better type-safety.</p>
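<p>For comparison, here is a self-contained sketch of the trait-based approach with stand-in types. As before, <code>ConnectorA</code>, <code>ConnectorB</code>, <code>Backend</code>, <code>IntoBackend</code>, and <code>pick_backend</code> are names invented for the example:</p>

```rust
// Stand-in types for this sketch; none of these exist in `reqwest`.
#[derive(Debug, PartialEq)]
struct ConnectorA(u32);
#[derive(Debug, PartialEq)]
struct ConnectorB(String);

#[derive(Debug, PartialEq)]
enum Backend {
    A(ConnectorA),
    B(ConnectorB),
}

// The closed set of supported types is now spelled out as trait impls.
trait IntoBackend {
    fn into_backend(self) -> Backend;
}

impl IntoBackend for ConnectorA {
    fn into_backend(self) -> Backend {
        Backend::A(self)
    }
}

impl IntoBackend for ConnectorB {
    fn into_backend(self) -> Backend {
        Backend::B(self)
    }
}

fn pick_backend(conn: impl IntoBackend) -> Backend {
    conn.into_backend()
}

fn main() {
    println!("{:?}", pick_backend(ConnectorA(1)));
    println!("{:?}", pick_backend(ConnectorB("b".to_string())));
    // pick_backend(42); // rejected at compile time: `i32` is not `IntoBackend`
}
```

<p>There is no <code>Unknown</code> variant left to handle, because an unsupported type never gets past the compiler.</p>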
<p>My take-away here is to be wary of emulating features from other
programming languages, and also to be wary of <code>std::any</code>.</p>
<h1 id="addendumerrata">Addendum/Errata</h1>
<p>I was wrong about the existing code not providing a run-time error. It
sets an <code>enum</code> to <code>UnknownPreconfigured</code>, which then triggers a run-time
error elsewhere in a separate function. The article has been updated
accordingly.</p>
<p>The <code>trait</code> example code was also edited to reflect a version that actually
compiles, but not the final version in the MR.</p>
<p>I also edited the intro to clarify the relationship between function
overloading and traits.</p>
<p>The MR was ultimately <a href="https://github.com/seanmonstar/reqwest/pull/1553#issuecomment-1148059370">rejected</a> for reasons I deeply disagree with.</p>
Reviews and Reactions: 2022 Short Story Hugo Nominees https://www.thecodedmessage.com/posts/hugo-2022/ 2022-06-01T00:00:00+00:00
<p>We decided to write up our thoughts on each of the short stories nominated
for the 2022 Hugo awards. Of course, here be spoilers, spoilers galore. If
you don’t want these stories spoiled, go read them, and then come back
here.</p>
<p>This is the same concept as Jimmy’s <a href="https://www.thecodedmessage.com/posts/hugo-2021">review of the 2021
nominees</a>, and so we shall adapt the explanation from
that post:</p>
<p>As an exercise, we read each of these stories and told each
other what we thought the themes were, and I reference that throughout
these reflections. Themes, as we define them, are thematic statements:
the point the story is trying to make. Themes are distinct from thematic
concepts, in that they are complete sentences rather than just nouns.
They are distinct from premises, in that they are the take-away for
the real-world, not a statement about the world of the story. And, to be
clear, there can be more than one completely valid answer. Both of us would posit what we thought the theme was, answering independently
without consulting each other, and then we would discuss the story in
greater detail.</p>
<p>What follows are the tangible results of those discussions: reflections
about each story, somewhere between review and analysis. Each header is
also a link, because all of these stories are available to read online.
They are reviewed in descending ranked order according to Jimmy’s ranking,
and some overall discussion of ranking is reserved for the conclusions.</p>
<h1 id="mr-deathhttpsapex-magazinecomshort-fictionmr-death"><a href="https://apex-magazine.com/short-fiction/mr-death/">Mr. Death</a></h1>
<p>A trick ending, indeed. A relatively common trope, but unexpected here,
at least for us: In order to pass a test of morality, you have to refuse
an order, to not only do the right thing but do it in spite of what you
think will be horrible consequences to you. Can your conscience
survive dishonesty and manipulation?</p>
<p>It’s terrifying to see this trick done at the
“salvation/damnation” scale. It reminds Jimmy of <a href="https://www.smbc-comics.com/index.php?db=comics&id=1632">this
SMBC</a>. It sort
of calls into question the whole premise of “eternal damnation” and
“eternal punishment,” especially if the operators of these mechanisms
have values that disagree with ours, or are simply a result of arbitrary
but impersonal rules.</p>
<p>Given the twist at the end, it’s unclear what the rules of this story
are. How much wasn’t this reaper told? We understand why he was lied to
for the test, but now will he be given a more complete picture with
a new boss? Is he going to get more and more shocking revelations
every couple of eons? Unclear!</p>
<p>On the other hand, this story is a resounding endorsement of the theory
that you should always avoid doing something horrible, even if orders
compel you to do the horrible thing, even if it goes against the theory
you’ve been taught. We agreed that this was the theme. As Doug put it,
when the rules compel you to violate your conscience, violate the rules.
A good person’s conscience is usually the better guide than a good
rule book.</p>
<p>With life-or-death stakes such as these, this theme makes sense. There
is a balance, however. “Follow your conscience in all circumstances”
is bad advice when consciences are fallible, and sometimes the person
giving you the order is simply someone who knows more than you about
the situation. Who cannot say they held onto a stubborn but incorrect
rebellion against an authority as a kindergartner? Who truly has never
done it as an adult? Humility is actually a virtue. But Doug thinks that
this story’s power is that it has more faith in humanity’s ability
to intuit morality. In effect, the story is taking a powerful stand in
favor of <a href="https://www.qcc.cuny.edu/socialsciences/ppecorino/intro_text/Chapter%208%20Ethics/Utilitarianism.htm">act consequentialism versus rule
consequentialism</a>.
Doug is more inclined to support act consequentialism than Jimmy is.</p>
<p>This particular story could’ve gone a different way. Perhaps, by violating
the cosmic rule, all of time could have unravelled, or there could have
been a butterfly effect where someone else had to die as a result of
the protagonist’s decision. Such a story would have been written
by a different author who had less faith in humanity’s ability to
intuit morality and was a stronger proponent of a rule-based ethical
system. In such a story, the blame for the negative consequence would’ve
(in Jimmy’s eyes) definitely fallen on Raz for not explaining the stakes
and what would happen if the death was avoided. (Doug disagrees and thinks
that, in a truly rule-based system, the blame would still have been
on the protagonist. Raz would definitely be part of the causal chain,
though, and it would have behooved Raz to give a little more information.)
In the story as written, the narrator kept on asserting that you cannot
cheat death, without giving any evidence or specific reason. At the time,
this felt like there just wasn’t enough time to go into it for the story,
and it counted against the story. Now, it feels like foreshadowing,
and it is a strength of the story.</p>
<p>For Jimmy, while this story made him emotionally believe in the theme, and
while he greatly enjoyed this story and its subversion of the normal trope
of “don’t mess with forces greater than you, even to save a life, because
it could have even greater consequences,” he finds himself intellectually
not as convinced as he wants to be. As he thinks deeper about it, he
finds the questions brought up, and this story, somewhat unsettling.</p>
<p>Doug thinks that this is the story’s greatest strength, though.
This story forces the reader to confront a difficult moral question and
examine the consequences. Whenever a short story succeeds in making
the reader question an inherent moral belief, it deserves major kudos.
Go read this story.</p>
<h2 id="ranking">Ranking</h2>
<p>We both agreed that this was the best story, and so here
it comes ranked first, for interesting thought-provocation and quality
of writing with a twist at the end.</p>
<h1 id="where-oaken-hearts-do-gatherhttpswwwuncannymagazinecomarticlewhere-oaken-hearts-do-gather"><a href="https://www.uncannymagazine.com/article/where-oaken-hearts-do-gather/">Where Oaken Hearts Do Gather</a></h1>
<p>Jimmy (very much not Doug) had a lot of fun with this one! A satire of
Internet communities, where everyone jockeys to maintain their karma and
online reputations and fails to engage properly with the actual realities
of the situation at hand, where if they were paying more attention they
would realize that their more serious colleague was finding out how
truly important the thing actually was.</p>
<p>Themes include “People on the Internet are idiots more concerned with
their own reputation than the things they’re actually interested in,”
and also “there’s more to folklore than meets the eye” and “we can never
truly know the past.”</p>
<p>Because of its unusual structure, it was important for us to discuss the
plot so that we were on the same page about what happened; namely, that
a bunch of argumentative nerds are too busy trying to get Internet points
from each other to realize that the song being discussed is all-too-real
and another serious scholar is going to get his heart taken out.</p>
<h2 id="ranking-1">Ranking</h2>
<p>The juxtaposition of old folklore, scholarly academic discussion of
folklore, Internet arguing, and horror weaves a tight mesh that Jimmy
enjoyed greatly, as a fan of basically all those things. This won the
Nebula and, in Jimmy’s eyes, well deserved it; it’s a very close second
to Mr. Death in his mind. The form must have been incredibly difficult
to write: the opposite of lazy writing.</p>
<p>Doug, on the other hand, was a harsh critic of the story. It was actually
Doug’s least favorite story of the lot, and there were several Doug
really disliked this year. Doug’s biggest problem with the story was
that it seemed like it was just a Reddit conversation, with no character
development and a well-trodden plot (specifically, the bit about an old
folktale actually being real and youth not realizing it while someone
meets a ghastly fate). Sure, the way in which the story was told was
super unique, but that artifice could not cover up the tired plot in
his eyes.</p>
<h1 id="the-sin-of-americahttpswwwuncannymagazinecomarticlethe-sin-of-america"><a href="https://www.uncannymagazine.com/article/the-sin-of-america/">The Sin of America</a></h1>
<p>This is a new retelling of “<a href="https://www.newyorker.com/magazine/1948/06/26/the-lottery">The
Lottery</a>”,
but with different themes for a different America. That is to say, both
pieces are satires of American culture, but in the years that have passed,
American culture has changed a lot. This author seemed to think an updated
version was called for, and given the new story, Jimmy is convinced.</p>
<p>The <a href="https://en.wikipedia.org/wiki/The_Lottery">Wikipedia article</a> on
“The Lottery” mentions two themes in it (or did at the time of this writing):</p>
<ul>
<li><a href="https://en.wikipedia.org/wiki/The_Lottery#Scapegoating_and_mob_mentality">Scapegoating and mob mentality</a></li>
<li><a href="https://en.wikipedia.org/wiki/The_Lottery#Blind_tradition">Blind tradition</a></li>
</ul>
<p>This seemed off to Jimmy, and so he went and reread “The Lottery,”
and found his suspicions confirmed: There’s next to nothing in there
about scapegoating; perhaps that’s how this tradition originated, but it
now seems to be a thin memory. And even when there are some elements of
mob mentality, it’s not a mob of anger, but a mob of raw traditionalist
energy. It’s really all about the second theme, which isn’t surprising,
as a short story normally only can support one theme. (Doug thinks that
the Wikipedia article isn’t wholly off base, but will stay mum here
while Jimmy makes his point!)</p>
<p>Jimmy would characterize the original “The Lottery” as if the author
wanted to say this: “Wow, America is obsessed with tradition. Do you even
know why you do the things you do? Do you know how much you’ve actually
changed the tradition, from previous countries, from the past? Do you know
how silly this all is? If tradition told you to jump off a bridge, would
you? If it told you to murder your friends, would you? Actually, yes,
I think you would. Here, let me write what it would look like. Doesn’t
that seem just like you?”</p>
<p>Jimmy remembers “The Lottery” both resonating with him and not. He grew up
in a town and a church with enough old-fashioned American traditionalism
left that he recognizes the particular flavor of traditionalism that it’s
satirizing, but he also thinks a lot of America, after “The Lottery,”
became too suburbanized and too detached from a sense of community to have
the same type of traditionalism, that community continuity has become so
shattered and so obsolete as a value that if anything we need more of what
“The Lottery” satirizes right now, not less of it. But that’s “tradition,”
and “The Lottery” satirizes “blind tradition,” which is generally bad. He
also thinks that tradition should be maintained thoughtfully.</p>
<p>But that’s a discussion for “The Lottery” itself,
not this spin-off. (Hear, hear! says Doug.) This
spin-off, unlike “The Lottery,” is clearly actually about
<a href="https://www.smbc-comics.com/comic/2014-12-22">scapegoating</a>, the ancient
Biblical practice of putting the sins of the community onto a goat which
was then sent away or forced to “[e]scape”:</p>
<blockquote>
<p>But the goat, on which the lot fell to be the scapegoat, shall be
presented alive before the LORD, to make an atonement with him, and to
let him go for a scapegoat into the wilderness.</p>
<ul>
<li>Leviticus 16:10 (King James Version)</li>
</ul>
</blockquote>
<p>The continued repetition of the word “sins” – this is a much more flowery
piece than “The Lottery” – makes abundantly clear the religious element,
and reminds us of Jesus, the Christian scapegoat, who dies for
everyone’s sins in a manner whose mechanism is somewhat unclear,
with many theories.</p>
<p>The oddest theory we’ve seen for why Jesus had to die was not that he
was a ransom or bait for Satan or that punishment must be carried out to
fulfill a divine requirement for justice, but instead that punishment had
to be carried out to fulfill a human requirement for justice. This theory
is naturally repulsive to most – basically, the
theory was that humans need someone to blame, and God signs up for the
role – and this seemed way too pessimistic an outlook on human nature –
and way less cosmic an event than we understand the crucifixion to be.</p>
<p>But however heterodox such a theory might be, that theory, applied to
a randomly selected human instead of to an incarnation of God, is the
logic, we think, behind the sacrifice in this short story. Humanity needs
someone to blame. America, specifically, needs someone to blame.</p>
<p>Well, yes, we kind of do. We’ve been developing a “great villain”
culture for a while now. Jimmy says, every President is set up to
be vilified by the other party, and it’s been escalating: Bush feels
tame for liberals now compared to how liberals feel about Trump, and
conservatives are now basically cussing at Joe Biden with the “Let’s
Go Brandon” line. Meanwhile, Jeff Bezos and Elon Musk make all kinds of
negative press for their stunts as rich people.</p>
<p>Within the story, all of the news that was blamed, at first on the
previous lottery “winner” and then on our protagonist, is reflective of
recent news. They threaten to turn it off, but then they don’t, because
in this America, rather than focusing on our day-to-day problems – and
the characters in the story had many, many problems – we feel instead
like it’s more appropriate to blame the figures in the national news.</p>
<p>And this is because, according to the author, we as Americans feel
hopeless. Why try to get a better job? The ultra-rich and their government
cronies will prevent it anyway. Why try to buy a house? Capitalism has
prevented millennials from succeeding. If we’re not able to succeed
in our personal lives, then why not find somewhere else to focus our
attention and our passions? This story is just a vivid depiction of what
we’re already doing.</p>
<p>Doug found this story to be a bit heavy-handed, but it was a coherent
story with a clear point. It is very much The Lottery, updated to be told
by a liberal who has come to see America as more of a nation of problems
than the land of the free and the home of the brave. Doug worries that
this story reflects a belief by some in our country that America is no
longer a place worth saving. It is, and the hopelessness and anger felt
at our country by this story’s author portend something awful for
this country. Doug hopes the author is in a minority and that people
in this country can find a renewed sense of pride and optimism, finding
solutions to America’s many problems instead of giving up hope.</p>
<h1 id="proof-by-inductionhttpswwwuncannymagazinecomarticleproof-by-induction"><a href="https://www.uncannymagazine.com/article/proof-by-induction/">Proof by Induction</a></h1>
<p>This story starts with an assertion:</p>
<blockquote>
<p>The Coda cannot change in the way that a person can, however; it cannot
learn or grow. Your father’s soul is not in there. Your father has
moved on.</p>
</blockquote>
<p>It is put in the mouth of a Presbyterian minister, and so Jimmy’s
immediate instinct was to question it. (Doug barely noticed this
part of the story until after Jimmy brought it up as a focal point.)
The chaplain is obviously biased, trying to uphold her religious views,
trying to defend her traditional notion of an afterlife against an
upstart competitor. Jimmy hoped that perhaps this story would balance
her perspective against a different one and take sides.</p>
<p>Later, when we find out that the simulation restarts upon every entrance,
we found ourselves wondering if it’s perhaps been programmed to do so
to prevent people from taking it too seriously, as we see no particular
reason why it should work like that. Perhaps our protagonist can change
the programming, as it is clear to us that, in the real world, this
would be a programming choice and not a fundamental design constraint
of the Coda. (Of course, this is not the real world…)</p>
<p>As the protagonist tries to repair his emotional connection with his
father, we wondered if out of frustration he might hack the darn thing
to remove the restriction and receive some closure. It does seem like
he’s making some progress partway through the story, but it’s erased by
the plot contrivance.</p>
<p>Whatever this story is trying to say, it is held back by this contrivance.
Is it trying to say that immortality is impossible and anything that
pretends to it is a simulacrum? If so, then the weird unexplained
technical limitation gets in the way of that point. Is it trying to say
that even an afterlife wouldn’t help you fix your relationship with your
parent? That might be true, but again, the contrivance makes us (Jimmy
especially) feel like our protagonist hasn’t been given a fair shot.</p>
<p>Most of what Jimmy gets out of this story is “it would suck if someone
created a very realistic afterlife technology and put an arbitrary
limitation on it, because people would find it very frustrating.”</p>
<p>But also, why is the Presbyterian minister allowed to just proclaim things
about this, unquestioned, as though they had come from the narrator?
Why is anyone who isn’t extremely religious taking her approach? Why
isn’t everyone debating whether these Codas have rights? Why aren’t
they protesting in the streets? Why is the only person to consider the
intellectual implications a random-ass mathematician rather than the
writer of a think-piece from when it’s still in development? Is this
some sort of totalitarian Presbyterian dictatorship?</p>
<p>Science Fiction is supposed to propose hypotheticals and then explore
the consequences, or at least it’s supposed to come across that way to
the reader. The theme must flow from the premise logically. This seems
more like an attempt to make a point, and then contrive a hypothetical to
prove it, and it is so contrived that Jimmy is having a hard time discerning
what the point even is.</p>
<p>But we recognize that’s not how this story works. This premise was
tailor-made to demonstrate that sometimes, no matter how much we think
we’re making progress with another person, they’ll just revert to their
old ways the next time we see them. It’s as if the author was talking
about their actual parent, and saying that from their behavior, they might
as well be in a form of death, where the memory is lost each time they
see them and progress is impossible. This can be taken as a portrait
of that frustration, but due to the unbelievability of the premise,
it was difficult for Jimmy to take it that way.</p>
<p>In Doug’s eyes, the theme of this story was: “People won’t change
just because you want them to. Value people for who they are.”
Doug was not as harsh as Jimmy was on this story, but he thinks that
is because he generally tends toward “softer” sci-fi that cares less
about the reality of the underlying science or the technical elements.
Unlike most of the other stories in this lot, this story involved some
real character development, and a family relationship that felt super
real. Indeed, that’s why we mentioned that it felt like the author
was working through the author’s own family issues. In Doug’s eyes,
this was a strength of the story.</p>
<h2 id="ranking-2">Ranking</h2>
<p>Due to the extreme unbelievability of the premise, Jimmy ranks this lower
than he otherwise would have. Doug ranks it much higher, but Jimmy simply
refuses to accept that society would create an invention so powerful just
to use it mostly for finding documents and quick good-byes, and he thinks
the resets-every-time thing is a contrivance.</p>
<h1 id="unknown-numberhttpsnitteritazure_writingstatus1452324530182033413"><a href="https://nitter.it/Azure_Writing/status/1452324530182033413">Unknown Number</a></h1>
<p>First, a caveat: we are not trans and probably have gaps in our knowledge
about what it’s like to be trans. We are trying really hard to discuss
this story accurately, but are not entirely sure of our choice of
terminology or perspective. Please, send corrections if warranted!
We would like to learn more.</p>
<p>The premise would fit as a specific example of
the many lives that are touched by inter-universe
communication in Ted Chiang’s <a href="https://onezero.medium.com/anxiety-is-the-dizziness-of-freedom-b5ab45cae2a5">Anxiety is the Dizziness of
Freedom</a>,
one of Jimmy’s favorite science fiction novellas, and also a Hugo nominee for
Best Novella in 2020 (Doug hasn’t read it… Don’t get mad). Unlike that novella, which was a rather realist
take on “what would alternate-universe technology do in a real way to
society,” this story gives the alternate-universe communication technology
to exactly one person, so they can talk specifically about gender.</p>
<p>This serves as a window into what it’s like to be trans. Being trans
often involves a huge decision: Whether to transition, and whether to
change your public gender identity, your pronouns, etc. It is a high
risk/high reward decision, and so the alternate universe model is very
fitting for it. We both regularly imagine the alternate universes
created by alternate answers to our big decisions in life, and we wish
we could talk to those alternate versions of ourselves and see how that
had gone. So we imagine if we had gender dysphoria, we’d like to talk
to the alternate universes where we did or did not transition.</p>
<p>Well, in this story, only the version who didn’t transition reached out.
This means that transitioning <em>worked</em> and fixed the problem, which is,
we think, the theme, specific to the trans experience: If you are trans,
you should come out/transition. You won’t regret it.</p>
<p>And more generally: Big decisions are hard, and they do have an impact
on your life, but to be a coward is a decision in and of itself. Be
bold.</p>
<h2 id="ranking-3">Ranking</h2>
<p>We agreed that the text message format wasn’t that interesting, and
simply got the author out of having to write more detailed description.
Jimmy wouldn’t say it was lazy (Doug probably wouldn’t either), but we
do think the effort was put in to get a particular point across, not to
develop a rich world and story. Similarly, the characters, both versions
of the same person, are somewhat bare-bones, and the story itself gets a
little repetitive. We think it’s really good for a Twitter post trying
to convey a point about the trans experience and major life decisions
in general, but not rich or well-developed enough for this list.</p>
<h1 id="tangleshttpsmagicwizardscomenarticlesarchivemagic-storytangles-2021-09-03"><a href="https://magic.wizards.com/en/articles/archive/magic-story/tangles-2021-09-03">Tangles</a></h1>
<p>This is <em>Magic: The Gathering</em> fanfic, and we mutually know basically
nothing about <em>Magic: The Gathering</em>, so we will simply discuss this as
outsiders – which we literally are.</p>
<p>As outsiders, we found it very difficult to read. We both procrastinated
on reading it (Jimmy for almost two weeks, much to Doug’s chagrin). The aesthetic
and the world here do little for us, and we can’t tell what is novel to
the story and what is a reference that fans will get excited about,
which is disorienting and makes it difficult to enjoy.</p>
<p>The specific concept of a dryad needing a tree to survive just feels
like a metaphor for a toxic way of thinking about relationships, which
is immediately off-putting, so the premise immediately bothers us.</p>
<p>The plot, at its base, strikes us as somewhat better: Two people (using an
expansive definition of “people”) meet, both in their own life-or-death
level crisis. By cooperating and making “peace” between them (is the
word “peace” repeated so much for thematic or world-building reasons?), they
both manage to solve some of their problems, which they would have been
unable to solve separately. The theme, then, is “work together even in
emergencies,” which is a good moral lesson that many people need to hear.</p>
<h2 id="ranking-4">Ranking</h2>
<p>This seems written primarily for people who will be inordinately
excited by the concepts of dryads and by having a story set in a
<em>Magic: The Gathering</em> world, and we are very much not those people.
We also found it tedious to read, and none of the characters felt
like characters, so Jimmy leaves it last, and Doug next to last.</p>
<h1 id="final-ranking-comparison">Final Ranking Comparison</h1>
<h2 id="jimmy">Jimmy</h2>
<ol>
<li>Mr. Death</li>
<li>Where Oaken Hearts</li>
<li>Sin of America</li>
<li>Proof by Induction</li>
<li>Unknown Number (Twitter)</li>
<li>Tangles (MTG)</li>
</ol>
<h2 id="doug">Doug</h2>
<ol>
<li>Mr. Death</li>
<li><em>(after a huge dropoff)</em> Proof by Induction</li>
<li>Sin of America</li>
<li>Unknown Number (Twitter)</li>
<li>Tangles (MTG)</li>
<li>Where Oaken Hearts (but I had a lot of trouble ranking this last one)</li>
</ol>
<h1 id="conclusion-a-note-on-2022-vs-2021">Conclusion: A Note on 2022 vs 2021</h1>
<p>Given that we previously <a href="https://www.thecodedmessage.com/posts/hugo-2021/">reviewed the 2021 stories</a>,
let’s compare these as a set.</p>
<p>Overall, we both believed the 2022 nominees were all-around weaker
than the 2021 nominees. Last year’s set had several strong stories
(“Metal Like Blood in the Dark”, “Little Free Mermaid”, and “Open House on
Haunted Hill” all spring to mind). We could see how even the stories
we personally weren’t as crazy about in 2021 had strong merits and
would appeal to particular folks.</p>
<p>In comparison, the 2022 stories felt like a bit of a letdown. In Doug’s
eyes, “Mr. Death” is really the only story worth reading in this
whole lot, and while Doug likes “Mr. Death” more than any story in
last year’s set, that by itself can’t carry the day. Doug is also
concerned that the inclusion of some of the more atypical stories in this
year’s set (“Tangles” and “Unknown Number”) signals that the nominators
are too green and fanfic-y. And “Sin of America” (much like “Badass Moms”
from last year) seems included mostly because it appeals to a particular
political mentality.</p>
<p>Notably, the Nebula Award nominees did not include any of these three
stories this year (although they did nominate “Badass Moms” last year).
The three stories that were cross-nominated for the Hugo and Nebula were
Mr. Death, Where Oaken Hearts, and Proof by Induction, all of which do
seem like deserved nominees (even though Doug really disliked Where Oaken
Hearts and Jimmy really disliked Proof by Induction, we both recognize
that these respective stories were good, just designed for people who
care about different things in their stories). Here’s hoping for a
better lot for our next post!</p>
Netflix Should Become a Tech Company
https://www.thecodedmessage.com/posts/netflix-tech/
2022-05-27T00:00:00+00:00
<p>Netflix should become a tech company.</p>
<p>I hear the obvious response already: Jimmy, Netflix is already
a tech company!</p>
<p>Counterpoint: Is it though?</p>
<p>Somehow, after two dot-com booms, the markets still have an
aesthetic-based definition of what constitutes a “tech company”: If a
company – any company – has an expensive enough app, and if its founders
talk enough about “disrupting” industries, then it is a “tech company” and
is therefore entitled to a valuation completely disconnected from
its actual industry. Think WeWork – and think what happened to it as people
gradually realized it wasn’t an exciting tech start-up but rather a quite
boring real estate company. Turns out, you don’t need an expensive app
to run a coworking space.</p>
<p>A friend of mine pointed this out to me recently, claiming that the whole
concept of a tech company was a façade. WeWork was the obvious example,
but there are others: Uber and Lyft are taxi dispatchers, GrubHub (known
in NYC by its other brand, Seamless) is a take-out catalogue, and Netflix
is a premium channel. And Amazon’s more famous business (more on this
later) is to be a retailer: its competitors are Wal-Mart and Target,
or else mail-order catalogues.</p>
<p>Sure, all of these companies use phones and apps to do their thing
better, sometimes uselessly (like WeWork), sometimes “disruptively” so,
genuinely transforming the industry (like Amazon or Uber). But their
thing is something that people have done before them, and will continue
to do after them. At this point, doing something “with an app” should
be as surprising as doing that thing “over the phone” or “using writing.”</p>
<p>Another class of companies is harder to categorize. Facebook and
Twitter are doing things that would be impossible before the web, but
fundamentally are not about providing technology either. They manage and
organize content, and in so doing, get the ability to suggest sponsored
content, to – as Zuckerberg informed the Senate – sell ads. They are
content companies or web companies.</p>
<p>But, as I pointed out to my friend, this doesn’t mean there’s no such
thing as a tech company, whose job is to provide technical infrastructure
in the computing world. These tend to be older names: There’s IBM,
which makes mainframes and whose subsidiary Red Hat maintains a Linux
distribution. There’s Oracle, which licenses its database software
that underpins a shockingly large slice of our economy. There’s Microsoft,
which still maintains Windows even though it isn’t cool anymore, and
Excel, the most popular programming language in the world that isn’t
even branded as a programming language.</p>
<p>So now that we’ve established this dichotomy between true “tech
companies” and “companies that do their business with an app,” let’s
look at the corner cases:</p>
<p>Google in my mind straddles the line between content management – e.g.,
YouTube, Gmail, Search (I would argue) – and tech – e.g., Android and
Chrome – with the odd caveat that the tech is what they give away for
free and the content management is how they make their money.</p>
<p>Amazon, in my mind, is a more interesting corner case. Their famous
retail website, <a href="https://www.amazon.com/">amazon.com</a>, is, like Uber
or Airbnb, an example of a normal business, but with an app. However,
their app required so much technology that they have a major business in
providing some of that technology to others as a cloud company: AWS.
Whether or not “the cloud” is overhyped – and I think it really is
important – a cloud company is definitely a technology company.</p>
<p>And this leads me back to my title topic: Netflix.</p>
<p>Netflix has been having a rough couple of months. At the time of
this writing, it has lost 72% of its valuation in the past 6
months, which is a lot even for this recent bear market; in the
same time period, the NASDAQ index only lost 28.18%, and the S&P
500 only 13.84%. It lost subscribers for the first time in its
history, and for many, it’s clear that the <a href="https://en.wikipedia.org/wiki/Belshazzar%27s_feast">writing is on the
wall</a>.</p>
<p>And this corresponds to a frustration with Netflix as a content platform
that I’ve noticed anecdotally among my friends. It’s been a long time
since it’s been the go-to for streaming, when almost every show was either
on Netflix or not streamable, and most popular shows were on Netflix:
“What service is it on?” is now a more common question than “Is it on
Netflix?” And many people I know are questioning whether
Netflix, now one premium channel among many, is even worth keeping in
their streaming portfolio. Do we really still like its TV shows, or do
we just keep it on there out of loyalty to what it used to stand for?</p>
<p>And Netflix will soon force many people to make that decision, by
cracking down on password-sharing. Well, Netflix, you might not like
the results. You’re coasting right now, and you might be out of gas;
now might be a bad time to put on the brakes.</p>
<p>But oh, how the mighty have fallen! Netflix was a smashing success as
a company when streaming was a novelty. As the Internet developed
the necessary bandwidth – as Netflix helped force the Internet to
improve its infrastructure – streaming exploded. Most people pirated
for convenience rather than to save money, and streaming was even more
convenient without any of those pesky moral concerns.</p>
<p>At the time, the goal clearly was for Netflix to be the sole streaming
provider, licensing from traditional channels and movie producers,
and being the one subscription every household needed, simultaneously
creating and monopolizing the concept of streaming.</p>
<p>And it worked, for a while, but the traditional content providers were
not so easily displaced, and competition was not so easily avoided.
HBO was one of the early competitors, but now they are legion: Hulu,
Disney Plus, Amazon, Apple TV, YouTube TV…</p>
<p>The new streaming market was too big for Netflix to hold onto. When
streaming was a novelty used by a significant but not overwhelming number
of households, it made sense for content creators to work through Netflix
to reach that slice of customers. Now that almost everyone streams,
it’s not just a slice of the market anymore, and it makes more sense
for content creators to try and work around Netflix’s attempted monopoly
and make their own streaming service.</p>
<p>And that’s a shame. Don’t get me wrong, I have no love for monopolization.
But Netflix’s technology is simply better than all the other streaming
companies’. Every single other provider simply has a worse user experience.</p>
<p>If you’re a regular streaming user, you’ll have already noticed this,
since the Netflix apps work, and the others are merely workable enough.
Glitches, lags, buttons that don’t work right plague the other streaming
apps, whereas Netflix just works, especially on the web (SmartTV platforms
sometimes have other issues).</p>
<p>And not only has Netflix spent more money and done a better job at
polishing the user interface, they have worked really hard to <a href="https://openconnect.netflix.com/en/">collaborate
with ISPs</a> to store your videos as
close to you in the network as possible, speeding up loads, decreasing
lag, and increasing video quality. Other video streamers have not
caught up, and I fear they will never do as good a job as Netflix –
each one individually simply would not have the same bargaining power
with ISPs that Netflix once had, and “good-enough” tech will be the
standard at companies like Disney and HBO that never considered themselves
tech companies.</p>
<p>As a programmer, I’ve heard great things about their tech, and heard
it’s a great workplace for programmers (as opposed to content creators),
but if I worked there, it would make me sad to know that my
work, rather than improving everyone’s streaming experience, would
only improve the experience on one second-rate streaming channel.</p>
<p>What if – and hear me out now – what if Netflix licensed its technology
to other streaming providers? What if whenever you used the Disney Plus
app or the HBO app, Netflix code ran and cached your content on Netflix
colocated servers and played it in Netflix’s video player? I wouldn’t
want it to be every streaming provider, but enough that the quality
could go up. The OG Netflix could just be one premium channel
“by Netflix Technologies” among many. In fact, the company could
even split, so that potential clients don’t think that the original Netflix
channel would get preferential treatment.</p>
<p>It would take time to do this transition. Maybe the channel should keep
the original name for momentum, and the tech spin-off adopt a new name.
Maybe it can start talks with the other platforms immediately about
technology sharing. If I were in charge of another streaming platform,
I’d definitely want a slice of that tech.</p>
<p>And maybe it wouldn’t work. Maybe the other streaming providers think
their crappy streaming technology is “good enough.” This pivot would
be a risky move, but the current stock price calls for it.</p>
<p>I understand why Netflix hasn’t done this before now. Monopolizing
streaming seemed like a realistic goal for most of its history. But in
the end, it failed. It would be a shame for its excellent technology
to fail with it.</p>
<p>Disclosure: I hold no position in NFLX whatsoever, but maybe I should
get myself a short position, because I know they won’t do this. I just
really wish they would. Goodbye, Netflix! We’ll remember you fondly!</p>
Can you have too many programming language features?
https://www.thecodedmessage.com/posts/2022-05-11-programming---multiparadigm/
2022-05-11T00:00:00+00:00
<blockquote>
<p>There’s more than one way to do it.</p>
<ul>
<li><a href="https://en.wikipedia.org/wiki/There%27s_more_than_one_way_to_do_it">Perl motto</a></li>
</ul>
</blockquote>
<blockquote>
<p>There should be one– and preferably only one –obvious way to do it.</p>
<ul>
<li><a href="https://peps.python.org/pep-0020/">The Zen of Python</a> (inconsistent
formatting is part of the quote)</li>
</ul>
</blockquote>
<p>When it comes to statically-typed systems programming languages, C++ is
the Perl, and Rust is the Python. In this post, the next installment of
my <a href="https://www.thecodedmessage.com/tags/rust-vs-c++/">Rust vs C++</a> series, I will attempt to explain
why C++’s feature-set is problematic, and explain how Rust does better.</p>
<p>C++ fans brag that it is “multi-paradigm,” and it is. You can do
everything the C way, as C++ has a subset almost exactly identical to
C. You can use pointers and virtual functions and inheritance to create
all the classic OOP design patterns, as C++ is object-oriented. Or you
can use templates, and “static” or “compile-time” polymorphism, and
program that way.</p>
<p>At first glance, this all seems like an unmitigated good thing, because
it gives you, as a programmer, flexibility. You can express your code
in OOP style if that matches the problem at hand, or even if you
just like it better. If you need the performance of
templates, you can use them, and if you don’t (or you just find them
confusing), you can use run-time polymorphism instead. Or you
can just ignore all of it, and program in almost-plain C.
Flexibility is good: you can use the features you want, and not
use the features you don’t want. Even if a feature is downright
harmful, in your opinion, that’s easy enough to handle: Just don’t
use it.</p>
<p>And this is all very well and good if you’re programming a quick project
completely by yourself. But most code comes in long-lived projects, with
developers jumping in and out of the project all the time. In such an
environment, as Robert C. Martin puts it:</p>
<blockquote>
<p>“Indeed, the ratio of time spent reading versus writing is well over
10 to 1. We are constantly reading old code as part of the effort to
write new code. …[Therefore,] making it easy to read makes it easier
to write.”</p>
<ul>
<li>Robert C. Martin, <em>Clean Code: A Handbook of Agile Software Craftsmanship</em></li>
</ul>
</blockquote>
<p>(Sidenote: I will admit to knowing almost nothing about Robert C. Martin
besides this famous quote. I have no idea if the rest of his work is
as insightful as this quote, or not, and will probably try to find out
someday, but not today.)</p>
<p>Since programmers in general spend much more time reading code than
writing it, we very rarely actually get to reap the benefits of this
flexibility as writers. Much more often, as maintainers and readers,
we have to be flexible ourselves. We have to be ready to read code
in any style, in any paradigm, using any feature-set.</p>
<p>This is why Perl was commonly panned as a write-only programming language:
It had so many features that you could not be up to speed on all of them.
Each programmer at each point in time had a set that they used, but
no one could ever get proficient at working in the entire available
feature-set.</p>
<p>In Perl, the features were syntactic, so the programs would be unreadable
at a line-by-line level. In C++, the different features have more to do
with code organization: a lot of them are structural. That is harder to
make fun of, but I think more insidious.</p>
<p>Let me explain what I mean. Let’s say you’re a C++ maintenance programmer,
and you don’t like exceptions. You’re trying to maintain a program
that uses exceptions heavily, and add new features to it. Not only do
you have to be able to understand exceptions to read the code, you have
to write your own code so that it handles the exceptions where appropriate,
and so that it’s exception-safe. Even if you’re just using a third-party
library that throws exceptions, you have to understand exceptions to use
that library.</p>
<p>The entire programming language, with all the features, is part of
the necessary skill-set to program proficiently. Even if it is just
you writing your own project, you still will have to use libraries,
and the features involved with it. And even if it is just you, if the
project lives long enough, you will have to deal with your previous
decisions. Migrating from dynamic to static polymorphism in C++ is no
joke. Ask me how I know.</p>
<p>And of course, every feature has to be considered when writing advice.
Every best practices manual for C++ is written for C++, not a subset
of C++ features. The more things it’s possible for a future programmer
or future library writer to do, the more things you have to code
defensively against, the more things that have to be included
in best practices manuals, and finally the more things that a proficient
programmer has to stuff into their brain.</p>
<h1 id="specific-c-examples-rust-responses">Specific C++ Examples, Rust Responses</h1>
<p>But I’m also not trying to advocate for absolute minimalism. There may
be a cost to every feature, and it may be that no feature is optional,
but that doesn’t mean that we should have the bare minimum number of
features. Sometimes the cognitive and maintenance cost of a seemingly
extraneous feature is still worth it. Especially in a systems programming
context, different problems often do actually call for different
implementation strategies with different programming language features
to express them.</p>
<p>C++, however, does this poorly. I’m not even sure I’d claim that C++
has too many features; it’s more that the features are not consistent.
They clash with each other. Different feature-sets make assumptions
that are violated by other feature-sets. C++ is not designed with
the costs of extra features in mind, and as such, the features
cost more than they have to.</p>
<p>Let’s discuss a few specific ways in which C++’s features cause
problems and clash with each other. For each of these categories,
I then discuss how Rust handles the same topic, with a more
coherently-designed feature set.</p>
<h2 id="value-and-reference-semantics-slicing">Value and Reference Semantics: Slicing</h2>
<p>Slicing is a famous beginner error in C++: the semantics of
combining certain features are surprising and tend to break
invariants, but no diagnostic is issued, because the code is completely valid.
Perhaps unsurprisingly, this error comes from a mismatch between
two C++ features designed for two C++ programming styles.</p>
<p>Specifically, C++ has a distinction between value and reference semantics.</p>
<p>With value semantics, you can use operator overloading to make
your custom class look and act like a built-in type, supporting
operators like <code>+</code> and <code>+=</code>:</p>
<pre tabindex="0"><code>class Complex {
    double re = 0.0;
    double im = 0.0;
public:
    Complex &operator=(const Complex &other) {
        re = other.re;
        im = other.im;
        return *this;
    }
    Complex &operator+=(const Complex &other) {
        re += other.re;
        im += other.im;
        return *this;
    }
    Complex operator+(const Complex &other) const {
        Complex res = *this;
        res += other;
        return res;
    }
};
// Sample usage
Complex a, b;
a = b;
Complex c = a + b;
</code></pre><p>With reference semantics, you can use polymorphism to create
many different types of object that support the same interface.
You can then access these objects through pointers or references
to the base class.</p>
<pre tabindex="0"><code>#include <cmath>
#include <iostream>

class Complex {
protected:
    double re = 0.0;
    double im = 0.0;
public:
    virtual ~Complex() = default;
    virtual double getMagnitude() {
        return std::sqrt(re * re + im * im);
    }
};
class Quaternion : public Complex {
protected:
    double j = 0.0;
    double k = 0.0;
public:
    double getMagnitude() override {
        return std::sqrt(re * re + im * im + j * j + k * k);
    }
};
// Sample usage
void print_magnitude(Complex &c) {
    std::cout << c.getMagnitude() << std::endl;
}
Quaternion a;
Complex b;
print_magnitude(a); // calls Quaternion::getMagnitude
print_magnitude(b); // calls Complex::getMagnitude
</code></pre><p>However, these two programming techniques cannot be safely combined.
You cannot meaningfully assign a <code>Quaternion</code> value to a <code>Complex</code> object:</p>
<pre tabindex="0"><code>Quaternion a;
Complex b;
b = a; // Nonsensical
</code></pre><p>Why? Well, unlike in Java, <code>Complex b</code> actually allocates the space for a
<code>Complex</code> number as a local variable on the stack. This means that it
only has room for the two fields, <code>re</code> and <code>im</code>.</p>
<p>But, unfortunately, if you include all the methods from both examples,
that code will compile, and run, and <code>b</code> will have only <code>re</code> and <code>im</code>
from <code>a</code>. This is almost certainly not what you want, and may
in fact break invariants (e.g. your program might only be dealing
with values of magnitude 1, and this truncation would lower the
magnitude).</p>
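<p>Here is a minimal sketch of the slice in action. The constructors and the
<code>sliced_magnitude</code> helper are my additions for illustration, not part of the
classes above; everything else follows the same shape.</p>

```cpp
#include <cmath>

// Hypothetical stand-ins for the classes above, with constructors added
// so the demo has concrete values to work with.
class Complex {
protected:
    double re;
    double im;
public:
    Complex(double re, double im) : re(re), im(im) {}
    virtual ~Complex() = default;
    virtual double getMagnitude() const {
        return std::sqrt(re * re + im * im);
    }
};

class Quaternion : public Complex {
protected:
    double j;
    double k;
public:
    Quaternion(double re, double im, double j, double k)
        : Complex(re, im), j(j), k(k) {}
    double getMagnitude() const override {
        return std::sqrt(re * re + im * im + j * j + k * k);
    }
};

// Copies a Quaternion into a by-value Complex and reports the magnitude
// of the copy. The copy keeps only `re` and `im`; `j` and `k` are dropped,
// and the virtual call resolves to Complex::getMagnitude.
double sliced_magnitude(const Quaternion &q) {
    Complex b = q;  // compiles without any diagnostic: this is the slice
    return b.getMagnitude();
}
```

<p>For a unit quaternion whose four components are all 0.5, <code>getMagnitude()</code>
on the original gives 1, while <code>sliced_magnitude</code> gives the square root of 0.5,
roughly 0.707. The "magnitude is always 1" invariant is gone, and the compiler
never said a word.</p>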
<p>This comes from two alternative paradigms for objects: by value as
“primitive replacement,” where <code>Complex</code> can be used like an <code>int</code>, and
by reference with traditional OOP inheritance and polymorphism. These
paradigms don’t use different keywords, however. They can just all
be used in the same objects, causing this trouble.</p>
<p>Advice on how to prevent this includes rules like “give
all parent classes at least one pure virtual function,”
which would make <code>Complex b</code> as a by-value declaration
illegal. But if this rule is recommended in leading <a href="https://www.amazon.com/Effective-Specific-Improve-Programs-Designs/dp/0321334876">books on
C++</a>,
why isn’t it enforced in the programming language itself?</p>
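<p>One way to follow that advice, sketched under my own assumptions: introduce a
hypothetical abstract base (<code>Number</code> here, a name I made up) and make the
concrete classes siblings rather than parent and child. This is only one possible
arrangement, and it assumes C++17 for the <code>_v</code> type traits.</p>

```cpp
#include <cmath>
#include <type_traits>

// Hypothetical abstract base: the pure virtual function means no object of
// type Number can exist by value, so there is nothing to slice into.
class Number {
public:
    virtual ~Number() = default;
    virtual double getMagnitude() const = 0;
};

class Complex final : public Number {
    double re;
    double im;
public:
    Complex(double re, double im) : re(re), im(im) {}
    double getMagnitude() const override {
        return std::sqrt(re * re + im * im);
    }
};

class Quaternion final : public Number {
    double re, im, j, k;
public:
    Quaternion(double re, double im, double j, double k)
        : re(re), im(im), j(j), k(k) {}
    double getMagnitude() const override {
        return std::sqrt(re * re + im * im + j * j + k * k);
    }
};

static_assert(std::is_abstract_v<Number>, "Number cannot be instantiated");
// A Quaternion is no longer convertible to a Complex, so the slicing
// assignment from the text stops compiling.
static_assert(!std::is_convertible_v<Quaternion, Complex>,
              "no accidental Quaternion -> Complex conversion");

// Polymorphic use still works, but only through references or pointers.
double magnitude_of(const Number &n) {
    return n.getMagnitude();
}

// Number n = Quaternion(1, 0, 0, 0);  // error: Number is an abstract class
```

<p>The catch is that nothing forces anyone to structure their classes this way;
it remains a convention that each codebase has to adopt and police on its own.</p>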
<h3 id="how-rust-handles-this">How Rust Handles This</h3>
<p>C++’s slicing is caused by a conflict between two features, inheritance
and assignment. Rust handles both of those features
differently, so that they do not conflict.</p>
<p>So the most important difference here between Rust and C++ is that
Rust does not have implementation inheritance like C++ does. For
two given C++ concrete types, one of which surrounds the other,
there are two possible relationships between them: is-a, and has-a.
Rust only does has-a for concrete types.</p>
<p>C++ inheritance is a feature with many use cases, such as sharing
implementation, implementing policy, and implementing interfaces (what Rust
calls traits). Rust, rather than having one big broad feature,
instead implements individual features as appropriate. The closest
feature Rust has to inheritance is traits (including subtraits
and supertraits), but because traits are not concrete types, they
cannot be assigned, and so this issue is avoided.</p>
<p>But also in assignments, Rust implements a simpler feature that is
easier to reason about: Rust does not allow custom assignment operators.
Rust instead builds assignment out of two operations: move, and drop
(cf. C++ destructors). If drop is implemented correctly, assignment
will be correct too. If you want to copy instead of move, you have to explicitly
call a <code>clone()</code> method. And moves are <a href="https://www.thecodedmessage.com/posts/cpp-move/">not
customizable either</a>.</p>
<p>So, although Rust has some of the best parts of inheritance in traits,
and still allows assignment of custom types (but through customizing drop,
not assignment <em>per se</em>), it avoids this particular clash through
restricting the scope of those features.</p>
<h2 id="exceptions-and-exception-safety">Exceptions and “Exception Safety”</h2>
<p>It would be impossible to write a post criticizing C++ for
its problematic feature-clashes and not talk some
about exceptions.</p>
<p>Exceptions are another famous example of a C++ feature you simply can’t
“not use.”</p>
<p>Exceptions are viral by nature. If you call a function that might
throw and don’t catch all the exceptions that it throws – which might
be impossible to determine – then your function can throw as well.</p>
<p>And lots of functions can cause exceptions. Allocating memory indicates
failure via exception. Exceptions are the only way for constructors to
signal failure, and C++ idiom encourages constructors to be written in
such a way that success guarantees that the object is usable. The
programming language was clearly not designed to be used without
exceptions.</p>
<p>But exceptions are gnarly and confusing. I already know people will
comment on this post and say that if you write and structure C++ code
correctly, it will be exception-safe. And that’s almost trivially true,
since exception safety is part of correct C++ practice, but it’s not
easy and it doesn’t follow naturally from easy-to-learn principles,
which is why Herb Sutter, a huge name in C++, felt the need to write
<a href="https://www.amazon.com/Exceptional-Engineering-Programming-Problems-Solutions/dp/0201615622">two</a>
<a href="https://www.amazon.com/More-Exceptional-Engineering-Programming-Solutions/dp/020170434X">books</a>
about it. Of course, in practice, people just write exception-unsafe code,
all the time.</p>
<p>Every time you call a function – which can happen in C++ simply by
declaring a variable, or even by ending a scope (though destructors are
supposed to avoid throwing exceptions) – you have to worry about whether
that function throws an exception, and whether, if it does, you’re leaving
things in an inconsistent state. In C++, it is very common to
implement your own unsafe data structures, and exceptions are designed
to be recoverable at least some of the time. Lack of exception safety can mean
memory corruption or even exploitable security vulnerabilities.</p>
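<p>To make the "inconsistent state" concern concrete, here is a small sketch. All of
the names (<code>ScoreTable</code>, <code>parse_score</code>, and so on) are hypothetical,
and the copy-then-swap approach shown is just one common way to get the strong
exception guarantee, not the only one.</p>

```cpp
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical example: a table whose invariant is that `names` and
// `scores` always have the same length.
struct ScoreTable {
    std::vector<std::string> names;
    std::vector<int> scores;

    bool invariant_holds() const { return names.size() == scores.size(); }
};

// Stand-in for any operation that may throw partway through an update.
int parse_score(const std::string &text) {
    if (text.empty()) {
        throw std::invalid_argument("empty score");
    }
    return std::stoi(text);
}

// Not exception-safe: if parse_score throws, `names` has already grown
// and the table is left with mismatched lengths.
void add_unsafe(ScoreTable &table, const std::string &name,
                const std::string &score_text) {
    table.names.push_back(name);
    table.scores.push_back(parse_score(score_text));
}

// Exception-safe (strong guarantee): do everything that can throw on a
// copy, then commit with a non-throwing swap.
void add_safe(ScoreTable &table, const std::string &name,
              const std::string &score_text) {
    ScoreTable updated = table;
    updated.names.push_back(name);
    updated.scores.push_back(parse_score(score_text));
    std::swap(table, updated);
}

// Runs `add` with an input that makes parse_score throw, swallows the
// exception the way a recovering caller would, and reports whether the
// table is still consistent afterwards.
template <typename AddFn>
bool consistent_after_failure(AddFn add) {
    ScoreTable table;
    try {
        add(table, "alice", "");
    } catch (const std::invalid_argument &) {
        // The caller "recovers" and keeps using the table.
    }
    return table.invariant_holds();
}
```

<p>Both versions look equally reasonable at a glance, and both behave identically
when nothing throws. Only <code>add_safe</code> leaves the table consistent after a
failure, and nothing in the language points out the difference.</p>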
<p>No wonder a lot of codebases ban exceptions. Unfortunately, many shops
simply avoid using exceptions instead of banning them, leaving exceptions
possible. Also, code from “exception-free” codebases can then later
be mixed back in with regular C++, re-opening it to exception-safety
concerns.</p>
<p>The fact that exceptions are so controversial can lead to
confusion as well. Consider this function signature:</p>
<pre tabindex="0"><code>std::unique_ptr<DatabaseConnection> connect(const ConnectionParameters&);
</code></pre><p>How does this function indicate failure? From the signature, there are
two possibilities: It could either return <code>nullptr</code>, or it could throw
an exception. Hopefully the documentation would clarify – but again,
oftentimes, people don’t write documentation, especially for internal
APIs.</p>
<h3 id="how-rust-handles-this-1">How Rust Handles This</h3>
<h4 id="normal-error-handling-in-rust">Normal Error Handling in Rust</h4>
<p>For recoverable errors, Rust encodes them in the type. Rust’s
equivalent to <code>std::unique_ptr</code> – <code>Box</code> – is not nullable. If we
want to return one, but possibly also signal an error, we use
a sum type or what Rust calls an <code>enum</code>, and what C++ would call a
“tagged union” and make you implement by hand:</p>
<pre tabindex="0"><code>fn connect(param: &ConnectionParameters) ->
Result<Box<DatabaseConnection>, OurError>;
</code></pre><p>This means that it can return either a database connection or an error.
This is the convention to return any error condition that is recoverable,
which is half of what exceptions are used for in C++. Since <code>Box</code> is not
nullable, you have to say more than just <code>Box</code> to signal that it’s
possible to return an error, proving that you really mean it.</p>
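<p>For comparison, here is roughly what the same contract looks like when approximated
in C++17 with <code>std::variant</code>. The type definitions are hypothetical placeholders
for the ones named in the signature, and note that the <code>std::unique_ptr</code> inside
the variant is still nullable, so this documents intent without enforcing it the
way <code>Box</code> does.</p>

```cpp
#include <memory>
#include <string>
#include <variant>

// Hypothetical types standing in for the ones named in the text.
struct ConnectionParameters {
    std::string host;
};
struct DatabaseConnection {
    std::string host;
};
enum class OurError { EmptyHost };

// Either a connection or an error, with the alternative recorded in the
// variant itself rather than in documentation.
using ConnectResult =
    std::variant<std::unique_ptr<DatabaseConnection>, OurError>;

ConnectResult connect(const ConnectionParameters &params) {
    if (params.host.empty()) {
        return OurError::EmptyHost;
    }
    return std::make_unique<DatabaseConnection>(
        DatabaseConnection{params.host});
}

// The caller has to ask which alternative it received before using it.
bool connected(const ConnectResult &result) {
    return std::holds_alternative<std::unique_ptr<DatabaseConnection>>(result);
}
```

<p>This works, but it is a convention layered on top of the language rather than
the default, so a given C++ API may or may not use it.</p>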
<p>For unrecoverable exceptions – for situations like logic and programming
errors that the program has caught – Rust has panics, which work
much more like C++ exceptions in practice.</p>
<h4 id="panic-safety">Panic Safety</h4>
<p>Rust aficionados will know that Rust has not escaped exception
safety, having instead an analogous notion of “panic safety.” How,
then, can I criticize C++ so boldly?</p>
<p>There are two notable differences between C++ exceptions and Rust panics.
The first is that Rust panics are used primarily for unrecoverable
errors: errors indicating that the programmer’s assumptions
were violated, whether through a bug, a misunderstanding on the
programmer’s part, or a circumstance that the program cannot
recover from. For recoverable errors, Rust by convention uses a different mechanism,
<code>Result</code>s. So most Rust code doesn’t have to
care about maintaining invariants in the face of panics, because most
Rust code can presume that if it panics, that’s the end. This is better
scoping for the panic feature, as opposed to exceptions.</p>
<p>But the fact remains that panics can be recovered from, and do still
do stack unwinding and destructor/drop calls, and safety issues can
still exist. Panics in Rust can cause memory corruption – in unsafe
code. And that’s where panic safety really still matters: in unsafe
code only. By cordoning off the implementations of sophisticated
data structures that require <code>unsafe</code>, Rust also cordons off who
has to worry about panic safety.</p>
<p>In C++, every function that calls another function has to be
written in an exception-safe way. In Rust, it’s really only unsafe
code that has to worry about it. This, in my mind, is a huge win,
and it comes from both better scoping of panics, and better management
of the situations where panics can break things.</p>
<h2 id="c-style-vs-c-style-pointers-and-arrays">C-style vs C++-style Pointers and Arrays</h2>
<p>There is a subset of C++ that is almost identical to C, and C++
must maintain compatibility with this subset for tradition’s sake.
It also must maintain compatibility with previous versions of itself.
Between the C and the C++, the concepts contained in C++20 stretch from
1972 to 2020, almost 50 years of active change in programming language
technology. This leads to features being duplicated, but differently,
and in ways that unfortunately clash with each other.</p>
<p>For example: How do you express indirection? How do you alias a
value? There are three different ways to do it, and rather than
breaking down by use case, the biggest difference between them
is era:</p>
<ul>
<li>Pointers, from the original C</li>
<li>References, a newer innovation that attempts to solve some of
the issues with pointers</li>
<li>Smart pointers, an even newer innovation that attempts to cover
some of the remaining use cases. For pointers into arrays, iterators
also cover a lot of the same territory as smart pointers, and can
be lumped together for this conversation.</li>
</ul>
<p>These overlap a lot, and there is no single principle that will
tell you when to use which. You can invent some rules, and come
up with some principled reasonings for them, but your colleagues
won’t necessarily listen, and external libraries and other codebases
you have to interact with certainly won’t, not even the standard
library, not even the programming language itself. Fundamentally,
the difference is era.</p>
<p>Nullability? Part of original pointers. Later, we learned it was harmful
and got rid of it in references, but due to issues with how C++ does
<a href="https://www.thecodedmessage.com/posts/cpp-move/">move semantics</a> it comes back with a vengeance for
smart pointers. (Of course, you still <em>can</em> make a null reference,
it’s just undefined behavior. Ah well.)</p>
<p>Pointers and references have special syntax, whereas smart pointers,
because they came from a later era, use the more standard <code>ptr_type<T></code>
syntax. Pointers and smart pointers can be used to manage ownership,
and references should not be.</p>
<p>How should out parameters be expressed? It’s easy to say they should
be expressed with references, because otherwise they’re nullable,
and you have to worry about whether to check for nulls or not. On
the other hand, expressing out parameters with references means you
can’t tell at the caller whether it’s an out parameter, only
at the callee:</p>
<pre tabindex="0"><code>int foo_ptr(int in, int *out);
int foo_ref(int in, int &out);
int out;
foo_ptr(3, &out);
foo_ref(4, out); // Surprise, this changes `out`! Can't tell, though!
foo_ptr(5, nullptr); // Does this crash? Does this work? Who knows!
// Read the docs, I guess *shrug* hope there are docs
</code></pre><p>References should be used, in my practice and in the practice of many
people I respect, in every case where the reference is not owning,
will not be used for arithmetic, and is not optional. Of course, <code>this</code>
meets all of those requirements, but is a pointer, not a reference
(but a special pointer, where being null is undefined behavior, like a
reference), simply because references were invented after <code>this</code> was,
and for no stronger reason.</p>
<p>Similarly, my practice dictated that <code>std::unique_ptr</code> should be
used for owning pointers. It’s nullable, but at least it auto-frees,
and so you should use it everywhere you’re conveying ownership. And
then, <code>Foo *</code> can be used when you want an optional non-owning
reference. But old APIs and APIs from C exist all over the place
that will use <code>Foo *</code> invariably, and some will use <code>Foo *</code> for
out parameters because of the caller readability issue, or because
of <a href="https://www.youtube.com/watch?v=rHIkrotSwcc">concerns</a> about
<code>std::unique_ptr</code>, or simply out of old habit, meaning you can’t count
on this convention actually being upheld, not at all.</p>
<p>And of course, converting between these different representations is
sometimes as easy as <code>&</code> or <code>*</code>, and sometimes as difficult as having
<code>&</code> and <code>*</code> compile and seem to work but result in memory corruption,
and everywhere in between.</p>
<p>Similarly, <code>T foo[N]</code> and <code>std::array<T, N> foo</code> are different ways of
writing the same basic thing. It gets weird when <code>N = 0</code>, of course;
this is only supported by <code>std::array</code>. And (on compilers that support
it at all), having <code>N</code> be dynamic on the stack is only supported by <code>T foo[N]</code>. And of course, <code>new T[N]</code> returns a raw pointer to <code>T</code> whereas
<code>new std::array<T, N></code> returns a pointer to a <code>std::array</code>, which makes
much more sense.</p>
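<p>A few of these differences can be checked directly by the compiler. The sketch
below assumes C++17; the function names are made up for illustration.</p>

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>
#include <type_traits>

// `new T[N]` forgets the length: all the caller gets back is a `T*`.
static_assert(std::is_same_v<decltype(new int[3]), int *>,
              "array new yields a plain element pointer");
// `new std::array<T, N>` keeps the length in the pointer's type.
static_assert(
    std::is_same_v<decltype(new std::array<int, 3>), std::array<int, 3> *>,
    "the array type survives allocation");
// Zero-length arrays are valid for std::array.
static_assert(std::array<int, 0>{}.size() == 0, "empty std::array is allowed");

// A C-style array parameter is silently adjusted to a pointer, so the
// declared length of 3 is not part of the function's type at all.
void takes_c_array(int values[3]);
static_assert(std::is_same_v<decltype(&takes_c_array), void (*)(int *)>,
              "the [3] is decoration only");

// std::array offers a checked accessor; a C-style array has no equivalent.
bool checked_access_throws(std::size_t index) {
    std::array<int, 3> values{1, 2, 3};
    try {
        (void)values.at(index);
        return false;
    } catch (const std::out_of_range &) {
        return true;
    }
}
```

<p>The <code>decltype</code> expressions are unevaluated, so nothing is actually
allocated; they only ask the compiler what type each <code>new</code> expression would
produce.</p>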
<p>So, basically, <code>T foo[N]</code> should be completely deprecated, but keeps on
being used even by new C++ programmers because it looks like it
should be the normal way to write an array, and because it looks like
the arrays from C. But they’re completely different types – one isn’t
syntactic sugar for the other.</p>
<p>This gets unwieldy, because the ways with the syntactic sugar (like
<code>new</code> and <code>T*</code> instead of <code>std::make_unique</code> and <code>std::unique_ptr</code>)
are the old, more C-style ways, the ways that yield more memory leaks
(you have to explicitly free or delete a <code>T*</code>) and memory corruption (<code>T foo[]</code> doesn’t even have a safe indexing operation, or proper iterators).</p>
<p>And of course, even if you use the more modern formulations to save
on cognitive load because they’re more consistent with the rest of
the programming language (where <code>std::unique_ptr</code> does RAII unlike
traditional pointers (spelled <code>*</code>) and <code>std::array</code> implements the
expected collections methods unlike traditional arrays (<code>[]</code>)), you still
have to understand the traditional pointers and arrays completely to call
yourself a C++ programmer. Due to C interop, people not changing their
ways, and old resources, lots of new code is still written with them,
and there are still situations where they’re unavoidable, like pointer
arithmetic or <code>this</code>.</p>
<p>Besides, even if you do correctly discern that a <code>T*</code> must be freed,
how do you free it – <code>free</code>, <code>delete</code>, or <code>delete[]</code>? Choose wisely,
because the consequences of mixing <code>malloc</code> and <code>delete</code> can go beyond
whether destructors are called, and lead to undefined behavior and
general memory corruption. The documentation (or lack thereof), however,
might just assume you know which one to call.</p>
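<p>One mitigation, sketched here with hypothetical names, is to record the matching
deallocation function in the type, so that the question never reaches the caller.
This only helps for code you control; a raw <code>T*</code> handed to you by someone
else still carries no such information.</p>

```cpp
#include <cstdlib>
#include <cstring>
#include <memory>

// Hypothetical deleter for memory that came from malloc (for example,
// from a C library): it must be released with free, never delete.
struct FreeDeleter {
    void operator()(void *ptr) const noexcept { std::free(ptr); }
};
using MallocedString = std::unique_ptr<char, FreeDeleter>;

// Wraps a malloc-allocated copy of `text`; the return type records that
// free is the matching deallocation function. Holds null if malloc fails.
MallocedString copy_with_malloc(const char *text) {
    const std::size_t size = std::strlen(text) + 1;
    char *raw = static_cast<char *>(std::malloc(size));
    if (raw != nullptr) {
        std::memcpy(raw, text, size);
    }
    return MallocedString(raw);
}

// Memory from new[] pairs with delete[]; unique_ptr<T[]> picks it
// automatically.
std::unique_ptr<int[]> make_counts(std::size_t n) {
    return std::make_unique<int[]>(n);  // elements are value-initialized to 0
}

// Memory from plain new pairs with delete, the unique_ptr default.
std::unique_ptr<int> make_count() {
    return std::make_unique<int>(0);
}
```

<p>Three allocation styles, three different smart pointer types, and each one frees
itself correctly. With three identical-looking raw pointers, the reader is back to
guessing.</p>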
<h3 id="how-rust-handles-this-2">How Rust Handles This</h3>
<p>Rust also has references and various types of smart pointers and
iterators. It also has raw pointers, from which smart pointers
can be implemented. So in terms of the range of features, it’s
actually the same as C++. What’s the difference then?</p>
<p>Well, in Rust, the difference is that they don’t overlap in the
same way. Each feature has its own purpose, unlike in C++ where it’s
anyone’s bet whether references or pointers are used for aliasing or
pass-by-reference, or whether raw pointers or smart pointers are used
to express ownership. Nullability is mostly a separate concern from
day-to-day use of Rust’s types, and so it is implemented orthogonally
through <code>Option</code> and <code>Result</code>, rather than being available in some types
but not in others haphazardly.</p>
<p>References are for everyday aliasing and
pass-by-reference. They are not nullable. They represent
the primitive concept of aliasing and pass-by-reference, and
they are the only feature that does so. Unlike C++ references,
you must use the <code>&</code> operator to create a Rust reference, making
them explicit on the caller.</p>
<p>Smart pointers are, for the most part, also not nullable, possibly
partially because Rust has <a href="https://www.thecodedmessage.com/posts/cpp-move/">destructive moves</a>.
They represent ownership semantics – whether “unique” ownership (<code>Box</code>),
shared ownership (<code>Rc</code> or <code>Arc</code>), or locking (<code>Mutex</code> or <code>RefCell</code>).</p>
<p>Raw pointers in Rust are very special – they are for implementing
smart pointers or other low-level data structures. They are for
situations where the structure of memory and the concept of a pointer
is actually key to the situation. They are kept within these narrow
bounds, and outside of everyday application programming, by having
most of their features considered <code>unsafe</code>.</p>
<p>If only that could be done for raw pointers in C++! But there is too
much momentum behind the C++ raw pointer.</p>
<h2 id="dynamic-vs-static-polymorphism">Dynamic vs Static Polymorphism</h2>
<p>This is the most intense one, and could be a blog post all
on its own – and probably I’ll write it one day.</p>
<p>In response to comments, I’m going to add a caveat here even though
I address it later: In this section, I’m discussing the status of C++
pre-concepts, from C++17 and earlier, because that is the form of C++
that most people are still using, and that the vast majority of code is
still written in. It is too early to tell how much concepts will help,
but because they are an optional feature, I’m not at all optimistic.</p>
<p>We have two forms of polymorphism in C++, two very different
systems. One is a Turing-complete macro system that comprises
overloads, templates, and template metaprogramming. The other
is an object-oriented style system of polymorphism through
inheritance.</p>
<p>They were designed with different purposes in mind, and considering
their original purpose, it’s clear to see why they must have
different implementations.</p>
<p>Templates were designed for collections and algorithms, for being
able to write a vector or linked list that could contain any arbitrary
type, without resorting to a C-style <code>void*</code> that would require both
indirection and type erasure. The lack of indirection is the point –
at least it was for C++ – and so as a consequence templates had to be
carried out statically.</p>
<p>Dynamic polymorphism, on the other hand, was designed for OOP design
patterns. As such, in line with OOP principles, it supports heterogeneous
containers, especially necessary to support OOP’s core use case of GUI
programming.</p>
<p>But in spite of this deep contrast between static and dynamic, they
overlap in use case. For example, Smalltalk, Objective-C, and
Java (pre Java 5) all show us that you can use dynamic polymorphism
to implement generic containers. If C++ had been less performance-centric,
and could tolerate the indirection, it could have used a similar strategy,
the (old school) Java approach to generic containers without generics
or templates:</p>
<ol>
<li>
<p>Make all classes inherit from a universal base class, <code>Object</code>.
This way, <code>Object *</code> (just <code>Object</code> in Java) can refer to any object.
Make sure, for C++, that this has a virtual destructor, so you can
delete any object through its <code>Object*</code> handle.</p>
</li>
<li>
<p>Write “boxed versions” of all primitive types, classes that
extend <code>Object</code> to correspond to <code>int</code> and <code>double</code>, etc.</p>
</li>
<li>
<p>Write all collection classes (<code>std::vector</code>, <code>std::list</code>) in terms
of <code>Object *</code>, writing <code>Object *</code> instead of <code>T</code>.</p>
</li>
<li>
<p>Use RTTI and <code>dynamic_cast</code> (or in Java terms, casts) to allow
the user to get whatever object type they want out of them.</p>
</li>
</ol>
<p>Voilà! You can now store anything in your collections without need
for generics or templates, using <code>dynamic_cast</code>, an obscure feature of
the OOP-style dynamic polymorphism that C++ has. And this system is in
fact still the basis of Java generics, and so we can project that C++
would have used something similar if performance weren’t a concern and
indirections and RTTI were acceptable.</p>
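<p>The four steps above can be sketched in a few dozen lines of C++. Every name
here is hypothetical, and I use <code>std::unique_ptr&lt;Object&gt;</code> in place of an
owning <code>Object *</code> purely to keep the example leak-free.</p>

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Step 1: a universal base class with a virtual destructor.
struct Object {
    virtual ~Object() = default;
};

// Step 2: "boxed" versions of primitive types.
struct Integer : Object {
    int value;
    explicit Integer(int value) : value(value) {}
};
struct Text : Object {
    std::string value;
    explicit Text(std::string value) : value(std::move(value)) {}
};

// Step 3: a collection written once, in terms of Object pointers.
class ObjectList {
    std::vector<std::unique_ptr<Object>> items;
public:
    void add(std::unique_ptr<Object> item) { items.push_back(std::move(item)); }
    Object *get(std::size_t index) const { return items.at(index).get(); }
};

// Step 4: the user recovers the concrete type with RTTI. dynamic_cast
// yields nullptr if the element is not actually an Integer.
const Integer *as_integer(const Object *object) {
    return dynamic_cast<const Integer *>(object);
}

// Builds a heterogeneous list: an Integer at index 0, a Text at index 1.
ObjectList sample_list() {
    ObjectList list;
    list.add(std::make_unique<Integer>(42));
    list.add(std::make_unique<Text>("hello"));
    return list;
}
```

<p>Note what this costs: every element is a separate heap allocation reached through
a pointer, and a wrong guess about the element type is only discovered at run time,
when <code>as_integer</code> returns null.</p>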
<p>So that shows the overlap between templates and runtime polymorphism
in a theoretical sense, but do these very differently implemented
features in fact overlap in practice?</p>
<p>I’ve seen skepticism. I once interviewed
candidates for a job, and I asked each of them to explain to me the
similarities and differences between dynamic polymorphism and
templates. One candidate said there was no overlap; templates
were for generic programming (e.g. collections, algorithms, and the STL),
and dynamic polymorphism was for object-oriented programming.</p>
<p>But they do overlap in practice. I know, because I spent a lot of
time transitioning object-oriented dynamic code into static form,
and teaching the static equivalents to dynamic polymorphism patterns.
It wasn’t easy, because even though the overlap is huge, the
semantics are vastly different.</p>
<p>Let me give an example. Let’s start with one of my favorite patterns:
the policy pattern. Let’s imagine we have a function that sends messages
in a way that can fail, and let’s also imagine that we have a policy
that indicates how we should delay and retry sending this message. I’ll
start out writing it the object-oriented way, something like this:</p>
<pre tabindex="0"><code>struct RetryPolicy {
virtual bool should_retry(mesg_send_err_t error_code) = 0;
virtual uint32_t delay_microseconds() = 0;
};
mesg_send_err_t retry_send_message(Message &mesg, RetryPolicy &policy) {
while (true) {
auto err = send_message_once(mesg);
if (err == mesg_send_err_t::SUCCESS) {
return mesg_send_err_t::SUCCESS;
} else if (!policy.should_retry(err)) {
return err;
} else {
usleep(policy.delay_microseconds());
}
}
}
</code></pre><p>The policy can then do things like “retry 5 times, waiting 0.01 seconds
between each retry” or “exponential back-off, so that each retry waits
twice as long as the previous.” It can also deem certain errors as
fatal, but others as worth sleeping and retrying for. Here’s
an example of using this interface:</p>
<pre tabindex="0"><code>struct WaitOneSecondAndTryFiveTimes : RetryPolicy {
int retry_count = 0;
bool should_retry(mesg_send_err_t error_code) override {
if (error_code == mesg_send_err_t::MALFORMED_MESG) {
return false;
}
retry_count++;
if (retry_count == 5) {
return false; // do not retry
}
return true; // do retry
}
uint32_t delay_microseconds() override {
return 1000000;
}
};
WaitOneSecondAndTryFiveTimes policy;
auto err = retry_send_message(mesg, policy);
</code></pre><p>Now, it turns out we can do this exact same pattern with static polymorphism.
The callee code now looks like this:</p>
<pre tabindex="0"><code>template <typename T>
mesg_send_err_t retry_send_message(Message &mesg, T policy) {
while (true) {
auto err = send_message_once(mesg);
if (err == mesg_send_err_t::SUCCESS) {
return mesg_send_err_t::SUCCESS;
} else if (!policy.should_retry(err)) {
return err;
} else {
usleep(policy.delay_microseconds());
}
}
}
</code></pre><p>This is no longer a function. It is a function template, which is effectively
a kind of macro. Its implementation must now move from the <code>.cpp</code> file to the
<code>.h</code> or <code>.hpp</code> file, for reasons that only make sense if you think about
how the programming language is implemented: the compiler needs to see the
template’s body wherever it is instantiated.</p>
<p>No longer is the policy interface spelled out separately. The only
thing the function signature says about the type of <code>policy</code> is
that it is <code>T</code> – which can be any type. Only in the implementation,
in the body, do we see that <code>should_retry()</code> and <code>delay_microseconds()</code>
must be implemented on it. This is an implicit interface, defined by
usage, very similar to Python and Ruby’s <a href="https://en.wikipedia.org/wiki/Duck_typing">duck
typing</a>. More importantly,
it is completely unrelated to the OOP-style explicit interface
using inheritance and virtual functions.</p>
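<p>To make the duck typing concrete, here is a small sketch of a policy type that has
no base class at all. Everything in it is an assumption made for illustration:
<code>ExponentialBackoff</code> and <code>total_delay_microseconds</code> are hypothetical names, and the
<code>mesg_send_err_t</code> enum is a stand-in with just enough values to compile. The helper
template is written in the same style as <code>retry_send_message</code>, but adds up delays
instead of sleeping:</p>

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the error type used in the examples.
enum class mesg_send_err_t { SUCCESS, TIMEOUT, MALFORMED_MESG };

// A generic helper that only assumes T has should_retry() and
// delay_microseconds(). It totals how long the policy would wait
// if every attempt failed with `err`.
template <typename T>
uint64_t total_delay_microseconds(T policy, mesg_send_err_t err) {
    uint64_t total = 0;
    while (policy.should_retry(err)) {
        total += policy.delay_microseconds();
    }
    return total;
}

// No base class, no `virtual`, no `override`: this type satisfies the
// implicit interface purely by having the right member functions.
struct ExponentialBackoff {
    uint32_t next_delay = 10000; // 0.01 seconds
    int retries_left = 3;

    bool should_retry(mesg_send_err_t error_code) {
        if (error_code == mesg_send_err_t::MALFORMED_MESG) {
            return false; // fatal, never retry
        }
        if (retries_left == 0) {
            return false;
        }
        --retries_left;
        return true;
    }

    uint32_t delay_microseconds() {
        uint32_t delay = next_delay;
        next_delay *= 2; // each retry waits twice as long
        return delay;
    }
};

inline void example_usage() {
    // Three retries: 10000 + 20000 + 40000 microseconds.
    assert(total_delay_microseconds(ExponentialBackoff{}, mesg_send_err_t::TIMEOUT) == 70000);
    // A malformed message is never retried.
    assert(total_delay_microseconds(ExponentialBackoff{}, mesg_send_err_t::MALFORMED_MESG) == 0);
}
```

<p>Nothing in the declaration of <code>ExponentialBackoff</code> says which templates it is meant
to work with; the only link is that the names and call shapes happen to match.</p>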
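<p>The errors are completely different, because the rules are completely
different. Suppose we forget to implement <code>delay_microseconds()</code> in
<code>WaitOneSecondAndTryFiveTimes</code>. With the OOP version, you get:</p>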
<pre tabindex="0"><code>test.cpp:57:34: error: variable type 'WaitOneSecondAndTryFiveTimes' is an abstract class
WaitOneSecondAndTryFiveTimes policy;
^
test.cpp:22:22: note: unimplemented pure virtual method 'delay_microseconds' in 'WaitOneSecondAndTryFiveTimes'
virtual uint32_t delay_microseconds() = 0;
^
1 error generated.
</code></pre><p>With the template version, you get:</p>
<pre tabindex="0"><code>test.cpp:34:27: error: no member named 'delay_microseconds' in 'WaitOneSecondAndTryFiveTimes'
usleep(policy.delay_microseconds());
~~~~~~ ^
test.cpp:59:16: note: in instantiation of function template specialization 'retry_send_message<WaitOneSecondAndTryFiveTimes>' requested here
auto err = retry_send_message(mesg, policy);
^
1 error generated.
</code></pre><p>The ad-hoc nature of template requirements should not be underestimated.
It means that objects that are designed to work with a whole library might
only work with the exact combinations of functions they’ve been used
with so far. It means that documentation, if it wants to be rigorous,
must do the work of defining the protocols itself of every argument
taken by every function. It means that it’s not clear when you’re
putting new requirements on arguments to a function, as there is no
warning and no clear red-line step to tell you that you’re breaking
backwards-compatibility.</p>
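<p>One way to claw back some explicitness, sketched here under the assumption of a
C++17 compiler, is to write the protocol down by hand as a type trait and check it
with <code>static_assert</code>. The names <code>is_retry_policy</code>, <code>first_delay_or_zero</code>,
<code>FixedDelay</code> and <code>MissingDelay</code> are hypothetical, and the <code>mesg_send_err_t</code> enum
is a stand-in. Note that this sketch only checks that the calls are well-formed,
not what they return:</p>

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the error type used in the examples.
enum class mesg_send_err_t { SUCCESS, TIMEOUT, MALFORMED_MESG };

// Primary template: by default, T is not considered a retry policy.
template <typename T, typename = void>
struct is_retry_policy : std::false_type {};

// Chosen only when both member calls are well-formed for T.
template <typename T>
struct is_retry_policy<
    T,
    std::void_t<decltype(std::declval<T &>().should_retry(
                    std::declval<mesg_send_err_t>())),
                decltype(std::declval<T &>().delay_microseconds())>>
    : std::true_type {};

// A generic function that states its requirement up front, so a bad
// argument fails at the static_assert with a readable message.
template <typename T>
uint32_t first_delay_or_zero(T policy, mesg_send_err_t err) {
    static_assert(is_retry_policy<T>::value,
                  "T must provide should_retry(mesg_send_err_t) and "
                  "delay_microseconds()");
    return policy.should_retry(err) ? policy.delay_microseconds() : 0;
}

struct FixedDelay {
    bool should_retry(mesg_send_err_t err) {
        return err != mesg_send_err_t::MALFORMED_MESG;
    }
    uint32_t delay_microseconds() { return 1000000; }
};

// Forgets delay_microseconds(), so it does not satisfy the protocol.
struct MissingDelay {
    bool should_retry(mesg_send_err_t) { return true; }
};

static_assert(is_retry_policy<FixedDelay>::value, "FixedDelay is a policy");
static_assert(!is_retry_policy<MissingDelay>::value, "MissingDelay is not");

inline void example_usage() {
    assert(first_delay_or_zero(FixedDelay{}, mesg_send_err_t::TIMEOUT) == 1000000);
    assert(first_delay_or_zero(FixedDelay{}, mesg_send_err_t::MALFORMED_MESG) == 0);
}
```

<p>This is exactly the kind of manual work described above: the trait has to be kept
in sync with the function body by hand, and nothing forces the two to agree.</p>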
<p>Concepts were introduced in C++20 to clean this up, and I think
it’s still early to tell how good a job they will do. But even if
they do a great job, the polymorphism will still look very
different from the OOP style, and the old template-based code
will still exist, and so in the meantime the C++ programming language
has simply continued to grow.</p>
<p>And the concrete consequences: It’s a perfectly reasonable decision to
use OOP-style polymorphism, for the benefits of cleaner structure and
explicit specification of the interface, even when the dynamic nature of
the polymorphism – and its concomitant performance costs – is never
actually called for. Meanwhile, using static polymorphism to accomplish
the same goals is simply harder, requiring much more skill and training.</p>
<h3 id="how-rust-handles-this-3">How Rust Handles This</h3>
<p>Like C++, Rust has both static (compile-time) polymorphism, and dynamic
(run-time) polymorphism. Unlike C++, Rust integrates them closely into
a single feature, inspired by Haskell’s typeclasses: traits.</p>
<p>Let’s use the same example again, but in Rust, using static polymorphism,
which is the more Rusty way to write such a function:</p>
<pre tabindex="0"><code>trait RetryPolicy {
// Return None to not retry at all
// Takes `self` as `&mut` to implement counting and back-off
fn retry_microseconds(&mut self, error: MesgSendError) -> Option<Duration>;
}
fn retry_send_message(mesg: &Message, mut policy: impl RetryPolicy) -> Result<(), MesgSendError> {
loop {
match send_message_once(mesg) {
Ok(()) => {
return Ok(());
}
Err(err) => match policy.retry_microseconds(err) {
None => {
return Err(err);
}
Some(delay) => sleep(delay),
},
}
}
}
</code></pre><p>I changed the example a little to showcase some other differences with
Rust. Instead of querying two functions, for example, to know whether
to try again and how long to delay, I feel in Rust it is more natural
to use sum types (and in particular <code>Option</code>) to fold them into a single
function. Similarly, rather than a <code>u32</code> count of microseconds,
<code>std::thread::sleep</code> takes a <code>Duration</code>, and so I felt the policy trait
should reflect that as well. Also, last but not least, in Rust it is
not necessary to consider <code>SUCCESS</code> to be one of the error options,
and so the types are more well-honed to the situation.</p>
<p>Notice that this is the more performant static version,
and yet it has an explicit in-code specification of what the interface
is for the policy. The policy code and the generic code
are still fully integrated, just like in the C++ templated version,
through a process known as monomorphization. Fundamentally,
monomorphization exhibits the same behavior as C++ template
instantiation, but in a more principled, constrained fashion.</p>
<p>Here is the example of the usage of such a polymorphic function:</p>
<pre tabindex="0"><code>struct WaitOneSecondAndTryFiveTimes {
retry_count: u32,
}
impl WaitOneSecondAndTryFiveTimes {
fn new() -> Self {
Self {
retry_count: 0,
}
}
}
impl RetryPolicy for WaitOneSecondAndTryFiveTimes {
fn retry_microseconds(&mut self, err: MesgSendError) -> Option<Duration> {
if err == MesgSendError::MalformedMessage {
return None;
}
self.retry_count += 1;
if self.retry_count == 5 {
return None;
}
Some(Duration::from_secs(1))
}
}
let policy = WaitOneSecondAndTryFiveTimes::new();
let res = retry_send_message(&mesg, policy);
</code></pre><p>If we wanted to use dynamic polymorphism for some reason – for example,
if we wanted to look the policy up in some sort of map based on a
user-supplied keyword, or load the policy from a dynamic library – we
could, easily.</p>
<p>Unlike in C++, barely anything has to change. In fact,
only three lines have to change.</p>
<p>The function signature has to change, to indicate that it’s using
dynamic polymorphism now. Dynamic polymorphism, due to its nature,
can only be done through indirection, so we have to add that (though
it does not affect the function body):</p>
<pre tabindex="0"><code>fn retry_send_message(mesg: &Message, policy: &mut dyn RetryPolicy) -> Result<(), MesgSendError> {
</code></pre><p>Similarly, the call site has to change, to implement the indirection:</p>
<pre tabindex="0"><code>let mut policy = WaitOneSecondAndTryFiveTimes::new();
let res = retry_send_message(&mesg, &mut policy);
</code></pre><p>And that’s it! Now it’s dynamic polymorphism!</p>
<p>Seeing this for the first time is what truly convinced me that Rust would
eclipse C++.</p>
<h1 id="discussion-and-conclusion">Discussion and Conclusion</h1>
<p>So, assuming I’ve convinced you that Rust has a better organized
feature-set than C++, we have to discuss what, in the big picture,
C++ has done wrong and Rust has done right.</p>
<p>The first and most obvious thing Rust did right was learn from the
mistakes of the past. Each new version of C++ has to be compatible with
previous versions to a great extent, including (in a lot of ways) C,
giving it a legacy back into the early 70’s. Rust started maintaining
compatibility in 2015, and so it’s only had 7 years or so of cruft,
but knew about all of C++’s later add-ons from the beginning.</p>
<p>And one of the things Rust learned from the experience of others is
how to mitigate this effect, so we can hope Rust retains its youthful
freshness for longer going forward. Rust has an edition system, so
that features actually can be deprecated and phased out, while still
maintaining compatibility.</p>
<p>But also, Rust’s goal of separating safe and unsafe features – and
keeping unsafe code encapsulated using the <code>unsafe</code> keyword – forces
Rust’s feature set to be more coherent. If two features clash in C++,
the standards committee can put the work of reconciling them on the
programmer, but in Rust, they often have to do the work to make
them make sense together, so they can continue to guarantee that
safe code can’t cause undefined behavior.</p>
<p>Additionally, Rust believes in, and has, invariants. In C++, some
structs can be trivially copied. In Rust, all data types can
be trivially moved (<code>Pin</code> is almost but not quite an exception,
and the work that went into making <code>Pin</code> not break everything
shows how important the invariant is.) In Rust, a mutable reference
always means that a block of code has exclusive access to a value.
These invariants also structure other features, and force them
to work in concert.</p>
<p>Enough about Rust, though. I think there are deeper lessons to be learned
from the flaws in C++. Bjarne Stroustrup famously said, “Within C++,
there is a much smaller and cleaner language struggling to get out.” I
think he regrets the quote, which he clarified is about the modern
semantics of C++, held down by the outdated syntax of C. It’s such a
compelling quote, though, because C++ is so messy and dirty, so we want
to believe in a small clean underlying core.</p>
<p>The truth, however, is that there isn’t <em>one</em> smaller and cleaner
programming language struggling to get out. There are multiple. And
from the beginning, C++ was a glomming-together of multiple ideas: a
Simula-like OOP system glued awkwardly to C, without a unified motivating
vision. Operator overloading was intended to help integrate the two
parts, which it did but at the expense of creating its own entire
sub-paradigm. And then came templates, which tried to add generic
containers and algorithms but unexpectedly exploded into their own
programming paradigm.</p>
<p>So inside C++, struggling to get out, is of course C, the original
“portable assembly,” which does its very simple job well. There’s also
Java/C# in there, if we take the OOP features on their own. For the
operator overloading and RAII and templates, the closest I can really
imagine is Rust, which I think if Bjarne was being fair, he would have
to admit is close to what he specified when he clarified his quote:
Rust does emphasize “programming styles, libraries and programming
environments that emphasized the cleaner and more effective practices
over archaic uses focused on the low-level aspects of C.”</p>
<p>It’s understandable that Bjarne glommed OOP, a foreign paradigm, onto
the otherwise-stable base of C. OOP was extremely popular for a long
time, and has been awkwardly glommed on to many programming languages,
and I think Rust benefits from not even trying to be an OOP language in
the traditional 3-pillar sense (Rust doesn’t have inheritance at all,
and has non-OOP concepts of encapsulation and polymorphism).</p>
<p>C++ wasn’t even the only programming language to result from glomming
on object-oriented programming to C, and of the two big ones, it is
the more coherently integrated. Objective-C comes from a more dynamic
tradition of object-oriented programming, and it really feels like
two programming languages glued together, in this case C and Smalltalk.</p>
<p>I programmed Objective-C professionally for a while, and most of the
time, the C only came out when you had to do a little bit of pure logic
outside of the object-oriented framework. In the meantime, all of the OOP
code had to be written using the little wisps of syntax C left behind,
especially <code>@</code>, which basically served as a sigil to indicate that what
followed was to be interpreted in an Objective-C way … which in an
Objective-C codebase basically should have been the default.</p>
<p>At the time, I dreamed of the leaner programming language inside
Objective-C (the non-C one), and even started designing a Smalltalk
dialect designed to interact with Apple’s Cocoa APIs: CocoaTalk, I
think it was called. Ultimately, Apple unveiled their concept of it,
sharing many ideas with Rust, known as Swift. I felt very vindicated
the day Swift was announced.</p>
<p>Rust is C++’s chance to get a leaner, cleaner programming language.
The syntax is heavily influenced by C++, even as the semantics come
from a variety of sources. The design was done <em>de novo</em> with guiding
principles that allowed all of C++’s vast repertoire of features to
be reimagined but working in concert with each other. As someone who
used to love programming in C++, which enabled programming techniques
no other programming language could, I continue to be deeply impressed
by the feature design of Rust.</p>
A Checklist of Dev-Ops Disciplines
https://www.thecodedmessage.com/posts/process-checklist/
2022-05-09T00:00:00+00:00
<p>I have worked on a lot of programming projects in my time, and while I
was a programming consultant I have worked in a lot of different corporate
environments. At some of them, it was easy to be concretely productive:
I was able to contribute immediately, and at a rapid rate. At others,
actual useful contributions would be impossible until I had a month or
more of experience with a codebase, and even then every change would be
a long slog. The difference can be overwhelming and palpable.</p>
<p>The biggest contributor to this difference wasn’t which programming
language was chosen (though I do <a href="https://www.thecodedmessage.com/posts/hello-rust/">care a lot</a> about
that), nor how well the code was factored (though that’s also very
important), but rather the organizational structure that surrounded
the code: the build system, the repo configuration, the tests, the
documentation, the ticketing system – the stuff outside of the code
itself that was essential to how programmers interacted with the code.
Most, but not quite all, of what I’m talking about falls under the header
of <a href="https://en.wikipedia.org/wiki/DevOps">Dev Ops</a>.</p>
<p>After having read (some of) <a href="https://www.amazon.com/Code-That-Fits-Your-Head/dp/0137464401/"><em>Code That Fits in Your
Head</em></a>, I
have come to believe in the importance of check-lists, so I’ll share
with you my personal check-list of important dev-ops and dev-ops
adjacent considerations when setting up a new project, so that
developers can work rapidly, effectively, and with fewer mistakes.</p>
<p>The stakes are high – if it takes forever to make a change, if the
process between modifying your code and running your code is too long,
programmers won’t attempt a change unless they’re very confident in it,
biasing them towards overly simple fixes and against more complicated
refactors. If there are no tests, programmers will be overly careful
modifying the code to avoid breaking things, and so the code won’t
be able to evolve. New team members will take much longer to gear up,
and everyone will be much less productive.</p>
<p>The worst thing is, managers are liable to dismiss developers’ complaints,
and developers are unlikely to have the confidence to raise them. It’s
easy to be unsympathetic to the complaint that a job is tedious or
inconvenient. It sounds to many programmers and managers alike like
laziness, and the obvious answer is “Well, that’s why we pay you the
big bucks.” Obviously development at these shops is still possible,
and the old hands at the company, who are used to whatever system’s in
place, have accepted the costs already.</p>
<p>But make no mistake: Developer convenience and happiness is closely
connected to developer productivity and accuracy. So let’s discuss
how to make a development environment convenient for a developer.</p>
<p>So here’s how we can make programming convenient, as a check-list with
some explanation for each item. Many of these items I learned from
colleagues and leaders along the way in my programming career; this
is my first attempt to collect all of them.</p>
<h1 id="development-environment">Development Environment</h1>
<ul>
<li>Let programmers use their own preferred development environment</li>
</ul>
<p>Many developers have life-long habits and long-accumulated configurations
for their favorite editors. I know I do! Standardizing IDEs or even
operating systems can be tempting, but in general it isn’t worth
it. Programming includes a lot of little steps, and making all of them
take longer by changing a programmer’s environment can destroy momentum.</p>
<ul>
<li>Provide standards for developer workstations</li>
</ul>
<p>This might seem to contradict the previous point, but I honestly think you
need both. It should be super easy to figure out what kind of operating
system requirements and dependencies are necessary to build all the
projects, because, as we’ll get to soon, developers should be able to
build projects locally.</p>
<p>For example, most Linux distributions are customizable enough that
programmers will be able to find a development environment within that
distribution that suits them. The dependencies of a project can then be
specified as a package list within that distribution, but the developers
can then customize the rest of their interface. Commonly used distributions
should be preferred if developers are doing their own IT, so that they
can easily find help online.</p>
<h1 id="build-system">Build System</h1>
<p>You’ve changed a line of code. Congratulations! Now how long will it be
until you can see the results of your change in action? How many steps
do you have to take to see if it fixed your issue? If it broke compilation?
If it passes tests?</p>
<p>If the amount of time or number of steps is low, then people will be able
to try out various solutions, use trace statements to debug issues, and
otherwise interact with their code like a live system. If it’s high,
they have to rely more on their own reasoning, which is fallible, more
likely to lead to bugs, and more likely to lead to timid, overly-conservative
changes, that work around problems rather than addressing them.</p>
<p>So how do we accomplish this?</p>
<ul>
<li>Projects should run natively and directly on developer workstations</li>
</ul>
<p>In my mind, this is almost a deal-breaker for development. If you have
to deploy to a dev environment or install on a physical piece of embedded
hardware to test your software, your dev cycle will be far too long. Dev
environments and physical hardware are of course essential for testing,
but using them for absolutely all development introduces resource
constraints where there don’t have to be, and lengthens dev cycles.</p>
<p>Even if the local dev environment is different from the prod environment,
that’s fine. Even if some of the code won’t run and make sense, it’s still
important to be able to run the rest of code locally. Even if it’s running
on an embedded platform and operating system with no proper simulator,
some of the code will work on Linux or macOS. Those components should
be testable on the developer workstation itself.</p>
<ul>
<li>Building a project locally should be a single command</li>
<li>Building and running automated tests for a project should be a single command</li>
</ul>
<p>When I say a single command, I mean it. Exactly one. Two is far too
many. Once you try it, you’ll never go back. If your workplace doesn’t do
this, write a script. Check the script in.</p>
<p>Of course, if different developers have different computers, this
might be difficult, but if you assume a standard set of dependencies
(or use a reproducible build system like Nix), this command can
just be an invocation of the build system.</p>
<p>In situations where it’s more complicated, a shell script should be
written to encapsulate the complexity. This shell script should be included
in the repo and maintained and checked by CI along with the other code,
so that it always works. The exact invocation of the command should
never vary, and should be documented in the project’s <code>README.md</code> file.</p>
<p>Programmers don’t need to be distracted by complicated multi-part
instructions that haven’t worked exactly right in years, or that work
on some machines and not others, or by twiddling with their Docker
settings. They should be focused on actually improving and fixing
code.</p>
<ul>
<li>Building and running a project locally should be a single command</li>
</ul>
<p>This is similar to the above but might require sample configuration to
be checked in along with the repo.</p>
<ul>
<li>Builds should be reasonably fast</li>
</ul>
<p>Developers should program on sufficiently powerful computers for their
builds. Build scripts should use options like <code>-j</code> and if helpful send
builds seamlessly to build farms (the seamlessness is important; it
should still be a single command for the developer and result in a local
build and run). Private caches should be set up, if this is possible
with your build system.</p>
<p>If programming in C or C++, header file hygiene can be
an important consideration in build speeds – invest time into it.
Use incrementality features of your build systems rather than having
scripts that clean every time. Structure the code so that incremental
builds are possible.</p>
<p>If necessary, allow developers to build only part of the project (while
still making it simple to build the entire project).</p>
<h1 id="version-control">Version Control</h1>
<ul>
<li>Use version control for all projects</li>
</ul>
<p>This is hopefully obvious to all modern teams, but I wanted to make sure
I said it anyway to talk some about why it’s important.</p>
<p>The first and more obvious upshot of version control is being able
to undo and research mistakes. If the code changed how it works, developers
should be able to ask “when did it break” before asking “how did it break.”
If the changes in the log are fine-grained enough, this might prevent the
need for investigating the “how.” (Note that bisecting often requires
fast dev turn-around as well – these are interconnected.) Version
control should always be used. Even informal, one-person projects, such
as writing test programs to try out APIs, should exist within a
version-controlled repo.</p>
<p>The second upshot is that it enables collaboration. This also makes it
important even for very small projects, because it enables you to easily
ask your colleagues for help, and your colleagues can then look at the
code with their own preferred development environment and try out fixes
on their own machine.</p>
<ul>
<li>Developers should be proficient in Git</li>
</ul>
<p>It’s not enough to <a href="https://xkcd.com/1597/">cargo cult</a> Git knowledge
or focus on that “one guy who understands Git.” Everyone should put the
effort in to be that “one guy.” If you don’t know what “reflog” means
or how rebasing differs from merging or how to edit commits deep in
the history, you’re not a sufficiently proficient Git user. Many, perhaps
even most, programmers aren’t.</p>
<ul>
<li>Use and Enforce a Branching Discipline</li>
</ul>
<p>Even on relatively small projects, no one should be committing and pushing
directly to the <code>trunk</code>/<code>main</code>/<code>master</code> branch. If people push directly
to <code>master</code>, every commit is automatically collaborative. This will make
developers commit less frequently than they otherwise should, and will
decrease the effectiveness of version control by having fewer versions
to go back to.</p>
<p>It will also, obviously, lead to people accidentally “breaking the build”
as projects get bigger. Committing a small change and merging that
change into <code>trunk</code> or <code>develop</code> should be two different actions. The
first should be done extremely often, and the second should only be
allowed if a certain number of hoops have been jumped through.</p>
<ul>
<li>Enforce CI</li>
</ul>
<p>Before code can be merged into master, it should build. By default, merging
into master should be impossible unless the repository has verified the
build with CI. This is where we can easily test that it builds in a
deployment setting in addition to a development setting, where
artifacts can be created to deploy to servers or embedded devices
(though this should also be possible to do locally) and where
we can run automated tests. Coding standards should be enforced here,
through lints. <code>clippy</code> and <code>cargo fix</code> are great tools for Rust.</p>
<p>Ideally, your CI scripts should be checked into the same repo as the systems
they test, as is supported by GitLab with its <code>.gitlab-ci.yml</code> files.</p>
<ul>
<li>Have tests in the repo</li>
</ul>
<p>This is related. I’m not going to go into how to write tests and test
coverage and all of that here; that’s again a separate topic for many
many books. But there should be tests, and the important tests should
be in the repo, and they should automatically be run by CI.</p>
<p>Remember: Tests aren’t just a tool for making sure the developers didn’t
mess up after the fact. They’re there so developers can make sweeping
changes with confidence.</p>
<ul>
<li>Avoid mono repos</li>
</ul>
<p>This one’s simple: The git log is too spammy and CI for the whole thing
takes too long to run. Also, we have the technology of submodules, or,
if on Nix, <a href="https://github.com/obsidiansystems/nix-thunk"><code>nix-thunk</code></a>.</p>
<ul>
<li>Require code review</li>
</ul>
<p>This should be enforced by your Git system. As for how to actually do code
review, this is a big enough topic to be its own section, which is coming up.</p>
<h1 id="code-review">Code Review</h1>
<p>The main point of code review is not to make sure bugs don’t get into
the code, although it helps with that. The main point of code review is
to mitigate <a href="https://en.wikipedia.org/wiki/Bus_factor">bus factor</a>, that
is, to make sure there’s more than one person who is ready to maintain
the code. All other guidance flows from here.</p>
<ul>
<li>At least the person who maintains the code should also review</li>
</ul>
<p>If the MR is written by the primary maintainer of the codebase,
it should be reviewed by whoever would have to step up if they
were abruptly “hit by a bus.”</p>
<p>This ensures that everyone maintaining the code is in agreement
not just on style and correctness concerns, but also on the general
design, architecture, and organization of the code.</p>
<ul>
<li>The standard should be “Would I take responsibility to maintain this?”</li>
</ul>
<p>If the answer is no, why not? Asking myself this question motivates me
to make more suggestions about how the code should be factored, so I
can jump in and make changes easily like I can with my own codebases,
rather than just simply verify that it looks like it works and doesn’t
have any <code>unwrap()</code> calls.</p>
<p>This question leads to some natural sub-questions:</p>
<ul>
<li>How hard is it to find bugs in?</li>
</ul>
<p>It shouldn’t just not have bugs, it should be obvious it doesn’t
have bugs. This way, when a bug is actually discovered, code that isn’t
buggy but is complicated won’t distract the poor developer trying to find
the cause.</p>
<ul>
<li>How hard is it to modify to do something else?</li>
<li>How easy is it to mess up?</li>
</ul>
<p>This is where DRY (don’t repeat yourself) comes in. If I repeat
the same pattern of code more than 2 times, and someone modifies it, they
might only modify some of the instances of the pattern. This can
also be mitigated not through abstraction but by putting all the instances
next to each other, which is sometimes appropriate.</p>
<p>The code, however, should also not do premature abstraction, because
then it will be impossible to find issues among all the spaghetti of
function calls and variable references, so this is a balancing act.</p>
<ul>
<li>If a bug is found to be caused by this change, will we know which
part to revert?</li>
</ul>
<p>Remember, programmers should be able to
<a href="https://git-scm.com/docs/git-bisect">bisect</a> instead of having to read
an entire codebase when they want to find a bug. If you found out that
the bug was caused by this change set, would you be relieved to know or would
you still have a lot of work ahead of you?</p>
<h1 id="documentation">Documentation</h1>
<p>Last but not least, documentation.</p>
<ul>
<li>Documentation should say how to build the project</li>
</ul>
<p>It should, as mentioned, be one command, and it should not depend on very much
set up beyond “having a standard development workstation.”</p>
<ul>
<li>Documentation should say how to run the project</li>
</ul>
<p>What flags or configuration does it take? How do you tell it to re-read
the configuration? Does it use any environment variables?</p>
<ul>
<li>Documentation should say what the project <a href="https://www.thecodedmessage.com/posts/buried-lede">is for</a></li>
</ul>
<p>This should be before how to build it and run it, and should
explain who might want to run it and where it fits into the broader
organization, and the first things a programmer might want to know
before looking at it. This will help people understand the stakes
of modifying it, and where to start looking for features. This
should be covered in the lede paragraph.</p>
<p>Which leads me to:</p>
<ul>
<li>There should be a lede paragraph</li>
</ul>
<p>This should introduce the repo to someone who’s never heard of it
and doesn’t have any context for what they’ve stumbled across. It should
include its role in the company’s tech stack, its status, and what
technologies it uses.</p>
<p>Here are some examples:</p>
<blockquote>
<p>This is the main repo for our flagship product, and it is one of
our few repos that is not open source.
Customers use it directly to control the widget machines. It contains
all the drivers for those machines, as well as the Node.js code that serves
the user-facing web interface.</p>
</blockquote>
<blockquote>
<p>This is run as a twice-daily batch job to automate pruning the
widget description files. It is run on customer machines, and is
open source as local administrators might want to customize it.
It is written entirely in Perl 4 except for one module that is
written in APL. Sometimes, it doesn’t work correctly, and we have
to manually run an earlier version written in JCL and Cobol (link).</p>
</blockquote>
<blockquote>
<p>This implements the new DSL for widget description. Currently, it only
supports translation to old widget descriptions, but it is hoped that
it will eventually be integrated into the main repo. It is a research
project still under active development. It is written in Haskell
and Idris, and contains, as a component, a custom Prolog interpreter.</p>
</blockquote>
<ul>
<li>Documentation should be discoverable</li>
</ul>
<p>It should either be in the <code>README.md</code> of the relevant repo or linked
to directly from there.</p>
<h1 id="ticket-systems">Ticket Systems</h1>
<p>I guess I lied when I said that documentation was last. Project management
is, I think, a topic for a different blog post, but what I wanted to
say about this is: It should be <em>very easy</em> to add a new TODO item that
the programmer doesn’t have to remember anymore. If it takes too long to
make a ticket, developers will lose their flow on the project they were
trying to work on, or will produce fewer tickets, in a bad way.</p>
<p>The ideal is “type a single sentence and press a single button” either
on the web or (preferably) on the command line. The resultant TODO items can then
be fleshed out in a separate grooming meeting.</p>
<h1 id="conclusion">Conclusion</h1>
<p>Paying attention to these things is a bigger multiplier on developer
productivity than finding “10x developers,” and is essential for
attracting and retaining good developers. Improving these things is hard,
especially at organizations that are set in their ways, but it is
far more important than it might look. Dedicated dev-ops professionals
are essential in such things.</p>
God grant me patience... and I want it RIGHT NOW!
https://www.thecodedmessage.com/posts/patience/
2022-04-20T00:00:00+00:00
<p>I’ve been feeling recently like I’ve been spinning my wheels in my
personal life. I’m pressing on the metaphorical accelerator as hard
as I can, probably too hard for safety, and instead of moving forward,
the wheels are just spinning, spinning, spinning. I think a large part
of it is my perspective of time. “Time is canceled,” my friends and I
would say continuously during the lockdown. And it isn’t back, not yet,
not how it used to be, not for me.</p>
<p>I would be far from the first to note the disconnect between
the literal, constant, inexorable progression of time in a physical
sense, and the <a href="https://www.smbc-comics.com/index.php?db=comics&id=2020#comic">wacky way in which we remember
it</a>.
As Groucho Marx (apocryphally) framed it:</p>
<blockquote>
<p>“Time flies like an arrow; fruit flies like a banana.”</p>
</blockquote>
<p>When my friends and I would say “Time is canceled,” we meant it as
a joke. But like many jokes, it was literally true; it was true of
subjective time. While objective time is a physical property of the
universe to be studied by scientists, subjective time is built out of
rituals and milestones. Objective time is measured on clocks and dilated
by near-light speed travel and gravity; subjective time is measured in
hearts and minds and dilated by activities and events and locations.</p>
<p><em>Day</em> isn’t just when our section of the earth faces the sun; it’s when
we’re in the city, at or near the office. <em>Night</em> isn’t when we’re in
the earth’s shadow; it’s when we’re at home, or at a bar. <em>Weekend</em>
is not just a legal construct or a mark on a calendar, but it’s made
out of brunches, mimosas, and daytime outings with friends, while still
in Brooklyn.</p>
<p>Or at least that’s what these words meant before COVID. Once COVID hit,
and the lockdowns came, all of these manifestations of time congealed
into an undifferentiated gray goop. Subjective time was canceled,
replaced with something incoherent. Time was simultaneously
slow and fast: slow, because the beginning of COVID seemed
infinitely long ago, as it felt like I had been starved of social
contact for my entire life; and fast, as months flew by without
commutes and outings and brunches and clubs and churches or
anything at all to break up the gray goop of continuous apartment,
apartment, and more apartment.</p>
<p>Then, eventually, after an infinite amount of time had passed in a few
months, the restrictions started to lift. Gradually, and then suddenly,
there was so much stuff to do. And so I experienced the summer camp effect.</p>
<p>The summer camp effect (not to be confused
with, but perhaps related to, <a href="https://www.urbandictionary.com/define.php?term=Summer+Camp+Syndrome">summer camp
syndrome</a>)
is one of my favorite examples of subjective time. As I can’t find it
via Googling, I think my friends might have discovered it from experience
and first principles. The effect goes like this: When you were a child,
<a href="https://www.lutherancamping.org/nawakwa/">summer camps</a> lasted one week
(leastways they did for me). But during a camp experience, you would
make new friends, have a temporary best friend and rivalries, if you’re
older maybe even a “camp girlfriend” or “boyfriend”!</p>
<p>Children at summer camps (and adults on retreats or vacations) get an
entire in-camp life, squeezed somehow into a few objective days, but as
large emotionally as an entire semester at school. This makes sense if
you think about it. You’re doing completely different activities than
you’ve previously done. Each day you’re doing a lot of adjusting, a lot
of learning new routines, a lot of meeting new people. It’s just long
enough that you take it seriously as something to acclimate to,
a new context to judge everything by, but also short enough that every
individual event has an outsized importance.</p>
<p>You know you’re experiencing this effect when you say things like
“I just went swimming earlier today? That feels like three days ago.”
And the reason is simple: Three days’ worth of <em>different</em> things have
happened since then, both different from each other, and different from
what you’re used to.</p>
<p>COVID was the opposite of the summer camp effect. During COVID, the set
of events and activities I was used to dwindled to naught. Leaving the
apartment at all felt like an event. “Three days ago? That felt like
earlier today,” I would say, as less than a day’s events had happened
in the past three days.</p>
<p>So then, as COVID restrictions thawed, completely normal outings, like
going to a restaurant, felt like huge accomplishments – because in
context they were. Every event where there were more than 2 or 3 people
felt like a huge, even decadent, party! Because my baseline was so
pathetically low, this was a perfect recipe for the longest-lasting
summer camp effect I’ve ever experienced.</p>
<p>For months, for basically the entire second half of 2021, subjective
time snapped back in the other direction like a rubber band. Each
week felt like a month. Things that happened a few days ago felt like
they were long-forgotten memories. So much was happening so quickly,
even if it wasn’t objectively all that much, because I was de-acclimated.</p>
<p>And then, while still de-acclimated, while still experiencing this
post-COVID perpetual summer camp effect, I up and moved to a new town.
Now that I’m in said new town, visiting new places, meeting new people,
navigating new obstacles and forging new routines, I’ve added an additional
layer of summer camp effect on top of the “COVID recovery” time dilation
I was already experiencing.</p>
<p>But of course, settling into a new place – especially a new house, when
I’ve never owned a house before – is a lot of work, work that takes time.</p>
<p>And it’s not just the logistics and paperwork of moving (though of course
there are ungodly reams of paperwork). I have to get acquainted with this
new town, learn the dance that this town dances, a dance with a different
rhythm than I’m used to. Some processes simply can’t be rushed, and
that includes meeting new people and setting up new routines here.</p>
<p>So now I’m in a bit of a pickle: I’m experiencing time as going slower,
while also having a lot of goals that simply will take a lot of time,
no matter what I do. Between these facts, everything in my life is
taking <a href="https://www.thecodedmessage.com/images/forever.gif">forever</a>.</p>
<p>So I’m just here, spinning my wheels, particularly within my social and
personal life, making no apparent progress on my urgent goal and top
priority of developing a routine for my spare time and settling into my
new living situation. Why don’t I have a routine worked out? Why don’t I
have a full complement of local friends, a weekly game night, hobbies both
social and solitary? Am I just not cut out for small town life after all?</p>
<p>Simultaneously, I’m feeling extremely busy setting up the house: Why
isn’t it set up yet? Why is everything so untidy, again? Why do I feel
like I’m perpetually behind on making the house look remotely presentable,
or even correctly furnished? Am I just not cut out for home-ownership?</p>
<p>There’s a word for this feeling: burn-out. The popular conception of
burn-out is when you’ve worked too hard, are exhausted from it, and cannot
work anymore. And I certainly have worked hard: Getting a mortgage and
buying a house is one of the most difficult, convoluted, and bureaucratic
journeys I’ve ever undertaken. And I’ve since had to do a lot to “settle
in,” a shockingly pleasant euphemism for a deeply stressful process.</p>
<p>But that isn’t the entire picture. Burn-out doesn’t come just from working
too hard. Burn-out comes from a feeling that your hard work isn’t
accomplishing anything – that accomplishing things may be
impossible, because nothing’s happening even with all the effort put in.</p>
<p>Some examples of burn-out:</p>
<ul>
<li>Therapists get burn-out when in spite of all their efforts, they can’t fix
everything, and their patients still have the same problems.</li>
<li>Teachers get burn-out when all their efforts, whether
creative lesson plans or well-structured incentives, still leave
some students struggling with academic concepts and skills.</li>
<li>Programmers get burn-out when they spend aeons learning a new code-base
and learn all its flaws, but instead of being able to fix the flaws and
make everything easier in the long run, they just have to learn to live
with it as the technical debt gets even worse and every simple task
just takes 5x longer than it would if they could just spend some
time to clean up some things.</li>
</ul>
<p>It can be hard to recognize burn-out, because in the moment, it feels
like laziness, failure, or procrastination. But it’s not about the amount of
work. It’s about the ratio of (felt) work to (perceived) results. It’s
about the feeling that maybe if you try even harder, you’ll get better
results, when what you really need to do is step back and take stock of
things, and make sure you have the right overall approach and right goals.</p>
<p>So here’s my example of burn-out to add to the list:</p>
<ul>
<li>New residents of a town can get burnt out when they feel
like they’ve done gazillions of things, but their life in that city is
still not as full as their old pre-COVID life.</li>
</ul>
<p>But, though I may have done gazillions of things, I haven’t actually lived
here that long in objective time. I’m measuring my results in subjective
time, and that’s unfair to my efforts. Some things just can’t be rushed.</p>
<p>All I can do now is remind myself it is too early to be drawing
conclusions. Even if small town life is perfect for me, I shouldn’t expect
to be in my regular groove already. Even if home-ownership suits
me perfectly, I still should expect to be setting up this house for a
while. I had to start reminding myself of this the day after I moved, and
I have to keep reminding myself of this now. It’s my new constant mantra:
“Some things just take time.” And also, when that gets old, I can chant:
“It hasn’t even been that long!”</p>
<p>Because in the end, subjective time is not objective time. I can
remember that, and use that fact as the weapon to fight my unreasonable
emotions. Because in truth, all things properly considered, everything’s
going according to plan. I may not have enough friends or activities here
yet, but I’m working on it, and after all, I just moved here.</p>
More on Mortgages
https://www.thecodedmessage.com/posts/mortgage2/
2022-04-19T00:00:00+00:00
<p>Mortgage interest rates have recently risen, and are currently very
volatile. At the time of this writing, PSECU, my credit union, is
offering mortgages at 5.125%, much higher than the 3.125% I locked in
at, but lower than the peak above 6% I had recently read about in the
news. But what does this mean in practice? Well, let’s run some numbers.</p>
<p>Understanding how expensive a house is can be confusing. The total
price of a house is a huge number, more money than we normally ever
deal with, for most first-time buyers more money than they’ve
ever actually had or seen. It can be intimidating.</p>
<p>The more accessible number – the more relevant number for our day-to-day
lives – is the monthly payment. The monthly payment includes principal
and interest, and also escrow for property taxes and insurance, but
let’s focus on the principal and interest right now.</p>
<p>The total principal and interest portion of the payment stays constant,
but over time more of the payment goes towards principal, due to
amortization. For a standard American 30-year fixed rate mortgage,
the size of that monthly payment is a linear function of the size of
the mortgage and a non-linear function of the interest.</p>
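<p>As a minimal sketch of that relationship, assuming the standard fixed-rate amortization formula with monthly compounding, the payment can be computed as below. The function name <code>monthly_payment</code> is purely illustrative, and a real lender's quote may differ slightly because of rounding or fees.</p>

```python
def monthly_payment(principal: float, annual_rate: float, years: int = 30) -> float:
    """Monthly principal-and-interest payment on a fixed-rate mortgage.

    Assumes the standard amortization formula with monthly compounding:
        payment = P * r / (1 - (1 + r) ** -n)
    where r is the monthly rate and n is the number of monthly payments.
    """
    r = annual_rate / 12   # monthly interest rate
    n = years * 12         # total number of payments
    if r == 0:
        return principal / n  # no interest: just spread the principal evenly
    return principal * r / (1 - (1 + r) ** -n)


if __name__ == "__main__":
    # Linear in the principal, non-linear in the rate.
    for rate in (0.03125, 0.05125):
        for principal in (100_000, 300_000):
            payment = monthly_payment(principal, rate)
            print(f"{rate:.3%} on ${principal:,}: ${payment:,.0f} per month")
```

<p>Tripling the principal triples the payment, while raising the rate by two percentage points raises the payment by roughly a quarter, not by a fixed dollar amount.</p>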
<p>So at my interest rate of 3.125%, a $100,000 mortgage would correspond
to a monthly payment of $428. A more realistic $300,000 mortgage would
cost three times that per month, or $1285. If you assume $700 or so
in insurance and taxes, that comes out to $1985, a very reasonable
total amount. If that were a New York City rent, the occupants would need to
make 40 times that monthly amount per year, or $79,400, to qualify for
the apartment, and so in my mind it’d be reasonable for a person or couple
who made that much to live in that house.</p>
<p>What about the new rate of 5.125%? The $100,000 mortgage now has a monthly
payment of $544. The $300,000 mortgage would then come out to $1633,
for an overall monthly payment (including the estimated taxes and other
escrow costs) of $2333. By New York City standards, to qualify for that
rent a group of tenants would have to make $93,320.</p>
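<p>The income comparison is simple arithmetic, sketched here under the assumptions already stated: roughly $700 a month in taxes and insurance, and the New York City rule of thumb of earning 40 times the monthly rent per year. The function name <code>required_annual_income</code> is hypothetical.</p>

```python
def required_annual_income(principal_and_interest: float,
                           escrow: float = 700.0,
                           multiple: int = 40) -> float:
    """Annual income needed under a '40 times monthly rent' rule of thumb.

    escrow is the assumed monthly taxes-and-insurance estimate from the post.
    """
    total_monthly = principal_and_interest + escrow
    return total_monthly * multiple


if __name__ == "__main__":
    print(required_annual_income(1285))  # 79400.0, the 3.125% case
    print(required_annual_income(1633))  # 93320.0, the 5.125% case
```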
<p>That’s a 17.5% increase in monthly costs, with the estimated monthly
escrow payment included. Without including the monthly escrow payment,
that’s a 27.1% increase in monthly payments. That means that a $300,000
mortgage at 5.125% would have the same monthly principal and interest
payments as a $381,300 mortgage at 3.125%. If you take the escrow payment
into account, the equivalence is closer to $350,000, which is still
substantially more expensive.</p>
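<p>Here is a rough sketch of how those percentages fall out, again assuming the standard amortization formula and a flat $700 monthly escrow estimate. The names are illustrative only. Because the payment is linear in the principal, the "equivalent" mortgage at the old rate is just the original principal scaled by the ratio of the two payments.</p>

```python
def payment_per_dollar(annual_rate: float, years: int = 30) -> float:
    """Monthly principal-and-interest payment per dollar borrowed.

    Assumes a positive rate and the standard fixed-rate amortization formula.
    """
    r = annual_rate / 12
    n = years * 12
    return r / (1 - (1 + r) ** -n)


OLD_RATE, NEW_RATE = 0.03125, 0.05125
PRINCIPAL = 300_000
ESCROW = 700  # assumed monthly taxes and insurance, held fixed here

old_pi = PRINCIPAL * payment_per_dollar(OLD_RATE)
new_pi = PRINCIPAL * payment_per_dollar(NEW_RATE)

# Relative increase with and without the escrow estimate included.
pi_increase = new_pi / old_pi - 1
total_increase = (new_pi + ESCROW) / (old_pi + ESCROW) - 1

# Mortgage size at the old rate with the same principal-and-interest payment.
equivalent_principal = PRINCIPAL * new_pi / old_pi

if __name__ == "__main__":
    print(f"Principal and interest increase: {pi_increase:.1%}")
    print(f"Increase including escrow: {total_increase:.1%}")
    print(f"Equivalent mortgage at the old rate: ${equivalent_principal:,.0f}")
```

<p>The equivalent mortgage comes out a little over $381,000; the small gap from the rounded figure above is just rounding the 27.1% before multiplying.</p>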
<p>This is to say, houses have now become much more expensive for borrowers,
who represent the vast majority of middle class home buyers. The numbers
on Zillow now mean something very different from what they meant just a
few short months ago.</p>
<p>Will this dampen the insane level of real estate demand? Will this lower
home prices to accommodate the fact that many home buyers’ budgets will
naturally have shifted? Theoretically, yes, but it’s unclear whether
we’ll notice in this intense a market.</p>
<p>But it’s important to keep this in mind when thinking about houses.
The big total price of each house isn’t the only number that matters. The
interest rate can also make a huge difference, especially if it is
changing rapidly.</p>
Reviews and Reactions: 2021 Short Story Hugo Nominees
https://www.thecodedmessage.com/posts/hugo-2021/
2022-04-10T00:00:00+00:00
<p><em>NB: These are for the 2021 Hugo awards, not the recently-announced
2022 Hugo awards. That one is coming soon.</em></p>
<p>I decided to write up my thoughts on each of the short stories nominated
for the 2021 Hugo awards. Of course, here be spoilers, spoilers galore. If
you don’t want these stories spoiled, go read them, and then come back
here.</p>
<p>As an exercise, a friend and I read each of these stories and told each
other what we thought the themes were, and I reference that throughout
these reflections. Themes, as we define them, are thematic statements:
the point the story is trying to make. Themes are distinct from thematic
concepts, in that they are complete sentences rather than just nouns.
They are distinct from premises, in that they are the take-away for
the real-world, not a statement about the world of the story. And, to be
clear, there can be more than one completely valid answer. Both my friend
and I would posit what we thought the theme was, answering independently
without consulting each other, and then we would discuss the story in
greater detail.</p>
<p>What follows are the tangible results of those discussions: reflections
about each story, somewhere between review and analysis. Each header is
also a link, because all of these stories are available to read online.
They are reviewed in descending ranked order of how good I thought they
were, and the rankings are explained.</p>
<h1 id="1-metal-like-blood-in-the-darkhttpsuncannymagazinecomarticlemetal-like-blood-in-the-dark">1. <a href="https://uncannymagazine.com/article/metal-like-blood-in-the-dark/">Metal Like Blood in the Dark</a></h1>
<p>I found this story deeply compelling as well as deeply enjoyable, and it
touches on deep questions of how we should interact with evil and what
to do about necessary evil and the corruption that results from it, and
how much we should prepare our children for it. It does so while
leaning heavily on its Sci Fi setting – the message itself
depends deeply on the unrealistic setting where it’s possible for
a child to grow into a quite capable near-adult without experiencing
the concept of intentional deception.</p>
<p>A creator and father makes one male, one female creature, and raises
them in blissful innocence, within the confines of a garden (or, as
it were, a planet), with boundaries programmed in to prevent them
from gaining too much knowledge of good and evil, keeping them
innocent – innocent, or alternatively put, naïve.</p>
<p>This telling is more optimistic than the Biblical story. Knowledge
of Good and Evil comes from practical experience and necessity, not
disobedience. Somehow, Sister (but not Brother) works out (from first
principles!) deception, lying to yourself to stay consistent, sabotage,
and killing in self-defense.</p>
<p>Her initial reaction to lying was very visceral to me as a programmer.
Programming is an exact discipline; most computer programs
cannot recover from internal data corruption. If an error is not
caught and handled, the corruption can spread, resulting in security
breaches or arbitrary instability. Much of the history of software
design is coming up with ways of partitioning this instability.
The thought of causing corruption on purpose is so counter to
all of this work!</p>
<p>And of course, that is how lies spread throughout the honest world in
human life too; we’re just so used to dishonesty we don’t realize.</p>
<p>When Sister realizes lying is possible, she realizes that understanding
lying is necessary to interact with others, and gradually realizes that
lying is something she might be forced, by circumstances, to do.
She realizes that it would therefore be impossible to go back to a world
where the only falsehoods are errors, where errors do not need to be
maintained on purpose. What a novel way to look at the loss of innocence!</p>
<p>I was surprised that, once she realizes how necessary lying is, she
decides that in any case, she wants to protect Brother from it –
another contrast to the Biblical story where Eve shares the fruit with Adam.
I feel like even this drive to preserve innocence comes from a loss of
innocence: The old Sister, the naïve Sister, would surely reason that
this new “lying” concept was an important skill, that of course it would
be practical for Brother to know about. But now that she knows how to lie,
she decides to use this new skill to protect Brother’s innocence.</p>
<p>Because she <em>is</em> lying to Brother: She lies by omission about what
happened concretely to the villain, but she also lies in a bigger sense
by omission by not explaining lying to him, and how it is sometimes
necessary. From this point on, her relationship with Brother will be
a big lie by omission about the fundamental nature of the universe –
and still a lie that she thinks is worth it, to preserve his innocence.</p>
<p>My friend says the theme of this story is “Teach your children to be good,
and then when they confront evil, they will learn to be cunning without
becoming evil themselves.”</p>
<p>But if the story is trying to say that, I don’t think it proves
it. Brother does not learn cunning; Sister is merely luckier. And with
Sister, I get the sense that she barely learns it on time, and that
her ability to leverage it so successfully on her first life or death
attempt is fundamentally, again, a lucky break.</p>
<p>I would agree that this story is about evil, particularly dishonesty, but
I would say the theme is that even the <em>concept</em> of evil and dishonesty is
fundamentally corrupting. That innocence, once lost, cannot be regained.
The concept of saying something false on purpose, to a truly innocent soul,
would itself be an irreversible corruption that she would then want to
protect others from.</p>
<p>I ranked this one first, as did my friend, independently of each other.
Later, I learned that it did in fact win the Hugo, well-deservedly.
It stood head and shoulders above the rest, in my opinion.</p>
<h1 id="2-open-house-on-haunted-hillhttpswwwdiabolicalplotscomdp-fiction-64a-open-house-on-haunted-hill-by-john-wiswell">2. <a href="https://www.diabolicalplots.com/dp-fiction-64a-open-house-on-haunted-hill-by-john-wiswell/">Open House on Haunted Hill</a></h1>
<p>This story is about places. Places often seem to have personalities;
the premise here is about what it would be like if houses literally
did have personalities – if they had personalities and could act on them.
This story is about belonging in a home, a home that is so loved
that it begins to love you back. It is about finding a place to
belong after being stricken by grief.</p>
<p>I enjoyed the detail that the couple was not yet married, but did have
a child together. It showed the motion and dynamism of their lives when
tragedy struck.</p>
<p>My friend said the theme of this story was “in our grief, it’s hard
to see what’s good for us or to see when others are trying to help
us, but goodness abounds in the world – just keep your eyes open.”</p>
<p>I agree.</p>
<p>But I think there’s also a lot about place: “Choosing a good environment
is important especially if you’re emotionally developing or emotionally
healing.” As someone who’s moved not only houses but states in the past
year, that really resonates with me. As we’ve all suffered through
the lockdowns and restrictions of Coronavirus, the importance of our
environment for our sanity and stability, I think, has been magnified
for all of us.</p>
<p>It is perhaps for that reason that I rank this one second instead of
third; it was a really close one.</p>
<h1 id="3-little-free-libraryhttpswwwtorcom20200408little-free-library-naomi-kritzer">3. <a href="https://www.tor.com/2020/04/08/little-free-library-naomi-kritzer/">Little Free Library</a></h1>
<p>I really enjoyed this story. I couldn’t put it down – and my ADHD
has really been getting in the way of my reading recently, so this is
high praise.</p>
<p>I used to live in New York City, and I often would people-watch, and
think about the lives of all the people I’d see on a day-to-day basis.
Their worlds were completely separate from mine, even though, for a
few moments, we were in the same physical space. In the buildings
around me, completely different lives, completely different problems,
completely different dreams are happening, and we get continuous
brief windows to interact with them. And of course, the same is true
in other environments as well; the other worlds are just less visible
to us than they are in a big city.</p>
<p>Little free libraries are designed to encourage this, encourage this
connection with strangers, to reach out into other worlds, with the
power of books, and the power of art. Who knows what’s going on
in the lives of the people who take your books, who drop other books
off?</p>
<p>This story expanded on this theme. The narrator’s little free library,
rather than simply connecting with the “other worlds” that the neighbors
live in, connected in Narnia-like fashion to a literal other world.</p>
<p>Like many fantasy stories, this has an element of “There are more things
in heaven and earth, Horatio, than are dreamt of in your philosophy.”
As applied here, it is a reminder that the people we interact with
might be going through a completely alien situation to ours, as
symbolized through interacting with a literal alternative world.</p>
<p>A friend said the theme could be stated as “books are powerful tools,
and it’s good to help others.” I said “one person’s community engagement
art project is another’s life or death lifeline,” and I think this is true
even if they aren’t some strange birds or dragons from another dimension.</p>
<p>After all, sometimes other people’s lives, in real life, are as strange
to us as inter-dimensional birds or dragons.</p>
<p>The premise is super fun. I would have enjoyed a sequel: what is it
like to raise the heir to the bird/dragon throne, in our world?</p>
<h1 id="4-the-mermaid-astronauthttpswwwbeneath-ceaseless-skiescomstoriesthe-mermaid-astronaut">4. <a href="https://www.beneath-ceaseless-skies.com/stories/the-mermaid-astronaut/">The Mermaid Astronaut</a></h1>
<p>The premise of the story is so loud here it crowds out the theme: “The
Little Mermaid”, but in space! So full confession: I have actually not
read “The Little Mermaid,” but I have seen the Disney movie. So from
my point of view, the romance of being a crewmate of a starship
fills in for the romantic Disney prince, which is definitely
a statement about alternatives to finding a partner as a purpose for life.</p>
<p>My friend says it’s about remembering your family and where you came from –
specifically, “It’s important to explore and grow, but at the end of
the day, never forget your roots.” I had a similar thought, but a bit
starker: “Following your dreams often requires great sacrifice, in the form
of missing out on the entirety of what your life otherwise would be.” This
happens twice: she misses out on her entire life with her sister to
go to space, and she misses out on having a full life in space to ever see
her sister again. No, this story says, you can’t “have it all,” even if
that isn’t defined to include raising children.</p>
<p>The big reveal was that she was separated from her sister by time
dilation, so that she is still young when her sister is old. One thing
I found unrealistic is that no one warns her, and that she’s not angry.
Both the witch at the bottom of the ocean and the crew of the ship <em>knew</em>
about time dilation, but apparently it’s only explained to her just
in time for her to get back to her sister while the sister is barely
still alive.</p>
<p>The end result is appropriate thematically, as it stands in for careers
or dreams when we realize only when it’s almost too late – or entirely
too late – that we need to re-connect with the people that we love and
that we have left behind.</p>
<p>But that theme wouldn’t have been dampened by some more conflict, but
rather enhanced. If she were angry that she hadn’t been warned, that
would apply equally well to real-life careers. If her crewmates were more
resistant to her going back so early (and diverting their entire ship),
that would, again, apply equally well to real-life careers.</p>
<p>All in all, everyone was too chill about these high-stakes life-altering
decisions. The knife to cut her fins and the pain was good, but the
emotional pain of conflict would have been even better. In all honesty,
I suspect this story would improve with expansion – I think it would work
better as a novella or even a novel, so all these major life twists
could be fully fleshed out.</p>
<p>Because of these technical issues, and the lack of originality in the
premise without that substantial a twist (“The Little Mermaid” is also
originally about sacrifice and irrevocable major life choices), this
one ranks on the lower end. The twist was still enjoyable, but not as deep
or as well-executed as it could have been.</p>
<h1 id="5-a-guide-for-working-breedshttpswwwtorcom20200317a-guide-for-working-breeds-vina-jie-min-prasad">5. <a href="https://www.tor.com/2020/03/17/a-guide-for-working-breeds-vina-jie-min-prasad/">A Guide for Working Breeds</a></h1>
<p>This story was fun, but didn’t seem that rich.</p>
<p>It wasn’t clear, as my friend pointed out, in what way these were robots
and not just regular people. This wasn’t meant entirely literally;
the story makes constant minor references to them being robots, like
how they don’t eat food (but somehow still like omelettes) or get static
damage to their GPUs, but they’re not robots in a way that is interesting
to the plot. Robots are just another oppressed minority group, taken
advantage of by bosses via machinations of questionable legality. The
story could’ve worked equally well set in a medieval setting with some
oppressed ethnic group. It’s not using the Sci Fi for a purpose.</p>
<p>My friend said the theme was “always be nice to those you meet; goodness
is paid forward.” That’s definitely there, but I think there’s more
to it. The continuous references to labor laws, plus the odd gladiator
fights the mentor is involved in, make me think the author is going for
(and hitting) something deeper than that. “Solidarity is necessary in
the working class” I think is closer to it. “Leverage the system fully
to acquire wealth and then share with your working-class comrades.”</p>
<p>Unfortunately, it seemed too easy in general, but particularly in not
requiring the Sci Fi content for its theme (a bad sign in my book).
Why did the original mentor come around and turn from an annoyed conscript
into the mentorship program to a true friend to the mentee? No solid
explanation is given, besides raw empathy, robots’ robotity to robots.
And enjoyment of dogs is there, I suppose, as a facile personality quirk,
but not fully developed or explained.</p>
<h1 id="6-badass-moms-in-the-zombie-apocalypsehttpsuncannymagazinecomarticlebadass-moms-in-the-zombie-apocalypse">6. <a href="https://uncannymagazine.com/article/badass-moms-in-the-zombie-apocalypse/">Badass Moms in the Zombie Apocalypse</a></h1>
<p>This story wasn’t for me.</p>
<p>This is literally true, in that it seems to be for women about womanhood,
and I am not a woman. It is also more broadly true, in that
I am not a huge fan of zombie apocalypse stories in general (though
I’m not categorically opposed either), and specifically I wasn’t a
huge fan of this story.</p>
<p>This story focuses on a group of women who have decided to live in an
explicitly matriarchal and (at least initially) all-woman group to better
survive the apocalypse, which reminds me of the ancient Greek legend of
the Amazons, or the Many Mothers from <em>Mad Max: Fury Road</em>. This group
uses explicit feminist solidarity as their impetus to band together
to survive the zombie apocalypse.</p>
<p>In general, I found the story cringily on-the-nose. I thought the way it
re-applied the feminist rallying cry “my body, my choice” to be forced,
rather than insightful. In general, I found it somewhat incredible (and
therefore also forced) that even an all-woman group would be talking
so much about feminism and feminist topics while surviving a zombie
apocalypse. It seems more likely that they’d be more focused on being
people than being women, as their gender wouldn’t be the most relevant factor
in the situation.</p>
<p>There was one point within the story where it was revealed that the
protagonist had once had to come out to her racist preacher father;
it was extra difficult, because she was not only dating a woman, but
dating a Black woman. I know that these situations are all too common
in our society, but I also felt like in this story, this was gratuitous.
It was played entirely straight, just referenced as a situation that would
obviously be difficult, in a way that conveyed nothing new about such
a situation, no particular insight. I already know that such situations
exist, and are hard, and I already know that a lot of people go through
similar situations. Show me some depth, some new realization about it!</p>
<p>And that holds for this story in general. I already know that women
can be survivors too, that women are just as useful in an apocalypse
as men, in some ways more useful. I’m even willing to believe that
a group of all women might have even better survival characteristics,
but this story didn’t convince me of it – it just asserted it.</p>
<h1 id="conclusion">Conclusion</h1>
<p>I enjoyed this activity, and I feel like I have new insights into
Sci Fi, particularly that stories are better when their Sci Fi elements
are more core to the theme. I hope to do more of these short
story reviews in the future.</p>
Review: The Comic Book Story of Beerhttps://www.thecodedmessage.com/posts/comic-beer/2022-04-08T00:00:00+00:00I like beer, and I like comic books, so I was excited to read The Comic Book Story of Beer.
And it was overall quite a fun read! It contextualized how important beer was in antiquity – including theories that beer catalyzed the agricultural revolution – and how important it’s been in society ever since, taking a social approach to the entire history, while also explaining a lot of the science alongside the primarily social narrative.<p>I like beer, and I like comic books, so I
was excited to read <a href="https://www.penguinrandomhouse.com/books/235303/the-comic-book-story-of-beer-by-jonathan-hennessey-and-mike-smith-artwork-by-aaron-mcconnell/"><em>The Comic Book Story of
Beer</em></a>.</p>
<p>And it was overall quite a fun read! It contextualized how important
beer was in antiquity – including theories that beer catalyzed the
agricultural revolution – and how important it’s been in society
ever since, taking a social approach to the entire history, while
also explaining a lot of the science alongside the primarily social
narrative. It was a really fun read, and I recommend it to anyone who
enjoys beer or who cares about history, which I think is most people.</p>
<p>I would state the general theme as this: “Beer always has been an essential
ingredient to civilization.” And I think it does a solid job of proving
that theme!</p>
<p>It spent some time specifically on the craft brewing revolution that
took off in the US and the UK, and is now associated with “hipsters.”
And it made me reflect a little on what a hipster was. Here are some
things associated with hipsterdom:</p>
<ul>
<li>Living in a city after having grown up in the suburbs</li>
<li>Beards</li>
<li>And the focus of this book: Having actual variety in beer, instead of corporatized “light” American Lagers</li>
</ul>
<p>All of these are things that Boomers (especially White Boomers) for some
reason really made untrendy, and which Millennials are bringing back,
skipping the bland suburban generation(s) for an attempt to return
to some equilibrium, to a more normal state, undoing suburbanization,
white flight, and the bland corporatized beer that goes with it.</p>
<p>And of course, the irony is really the dissonance from being raised in
an environment where you weren’t expected to have this life trajectory,
but here you are.</p>
<p>Then again, I’ve been told I don’t understand hipsterdom, so take
what I say with a grain of salt.</p>
<p>In any case, I’m glad that “light” beers are among the decisions of our
parents and grandparents that we’re reconsidering, and if you want to learn
more about beer-making, its history, and little tidbits of scientific
details, this is a fun book.</p>
Can you reproduce it?https://www.thecodedmessage.com/posts/reproducibility/2022-03-22T00:00:00+00:00NOTE: This post has the #programming tag, but is intended to be comprehensible by everyone, programmer or not. In fact, I hope some non-programmers read it, as my goal with this post is to explain some of what it means to be a programmer to non-programmers. Therefore, it is also tagged with “nontechnical”.
What is the most important skill for a software engineer? It’s definitely not any particular programming language; they come and go, and a good programmer can pick them up as they work.<p><em>NOTE: This post has the #programming tag, but is intended to be comprehensible
by everyone, programmer or not. In fact, I hope some non-programmers
read it, as my goal with this post is to explain some of what it means
to be a programmer to non-programmers. Therefore, it is also
tagged with “nontechnical”.</em></p>
<p>What is the most important skill for a software engineer? It’s
definitely not any particular programming language; they <a href="https://www.biblegateway.com/passage/?search=Ecclesiastes%201%3A4&version=KJV">come and
go</a>,
and a good programmer can pick them up as they work. It’s not estimating
how long a project will take, as important and elusive as that skill
is – because fundamentally, no one can, and many, many programmers are
successful without having fully built up that skill.</p>
<p>No, in my learned and considered opinion, the most important skill in
a software engineer is solving – and preventing! – problems. It is
squashing and preventing “bugs” – those situations where the software
behaves in an undesirable fashion, where it fails to meet expectations,
whether or not you knew about those expectations ahead of time. That
is the crux of the software engineering skillset. Preventing and fixing
bugs is the goal which the other skills uphold, and the criterion by
which software engineering principles and practices should be evaluated.</p>
<p>My other programming posts can be understood through that lens. All my
posts on why Rust is a better programming language than C++ – the point
is that Rust, as a programming language, is top-notch bug repellant
technology. For any post about code organization and readability, the
reason it’s important for code to be organized and readable is so that
another programmer trying to find a bug is able to find it quickly,
or that a programmer trying to add a feature doesn’t end up also adding
more bugs, due to a misunderstanding of how the code works.</p>
<p>But today, I wanted to talk less about the prevention, and more
about the squashing, about what to do when you’ve found a bug.</p>
<p>So how do you squash bugs?</p>
<p>First, I want to note that the most important bug-squashing tool is the
human brain.</p>
<p>There is a tool, a type of program, called a “debugger,” but that is
less essential than you might think from the name. A debugger won’t fix
bugs for you, and unless the bug is a crash that actually happened, it
can’t even find them. If a debugger could fix – or even just find –
all your bugs, that would be almost equivalent to a program that could
write programs, because, as mentioned before, preventing, finding, and
fixing bugs is the crux of the entire job, and I know it hasn’t been
automated, because I still get a paycheck.</p>
<p>What a debugger <em>can</em> do is attach to a running program, let you run it one
line at a time instead of all at once, and let you inspect the program’s
internal state to make sure it is what you think it is. Additionally,
if there is a crash, the debugger can inspect the crash data, sometimes
in the form of what’s known as a “core dump,” and tell you what line
of code was running when the crash happened, a “backtrace” of how the
program got there, and what values were in what variables then.</p>
<p>This is all useful in the debugging process, but not essential. The
program ought to be called an “inspector” – or perhaps the “debugger’s
companion,” because as a programmer, the true debugger is you, and much
of what the “debugger” tool can do, you can do without it as well, using
(for example) more verbose log lines and error messages, and just good
old-fashioned reasoning power.</p>
<p>So what do you do, when you have a bug? Where do you start?</p>
<p>You might assume that the first thing to do when you see a bug is to try
to find out what caused it. But not only is that going to be difficult
without some initial steps, it can lead to problems, where you think
you’ve got it, you go and do your fix, think it’s better, and actually
the bug turns out to still be there.</p>
<p>No, more important than trying to guess what might have caused a bug is
figuring out how to tell when the bug is actually fixed. If all we know is
“sometimes, the app crashes,” and we change something, and the app doesn’t
crash right away, well, is that because we fixed it, or is that because
it just happened to be one of those times where the app doesn’t crash?
If I had a nickel for every time a programmer <em>thought</em> they’d fixed
a bug…</p>
<p>And this is where, although programming <a href="https://www.hillelwayne.com/post/are-we-really-engineers/">definitely
is</a> a type of
engineering – software engineering – it has an advantage over other
engineering fields. With software, we can run the same program over
and over again, often almost for free, in a way that we can’t rebuild
a bridge or dig a new mine tunnel. With software, often – not always,
but usually – we can re-run a program, do a few things, and see if the
bug arises again. And then, through experimentation, we can come up with a
procedure that allows us to <em>always</em> trigger the bug.</p>
<p>In this way, “sometimes the app crashes” can be refined through experiment to
“when you go to the settings page in particular, sometimes, the app
crashes” which can then be refined to “when you go to the settings page,
and you’re not logged in, and you’re on an iPhone from the past 3 years,
it crashes <em>every time</em>.” And now, you have a way of knowing when you’ve
fixed it. If you do those exact things, and it doesn’t crash, then you can
be confident that your fix actually took.</p>
<p>Refining the conditions in which your bug is a bug, or rather, coming
up with a list of instructions to make the bug happen on purpose –
ideally as short a list as possible – is known in the business as
“reproducing the bug.” And it is the most important skill-set in
people who are testing software.</p>
<p>Because if I’ve written some code, and someone else is testing it,
and they’ve gotten it to crash, but don’t know how or why it crashed
– or especially if they don’t even know what they were doing when it
crashed – well, that doesn’t do very much for me. I have no idea where
to even start. Because I haven’t experienced any crashes, I can’t
look at their crashes. It works on my machine. How can I solve a problem
I can’t even see?</p>
<p>So this was a problem for me when I was a lowly iPhone programmer working
in a small three-person company. My app would have error messages pop
up – and this was frequent, as my boss, who was not a programmer,
didn’t let me spend time improving how the app worked unless I was
making continuous visible progress. My boss would tell me that he’d
gotten an error message while using the app. Can I look into that?</p>
<p>I didn’t know what to do. I always looked into it when I got error
messages, and was often able to reproduce them, find them, and fix them,
but obviously my boss was better at QAing my app than I was –
at least in the triggering bugs department. So I told him so. “I don’t
know what to do – I can’t fix it unless you can reproduce it.”</p>
<p>What happened subsequently confused me. My boss would text me, giddy,
for some reason acting as if he was winning an argument against me,
saying “I reproduced it.” And then he’d send me a screenshot of the
problem. He did this repeatedly. He seemed to think “reproducing the
bug” meant screenshotting the bug in action.</p>
<p>Later I saw him in person, and he asked me about whether I’d fixed the
bug yet. I tried to explain that that wasn’t enough; what I needed was
a step by step explanation of how to make the bug happen. And he said
something along the lines of, “How can you still not believe me? I sent
the screenshot.”</p>
<p>I was flabbergasted. I said, “Wait. I’m not literally doing this as
a policy, where I don’t work on it unless you prove to me there’s an
actual problem. It’s not because I don’t believe you. It’s not because
you have to convince me it’s real before I work on it.”</p>
<p>And my boss responded, with an affected, over-the-top laugh, “Haha,
no I get it.” And then paused a second and continued, “But you kinda
are though. That’s exactly what you’re doing.”</p>
<p>Apparently, when I said “I can’t,” my boss heard “I won’t.” My boss
thought this was about me standing up for myself against potential
spurious work, and being overly strict about burdens of proof, rather
than me literally asking questions that would make it possible for me
to do my job and troubleshoot these issues.</p>
<p>At this point, I was just shocked. How did he think I was going to
do my job? Obviously I had to figure out what was going wrong. And
obviously – or at least it was obvious to me – that required follow-up
questions. Why was he so affronted by my follow-up questions? What
did he think fixing the issue looked like? Did he think I’d just say,
“Oh, error messages. I must have left some extras in. I’ll go take them
out.” No, the error messages were the visible result that happened when
the code didn’t know how to proceed to accomplish its tasks, and I had
to go deeper in to find out why that was happening.</p>
<p>Luckily, I was able to compose myself, and think quickly on my feet. I
knew I wasn’t going to be able to actually explain reproducibility
to him – after all, I just had, and he somehow misinterpreted it as
insubordination – so I fibbed a little.</p>
<p>The debugger that came with Macs for debugging iPhones looked very
sophisticated, and to use it with the app, you had to connect the phone to
the computer with a cable. This would allow you to, as I said before,
inspect the current program state, and so on. It looked like a very
useful tool – and it was. Just not as useful as the human brain,
and not something you could use to skip steps.</p>
<p>But even though what I really needed was instructions on how to make
the bug happen, I told my boss that what I needed was for the bug
to actually happen <em>while</em> the phone was plugged into the debugger.</p>
<p>This served two purposes: It made him believe me that I wasn’t just
messing with him, and it gave him a concrete reason to reproduce
the bug. Now that he had the goal of making the error message pop
up while it’s plugged in, he was able to figure out for himself
that the easiest way to do that was to come up with a way to make
it happen on purpose.</p>
<p>Now, when he found a new bug, he wouldn’t bring it to me until he was
prepared to make it happen while it was plugged in. Some of them, he
already knew how to trigger; he just hadn’t heard an adequate explanation
for why he should tell me. Other bugs, he would figure out. In either
case, once he was done, he would make the bug happen while it was
plugged in.</p>
<p>And when this happened, I could either ask him how he got it to happen,
casually, while pretending that the more important thing is that it’s
plugged in, and while he sees that I’m using a computer and therefore
“working” – or, if the steps were inconsistent or unclear, or if we’d
just gotten lucky to see an error message while attached to the debugger,
I could use information from the debugger to figure out what the program
had been doing before the error – and use this, not to fix the bug
immediately, but to figure out how to reproduce it myself.</p>
<p>So what did I learn from this experience? Well, even though the most
important tool in programming (and bug-squashing) is the human brain,
I learned that people, especially non-programmers, are more comfortable
when they see you using other, concrete, fancy-looking tools. And if
your human brain needs input to complete a task, people might be more
likely to give it to you if you pretend the computer needs that input.</p>
<p>For those of us who work primarily with our brains, this can be
frustrating and disappointing, and can lead to the need to fib a little
sometimes to accommodate these biases.</p>
<p>And additionally, I learned about the depths of the disconnect between
different levels of expertise. I thought it was obvious that to fix a
problem with some code, you had to understand it at least well enough
to make it happen again. This seemed to follow directly from first
principles, to be the only logical way that it could work, if you thought
about it. But my boss hadn’t thought about it, and didn’t understand
this. The gap in our perspectives didn’t come from his lack of detailed
technical knowledge of specific technologies, but rather from his lack
of a developed intuition for how programming works.</p>
A Rust Gem: The Rust Map APIhttps://www.thecodedmessage.com/posts/rust-map-entry/2022-03-12T00:00:00+00:00For my next entry in my series comparing Rust to C++, I will be discussing a specific data structure API: the Rust map API. Maps are often one of the more awkward parts of a collections library, and the Rust map API is top-notch, especially its entry API – I literally squealed when I first learned about entries.
And as we shall discuss, this isn’t just because Rust made better choices than other standard libraries when designing the maps API.<p>For my next entry in my <a href="https://www.thecodedmessage.com/tags/rust-vs-c++/">series</a> comparing Rust
to C++, I will be discussing a specific data structure API: the Rust
map API. Maps are often one of the more awkward parts of a collections
library, and the Rust map API is top-notch, especially its
<a href="https://doc.rust-lang.org/book/ch08-03-hash-maps.html?highlight=entry#only-inserting-a-value-if-the-key-has-no-value">entry API</a> – I literally squealed
when I first learned about entries.</p>
<p>And as we shall discuss, this isn’t just because Rust made better choices
than other standard libraries when designing the maps API. Even more so,
it’s because the Rust programming language provides features that better
express the concepts involved in querying and mutating maps. Therefore,
this serves as a window into some deep differences between C++ and Rust
that show why Rust is better.</p>
<p>And for this post, specifically, we’ll also be discussing Java, so
this will be a three-way comparison, between Java, C++ and Rust.</p>
<h2 id="reading-from-a-map">Reading from a Map</h2>
<p>So, let’s talk about map APIs. But before we get to <code>Entry</code> and friends,
let’s discuss something a little simpler: getting an item from a
map. Let’s say we have a sorted map of strings to integers:</p>
<ul>
<li>In Java, <code>TreeMap<String, Integer></code></li>
<li>In C++, <code>std::map<std::string, int></code></li>
<li>In Rust, <code>BTreeMap<&str, i32></code></li>
</ul>
<p>Let’s also say we have a string <code>"foo"</code>, and want to know what integer
corresponds to it. Now, if we’re always sure that the string we’re
looking up is always in the map, then we know what we want: we want
to get an integer.</p>
<p>But what if we’re not sure? There are plenty of situations where we want
to read a value corresponding to the key – or do something else when
that key is not present. Maybe the value is a count, and an absent key
means 0. Or maybe the absent key means that the user has made a typo,
and needs to be informed. Or maybe the map is a cache, and the absent
key means we need to read a file or query a database. In all of these
cases, we need to know either the value, or the fact that the key
is absent.</p>
<p>Let’s see how this is handled in our three programming languages, and
how fundamental design choices in these programming languages lead to
such APIs.</p>
<h3 id="java-get-a-nullable-reference">Java <code>get</code> a (Nullable) Reference</h3>
<p>A long time ago, Java made an extreme choice in the name of simplicity:
It divided all values into a dichotomy of “primitives” and “objects.”
Primitives are passed around by implicit copy, whereas objects are
aliased through many mutable references. Objects always have optionality
built in – any object reference is automatically “nullable,” which
means you can store the special sentinel/invalid value <code>null</code> in it,
the interpretation of which varies wildly. Primitives are not optional
in this way.</p>
<p>Also for the sake of simplicity, and very relevantly to the topic at hand,
generics are only supported for object types, not primitives. That means
that map values can only ever be object types. And that means that our
map from strings to integers in Java doesn’t use Java’s primitive integer
type <code>int</code>, but rather this special wrapper/adapter type <code>Integer</code>,
which is automatically converted to and from <code>int</code> (Java calls this “autoboxing”), and which, like any object type,
is managed through mutable, <em>nullable</em> references. (At this point, I
for one am beginning to suspect they missed the mark on their simplicity.)</p>
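<p>A minimal sketch of that primitive/wrapper split, in case it helps to see it concretely. The class name <code>BoxingSketch</code> and the helper <code>describe</code> are made up for illustration; only <code>int</code>, <code>Integer</code> and <code>TreeMap</code> come from Java itself:</p>

```java
import java.util.TreeMap;

class BoxingSketch {
    // Hypothetical helper: reports whether a wrapper reference holds a value.
    static String describe(Integer boxed) {
        return boxed == null ? "null" : "boxed " + boxed;
    }

    public static void main(String[] args) {
        // TreeMap<String, int> would not compile: generics need object types.
        TreeMap<String, Integer> map = new TreeMap<>();

        int primitive = 3;
        map.put("foo", primitive);      // the int is boxed into an Integer here
        Integer boxed = map.get("foo"); // what comes back is an object reference

        Integer nothing = null;         // allowed: any object reference may be null
        // int broken = null;           // would not compile: primitives are never null

        System.out.println(describe(boxed));   // prints "boxed 3"
        System.out.println(describe(nothing)); // prints "null"
    }
}
```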
<p>So what’s that mean for our map? How do we find out what value
corresponds to <code>"foo"</code> in our map, or else that there is none?
Well, the method for this is called <code>get</code>, and that returns the
value in question if there is one. And when there isn’t? Well,
Java here leverages nullability, and returns <code>null</code> when there
is no value.</p>
<p>So we can write something like this:</p>
<pre tabindex="0"><code>Integer value = map.get("foo");
if (value == null) {
    System.out.println("No value for foo");
} else {
    int i_value = value;
    System.out.println("Value for foo was: " + i_value);
}
</code></pre><p>So far, so good. But there are problems. And perhaps I’m missing some
– now is a good time to take a second, look at the code, and try to
imagine in your mind what problems there may be with this system (you
know, besides the fact that I have to use <code>i_</code> as improvised Hungarian
notation due to lack of support in Java for shadowing).</p>
<p>You have some? I’ll now list what I’ve got.</p>
<p><em>Problem the first:</em> The signature of <code>get</code> doesn’t really alert
us to the possibility of a value not being in a map. This is
the sort of “edge case” that programmers regularly forget to handle;
a programmer may know, due to their situation-specific knowledge,
that the key ought to be present, and forget to consider that the
key might not be.</p>
<p>Compilers of strongly typed languages generally work to ensure that
programmers don’t miss edge cases like this, don’t make simple “thinkos”
(typos but with thought)
or “stupid mistakes.” How’s Java hold up? Well, remember how we mentioned
that primitives can’t be <code>null</code>, but these wrapper types like <code>Integer</code>
are coercible to primitives? Well, this compiles without a word of
complaint from the compiler:</p>
<pre tabindex="0"><code>TreeMap<String, Integer> map = new TreeMap<String, Integer>();
map.put("foo", 3);
int foo = map.get("foo");
System.out.println("int foo: " + foo);
int bar = map.get("bar");
System.out.println("int bar: " + bar);
</code></pre><p>And what happens at run-time? Similar behavior to Rust’s infamous
<code>unwrap</code> function. The conversion from the nullable <code>Integer</code>
to the non-nullable <code>int</code> crashes when the <code>Integer</code> is in
fact <code>null</code>:</p>
<pre tabindex="0"><code>int foo: 3
Exception in thread "main" java.lang.NullPointerException: Cannot invoke "java.lang.Integer.intValue()" because the return value of "java.util.TreeMap.get(Object)" is null
at test.main(test.java:12)
</code></pre><p>So you might try to fix this by querying if the key exists first:</p>
<pre tabindex="0"><code>TreeMap<String, Integer> map = new TreeMap<String, Integer>();
if (map.containsKey("bar")) {
    int bar = map.get("bar");
    System.out.println("int bar: " + bar);
} else {
    System.out.println("bar not present");
}
</code></pre><p>But now we’ve reached <em>problem the second</em>. Unfortunately, even though
this looks like it addresses the issue, this won’t prevent the crash
either. There is nothing stopping you from putting a <code>null</code> into the map,
so this code also crashes given the right context:</p>
<pre tabindex="0"><code>TreeMap<String, Integer> map = new TreeMap<String, Integer>();
map.put("bar", null);
if (map.containsKey("bar")) {
    int bar = map.get("bar");
    System.out.println("int bar: " + bar);
} else {
    System.out.println("bar not present");
}
</code></pre><p>So for a given key in a Java map, there are actually three possible
situations:</p>
<ol>
<li>The key is absent.</li>
<li>The key corresponds to an integer.</li>
<li>The key corresponds to one of these special <code>null</code>-values.</li>
</ol>
<p><code>get</code> can distinguish 2 from 1 and 3, but cannot distinguish between
1 and 3. <code>containsKey</code> can distinguish 1 from 2 and 3, but cannot
distinguish 2 from 3. To distinguish all 3 scenarios, and handle
all the representable values, you need to call <em>both</em> <code>get</code>
and <code>containsKey</code>:</p>
<pre tabindex="0"><code>if (map.containsKey("bar")) {
    Integer bar = map.get("bar");
    if (bar == null) {
        System.out.println("bar present and null");
    } else {
        // Reuse the value we already fetched rather than querying the map again.
        int i_bar = bar;
        System.out.println("int bar: " + i_bar);
    }
} else {
    System.out.println("bar not present");
}
</code></pre><p>In addition to this precaution not being enforced by the compiler,
it leads to <em>problem the third</em>: We are now querying the map twice.
We are walking the tree twice with our <code>containsKey</code> followed by
<code>get</code>.</p>
<p>At this point, we find ourselves scrolling through the <code>Map</code> methods
in <a href="https://docs.oracle.com/javase/8/docs/api/java/util/Map.html">Java’s documentation</a>, trying to find a more general solution. <code>getOrDefault</code> might
help in some situations – when there’s a value that makes sense as the
default. <code>compute</code> might be useful – if we’re OK with modifying
the map in the process.</p>
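<p>For the count case, here is a hedged sketch of how those two methods might be used. <code>DefaultsSketch</code>, <code>countOf</code> and <code>increment</code> are names invented for this example; <code>getOrDefault</code> and <code>compute</code> are the real <code>Map</code> methods. Note that <code>getOrDefault</code> only falls back to the default when the key is absent, so a key explicitly mapped to <code>null</code> still yields <code>null</code>:</p>

```java
import java.util.Map;
import java.util.TreeMap;

class DefaultsSketch {
    // Treats an absent key as a count of 0, using Map.getOrDefault.
    // Assumes the map holds no null values; a stored null would crash on unboxing.
    static int countOf(Map<String, Integer> counts, String key) {
        return counts.getOrDefault(key, 0);
    }

    // Bumps a count with Map.compute, which modifies the map in the process.
    static void increment(Map<String, Integer> counts, String key) {
        counts.compute(key, (k, old) -> old == null ? 1 : old + 1);
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new TreeMap<>();
        increment(counts, "foo");
        increment(counts, "foo");
        System.out.println(countOf(counts, "foo")); // prints 2
        System.out.println(countOf(counts, "bar")); // prints 0
    }
}
```

<p>This covers the “absent means 0” case, but not the typo or cache cases, where no sensible default exists.</p>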
<p>But in general, nothing clean exists to tidy up these problems. And the
blame lies squarely on Java’s decision to make almost all types –
and all types that can be map values – nullable.</p>
<p>But wait! – you might object – Can’t we just maintain an invariant on the
map that it contains no <code>null</code> values? If we have a map without <code>null</code>
values, all these issues – well, many of these issues – dry up.</p>
<p>And this is true. Maintaining such an invariant makes for a much cleaner
situation. Pretend you aren’t allowed to put nulls in maps, and arrange
not to do it.</p>
<p>But, first off, maintaining an invariant like this is easier said than
done. Programmers often do this sort of thing implicitly in their
heads, but it’s much better to write it down in a comment. Either way, you have to
trust future programmers – even future versions of the same programmers
– to know about the invariant, either by intuiting it (all too common)
or by reading the relevant comment (which, even if there is one, might not
happen). And you have to trust them to not intentionally violate the
invariant, and also to not accidentally violate the invariant: Are
they sure that all those values they add to the map can never be null?</p>
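<p>One way to lean less on comments and memory is to check the invariant at the point of insertion. The following is only a sketch: <code>NoNullMap</code> is a hypothetical wrapper, not a standard class, and the check happens at run time, so the compiler still isn’t helping:</p>

```java
import java.util.Objects;
import java.util.TreeMap;

// Hypothetical wrapper: a String-to-Integer map that refuses null values.
class NoNullMap {
    private final TreeMap<String, Integer> inner = new TreeMap<>();

    // Throws NullPointerException right away instead of storing a null.
    void put(String key, Integer value) {
        inner.put(key, Objects.requireNonNull(value, "null values are not allowed"));
    }

    // With no nulls stored, a null result can only mean "key absent".
    Integer get(String key) {
        return inner.get(key);
    }

    public static void main(String[] args) {
        NoNullMap map = new NoNullMap();
        map.put("foo", 3);
        System.out.println(map.get("foo")); // prints 3
        System.out.println(map.get("bar")); // prints null
        try {
            map.put("bar", null);
        } catch (NullPointerException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

<p>Even then, every piece of code has to go through the wrapper rather than a raw <code>TreeMap</code> for the invariant to hold.</p>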
<p>And second off, somewhat shockingly, sometimes people do assign special
meanings to <code>null</code>. I said before <code>null</code> has a wide range of meanings,
and it’s not uncommon to use <code>null</code> to mean special things. Maybe
“not mapped” means “load from cache,” but “null” means “there actually
is no value and we know it.” Or maybe the opposite convention applies.
<code>null</code> is frustratingly without intrinsic meaning.</p>
<p>For such situations, programmers should probably compose the map with <a href="https://docs.oracle.com/javase/8/docs/api/java/util/Optional.html">other
types</a>
or better yet, write custom types that make the semantics of these
situations abundantly clear. But let’s not put all the blame on
the programmers. If Java had really wanted to protect people from
distinguishing these “not mapped” and “mapped to null” situations, Java
maps shouldn’t have made the distinction representable at all. It’s bad
programming language design to put features in a library that can only
be abused, and it’s bad understanding of human nature to then solely
blame the programmers for misusing them.</p>
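<p>As a rough sketch of what composing the map with <code>Optional</code> could look like, assume a cache where “not mapped” and “known to have no value” need to be told apart. <code>CacheSketch</code> and its method names are hypothetical; <code>Optional</code> and <code>TreeMap</code> are the standard classes:</p>

```java
import java.util.Optional;
import java.util.TreeMap;

class CacheSketch {
    // Values are Optional: Optional.empty() means "known to have no value".
    private final TreeMap<String, Optional<Integer>> entries = new TreeMap<>();

    void recordValue(String key, int value) {
        entries.put(key, Optional.of(value));
    }

    void recordNoValue(String key) {
        entries.put(key, Optional.empty());
    }

    // Spells out the three situations with a single lookup,
    // instead of overloading a stored null.
    String describe(String key) {
        Optional<Integer> entry = entries.get(key);
        if (entry == null) {
            return "not mapped";
        } else if (entry.isPresent()) {
            return "value " + entry.get();
        } else {
            return "known to have no value";
        }
    }

    public static void main(String[] args) {
        CacheSketch cache = new CacheSketch();
        cache.recordValue("foo", 3);
        cache.recordNoValue("bar");
        System.out.println(cache.describe("foo")); // prints "value 3"
        System.out.println(cache.describe("bar")); // prints "known to have no value"
        System.out.println(cache.describe("baz")); // prints "not mapped"
    }
}
```

<p>Keeping the map private means callers cannot store a raw <code>null</code> in it; they only ever see the three spelled-out outcomes.</p>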
<h3 id="c-no-nulls-no-more">C++: No Nulls No More</h3>
<p>So now we move on to C++.</p>
<p>In C++, fewer types are nullable, and non-nullable types like <code>int</code>
<em>can</em> be used as the value type of a map. For our map, of type
<code>std::map<std::string, int></code>, we no longer have the trichotomy of
“key not present, value null, or value non-null,” but the much more
reasonable dichotomy of either the key is present and there is an <code>int</code>,
or it’s absent and there isn’t one.</p>
<p>This is, in my mind, the bare minimum a strongly typed language should
be able to provide, but after the context of Java it’s worth pointing out.</p>
<p>There are three (3) methods in C++ that look like they might be usable
as a <code>get</code> operation, an operation where we either get an <code>int</code> value
or learn that the key is absent:</p>
<ul>
<li><a href="https://en.cppreference.com/w/cpp/container/map/at"><code>at</code></a></li>
<li><a href="https://en.cppreference.com/w/cpp/container/map/operator%5Fat"><code>operator[]</code></a></li>
<li><a href="https://en.cppreference.com/w/cpp/container/map/find"><code>find</code></a></li>
</ul>
<p>See if you can identify which one is the right one to use.</p>
<p>Spoiler alert! It’s <code>find</code>, the one whose name superficially looks least like
it’ll be the right one. <code>at</code> throws an exception if the key is absent,
and <code>operator[]</code>, the one with the most appealing name, is an eldritch
abomination which we’ll discuss and condemn later.</p>
<p>But all well-deserved teasing aside, <code>find</code> is much better than
Java’s <code>get</code>. It returns a special object – an iterator – that
can be easily tested to see whether we’ve found an <code>int</code>, and easily
probed to extract the <code>int</code>.</p>
<pre tabindex="0"><code>auto it = map.find(key);
if (it == map.end()) {
    std::cout << key << " not present" << std::endl;
} else {
    std::cout << key << " " << it->second << std::endl;
}
</code></pre><p>This is actually pretty good! The <code>-></code> operator also serves as a signal
to experienced C++ programmers that we’re assuming that <code>it</code> is valid:
generally <code>-></code> or <code>*</code> means that the object being operated on is
“nullable” in some way.</p>
<p>So when a C++ programmer reads something like this, they have a little
bit of warning that they’re doing something that might crash:</p>
<pre tabindex="0"><code>int foo = map.find(key)->second;
</code></pre><p>And certainly, they have more warning than the Java programmer with
the equivalent Java:</p>
<pre tabindex="0"><code>int foo = map.get(key);
</code></pre><p>Of course, this is awkward. <code>find</code> returns an <em>iterator</em>, which isn’t
exactly the type we’d expect for this “optional value” situation. And
to determine if the value isn’t present, we compare it to <code>map.end()</code>,
which is a weird value to compare it to. Nothing about what these things
are named is specifically intuitive, and people would be forgiven for
using the accursed <code>operator[]</code>. <code>map["foo"]</code> just <em>looks</em> like an
expression for doing boring map indexing, doesn’t it?</p>
<p>And what does <code>operator[]</code> do, if the key isn’t present? It inserts the
key, with a default-constructed value. No configuration is possible of
what value gets inserted, short of defining a new type for the object
values. This is sometimes what you want – like if your value type has a
good default (especially if you defined it yourself), or if you’re about
to overwrite the value anyway. But in most cases, you want some other
behavior if the value is not present – <code>operator[]</code> doesn’t really tell
you that it inserted the item, so if you need to make a network query
or read a file or print an error, you’re out of luck. <code>operator[]</code>,
as innocuous as it looks, has surprising behavior, and that is not good.</p>
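<p>Jumping slightly ahead to Rust, here is a rough sketch of why that
behavior is surprising. It is only an analogy, not C++: the closest Rust
counterpart I know of is <code>entry(key).or_default()</code>, which also inserts
a default value on a miss. The helper name <code>lookup_like_cpp_index</code> is
made up for illustration.</p>

```rust
use std::collections::BTreeMap;

// Hypothetical helper: mimics the insert-on-miss behavior of C++'s
// operator[] by returning the value, inserting a default one if absent.
fn lookup_like_cpp_index(map: &mut BTreeMap<String, i32>, key: &str) -> i32 {
    *map.entry(key.to_string()).or_default()
}

fn main() {
    let mut map = BTreeMap::new();
    map.insert("present".to_string(), 7);

    // A plain read with `get` leaves the map alone.
    assert_eq!(map.get("absent"), None);
    assert_eq!(map.len(), 1);

    // The operator[]-style lookup silently grows the map.
    assert_eq!(lookup_like_cpp_index(&mut map, "absent"), 0);
    assert_eq!(map.len(), 2);
    println!("map now has {} entries", map.len());
}
```

<p>The point is that something that reads like a lookup ends up changing
the size of the map, and the caller gets no signal that it happened.</p>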
<p>But all in all, as far as getting values goes, as far as querying the map
goes, C++ is doing OK. Solid B result on this exam, I think. Decent work,
C++. Especially since we just looked at Java.</p>
<h3 id="the-rust-option">The Rust <code>Option</code></h3>
<p>So now on to Rust: we want to query our <code>BTreeMap<&str, i32></code>.</p>
<p>(Or… it might be a <code>BTreeMap<String, i32></code>, depending on whether we
want to own the strings. This is a decision we also have to make in C++
(where we could have used <code>string_view</code>s as the keys), but do not have
to make in Java. At least in Rust, we know that whichever decision we
make, we will not accidentally introduce undefined behavior. But that’s
a distraction!)</p>
<p>So let’s apply the same test to Rust as we’ve applied before.
Here, the <a href="https://doc.rust-lang.org/std/collections/struct.BTreeMap.html#method.get">method in
question</a> is given an obvious name, <code>get</code> rather than <code>find</code>. So let’s
see how it does in our test, of allowing us to read a value if present,
but know if not:</p>
<pre tabindex="0"><code>if let Some(val) = map.get(key) {
    println!("{key}: {val}");
} else {
    println!("{key} not present");
}
</code></pre><p>See, <code>get</code> returns an <code>Option</code> type. Therefore, unlike in C++, we can test
for the presence of the value and extract the value inside the same <code>if</code>
statement. Unlike in C++, the return value of <code>get</code> isn’t a map-specific
type, but rather the completely normal way to express a maybe-present
value in Rust. This means that if we want to implement defaulting, we
get that for free by using the <code>Option</code> type in Rust, which implements
that already:</p>
<pre tabindex="0"><code>// Let's say missing keys means the count is 0:
let value = *map.get("foo").unwrap_or(&0);
</code></pre><p>Similarly, calling <code>is_none()</code> or pattern-matching against <code>None</code> is
much more ergonomic than comparing an iterator to <code>map.end()</code>. It requires
some more intimate knowledge – or some follow-up reading – to learn that
the concept of “end of collection” and “not found” are for various reasons
combined into one in C++.</p>
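<p>As a small sketch of those two styles, assuming a map of made-up sample
data and a hypothetical <code>describe</code> helper:</p>

```rust
use std::collections::BTreeMap;

// Hypothetical helper that reports on a key by matching on the `Option`.
fn describe(map: &BTreeMap<&str, i32>, key: &str) -> String {
    match map.get(key) {
        Some(val) => format!("{key}: {val}"),
        None => format!("{key} not present"),
    }
}

fn main() {
    let map = BTreeMap::from([("foo", 3)]);

    // `is_none()` answers the "is it missing?" question directly.
    assert!(map.get("bar").is_none());

    println!("{}", describe(&map, "foo"));
    println!("{}", describe(&map, "bar"));
}
```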
<p>So while C++ avoids the problematic elements of Java maps, Rust does so
more ergonomically, because it has a well-established <code>Option</code> type. C++
now has one as well, <code>std::optional</code>, but it hasn’t yet reached its
<code>map</code> API, because it was only added very recently, in C++17.</p>
<p>And <code>Option</code> integrates even better than <code>std::optional</code> with the programming
language, because <code>Option</code> is just a garden-variety sum type, a
Rust <code>enum</code>, which lets you do things like <code>if let Some(x) = ...</code>,
and combine testing and unpacking in the same statement. C++ could
not design a map API this ergonomic, because it lacks this fundamental
feature.</p>
<p>Also, unlike with <code>null</code> in Java, if you want to use <code>Option</code>
as a meaningful distinction in your map, you still can. The <code>get</code>
function would then return <code>Option<Option<...>></code> instead of
just <code>Option</code> – the outer one representing presence, the inner one
representing whether the value was <code>None</code> or <code>Some(...)</code>. <code>Option</code>
is composable in a way that <code>null</code> is not.</p>
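<p>Here is a minimal sketch of that nesting. The convention it uses (an
absent key means “not looked up yet,” a key mapped to <code>None</code> means
“known to have no value”) and the <code>status</code> function are assumptions
for the sake of the example.</p>

```rust
use std::collections::BTreeMap;

// Hypothetical convention: an absent key means we have not looked yet,
// while a key mapped to `None` means we looked and there is no value.
fn status(cache: &BTreeMap<&str, Option<i32>>, key: &str) -> &'static str {
    match cache.get(key) {
        None => "not looked up yet",
        Some(None) => "known to have no value",
        Some(Some(_)) => "has a value",
    }
}

fn main() {
    let mut cache = BTreeMap::new();
    cache.insert("a", Some(1));
    cache.insert("b", None);

    for key in ["a", "b", "c"] {
        println!("{key}: {}", status(&cache, key));
    }
}
```

<p>All three cases stay distinct in the type, so nobody has to remember
which meaning a missing value was supposed to carry.</p>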
<p>For the record, the Rust equivalent to <code>operator[]</code> – the <code>Index</code>
trait implementation on maps – does the equivalent to C++ <code>at</code>, and
panics if the key isn’t present. While not as generally useful as <code>get</code>,
I think this is a reasonable interpretation of what <code>map["foo"]</code> should
mean.</p>
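<p>A short sketch of the indexing behavior, with a hypothetical
<code>must_get</code> wrapper so the panicking case is easy to point at:</p>

```rust
use std::collections::BTreeMap;

// Hypothetical wrapper around indexing: returns the value, or panics
// if the key is not in the map.
fn must_get(map: &BTreeMap<String, i32>, key: &str) -> i32 {
    map[key]
}

fn main() {
    let map = BTreeMap::from([("foo".to_string(), 42)]);
    println!("foo = {}", must_get(&map, "foo"));

    // must_get(&map, "bar") would panic here, because "bar" is not a key.
    // When absence is a normal case, `get` reports it without panicking:
    assert_eq!(map.get("bar"), None);
}
```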
<h2 id="mutation-station">Mutation Station</h2>
<p>So Rust wins, I’d say pretty handily, when comparing how to access a
value from a map, how to query them. But where Rust truly shines is when
<em>mutating</em> a map. For mutation, I’m going to approach the discussion
differently. I’m going to start by specifying what use cases might exist,
and then, in that context, we can discuss how an API might be built.</p>
<p>The mutation situation has a similar dilemma to querying: the key in
question might or might not already be in the map. And, for example,
we often want to change the value if the key is present, and insert a
fresh value if the key is absent.</p>
<p>Of course, we could always check if the key is present first, and
then do something different in these two scenarios. But that has
the same problem we already discussed for querying: We then have
to iterate the tree twice, or hash the key twice, or in general
traverse the container twice:</p>
<pre tabindex="0"><code>auto it = map.find(key); // first traversal
if (it != map.end()) {
    return it->second;
} else {
    int res = load_from_file(key);
    map.insert(std::pair{key, res}); // second traversal
    return res;
}
</code></pre><p>So what should we do for our API for this scenario, where we want to
change the value if the key is present, and insert a fresh value if
the key is absent?</p>
<p>Well, sometimes that fresh value is a default value,
like if we’re counting and the key is the thing we’re counting – in that
case, we can always insert 0. In that case, C++’s <code>operator[]</code> – when
combined with an appropriate default constructor – can actually
work well.</p>
<p>And sometimes, that fresh value depends on the key, like if the value is a
more complicated record of many data points about the item in question.
If the value is a sophisticated OOP-style “object,” and the key indexes
one of the fields also contained in the value, C++’s <code>operator[]</code> would
not work. The default value is a function of the key.</p>
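<p>To make the “default is a function of the key” case concrete, here is a
sketch using Rust’s <code>or_insert_with_key</code> on an entry. The <code>Item</code> record and
the <code>visit</code> function are invented for this example.</p>

```rust
use std::collections::BTreeMap;

// Hypothetical record type whose `name` field repeats the map key.
struct Item {
    name: String,
    visits: u32,
}

fn visit(items: &mut BTreeMap<String, Item>, name: &str) -> u32 {
    let item = items
        .entry(name.to_string())
        // Runs only on a miss; the closure receives a reference to the key.
        .or_insert_with_key(|key| Item { name: key.clone(), visits: 0 });
    item.visits += 1;
    item.visits
}

fn main() {
    let mut items = BTreeMap::new();
    visit(&mut items, "widget");
    visit(&mut items, "widget");

    let item = &items["widget"];
    println!("{} visited {} times", item.name, item.visits);
}
```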
<p>And sometimes, there isn’t a default value <em>per se</em>. Sometimes, if the key
is absent, we need to do additional work to find out what value should
be inserted. This is the case if the map is a cache of some database,
accessed via IPC or file or even Internet. In that situation, we only
want to send a query if the key is not present. We would not be able
to accomplish our goals by simply providing a default value when sending the
mutation operation.</p>
<p>C++ doesn’t have anything for us here. <code>operator[]</code>
is pretty much its most sophisticated “query-and-mutate”
operation. Java, somewhat surprisingly, does have something relevant,
<a href="https://docs.oracle.com/javase/8/docs/api/java/util/Map.html#compute-K-java.util.function.BiFunction-"><code>compute</code></a>.
This handles all of these situations, with a relatively unergonomic
callback function – and as long as your map never contains <code>null</code>s.</p>
<p>Rust’s solution, however, is to create a value that encapsulates
being at a key in the map that <em>might or might not</em> have a value
associated with it, a value of the
<a href="https://doc.rust-lang.org/std/collections/btree_map/enum.Entry.html"><code>Entry</code></a> type.</p>
<p>As long as you have that value, the borrow checker prevents you from
modifying the map and potentially invalidating it. And as long
as you have it, you can query which situation you’re in – the
missing key or the present key. You can update a present key. You can
compute a default for the missing key, either by providing the value or
providing a function to generate it. There are many options, and you can
read all of them in the <code>Entry</code> documentation; the world is your oyster.</p>
<p>So the C++ code above can be ergonomically expressed as something like
this in Rust:</p>
<pre tabindex="0"><code>let entry = map.entry(key.to_string());
*entry.or_insert_with(|| load_from_file(key))
</code></pre><p>And the idiom where we’re counting something could be expressed
something like:</p>
<pre tabindex="0"><code>map.entry(string)
    .and_modify(|v| *v += 1)
    .or_insert(1);
</code></pre><p>So we get this nice little program that counts how many times
we use different command line arguments:</p>
<pre tabindex="0"><code>use std::collections::BTreeMap;
use std::env;

fn count_strings(strings: Vec<String>) -> BTreeMap<String, u32> {
    let mut map = BTreeMap::new();
    for string in strings {
        map.entry(string)
            .and_modify(|v| *v += 1)
            .or_insert(1);
    }
    map
}

fn main() {
    for (string, count) in count_strings(env::args().collect()) {
        println!("{string} shows up {count} times");
    }
}
</code></pre><h2 id="conclusion">Conclusion</h2>
<p>So first off, <code>Entry</code>s are super nice, and neither Java nor C++ has
anything anywhere near as nice. Even when it comes to just querying,
Rust’s <code>get</code> is much better than Java’s <code>get</code>, and a little more ergonomic
than C++’s <code>find</code>.</p>
<p>But this isn’t an accident. This isn’t just about Rust’s map API having
a nice touch. When we look at the definition of <a href="https://doc.rust-lang.org/std/collections/btree_map/enum.Entry.html"><code>Entry</code></a>,
we see things that Java and C++ can’t do:</p>
<pre tabindex="0"><code>pub enum Entry<'a, K, V>
where
    K: 'a,
    V: 'a,
{
    Vacant(VacantEntry<'a, K, V>),
    Occupied(OccupiedEntry<'a, K, V>),
}
</code></pre><p>First, this is an <code>enum</code>: There are two options, and in both options,
there’s additional information. Of course, Java and C++ can express
a dichotomy between two options, but it’s a lot clumsier. Either you’d
have to use a class hierarchy, or <code>std::variant</code>, or something else. In
Rust, this is as easy as pie, and since it does it the easy way, you can
not only use the various combinator methods in Rust, you can also use
<code>Entry</code>s with a good old-fashioned <code>match</code> or <code>if let</code> to distinguish
between the <code>Vacant</code> and <code>Occupied</code> situation.</p>
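<p>For instance, here is a sketch of the <code>match</code> style. The <code>bump</code>
function is hypothetical; it counts a key and also reports whether the key
was new, which the combinator methods don’t tell you directly.</p>

```rust
use std::collections::btree_map::Entry;
use std::collections::BTreeMap;

// Hypothetical helper: bump a counter and report whether the key was new.
fn bump(map: &mut BTreeMap<String, u32>, key: &str) -> bool {
    match map.entry(key.to_string()) {
        Entry::Vacant(slot) => {
            // No value yet: insert the first count.
            slot.insert(1);
            true
        }
        Entry::Occupied(mut slot) => {
            // Value already there: update it in place.
            *slot.get_mut() += 1;
            false
        }
    }
}

fn main() {
    let mut map = BTreeMap::new();
    for word in ["a", "b", "a"] {
        let is_new = bump(&mut map, word);
        println!("{word}: new = {is_new}");
    }
}
```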
<p>Second, there’s a little lifetime annotation there: <code>'a</code>. This is
an indication that while you have an <code>Entry</code> into a map, Rust won’t
let you change the map through any other path. Now, Java and C++ also have
iterators, and you must not change a map while you’re holding one, but in both
those languages, you have to enforce that constraint yourself.
In Rust, the compiler can enforce it for you, making <code>Entry</code>s
impossible to use wrong in this way.</p>
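<p>A rough sketch of what that enforcement looks like in practice. The
<code>demo</code> function is just scaffolding for the example, and the
commented-out line is the one I would expect the compiler to reject.</p>

```rust
use std::collections::BTreeMap;

fn demo() -> usize {
    let mut map: BTreeMap<String, u32> = BTreeMap::new();

    let entry = map.entry("foo".to_string());
    // Uncommenting the next line should fail to compile, because `entry`
    // still holds a mutable borrow of `map`:
    // map.insert("bar".to_string(), 1);
    *entry.or_insert(0) += 1;

    // `entry` has been consumed, so the borrow is over and the map can
    // be modified directly again.
    map.insert("bar".to_string(), 1);
    map.len()
}

fn main() {
    println!("entries: {}", demo());
}
```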
<p>Without both of these features, <code>Entry</code> would not have been an obvious API
to create. It would’ve been barely possible. But Rust’s feature set encourages
things like <code>Entry</code>, which is yet another reason to prefer Rust over C++
(and Java): Rust has <code>enum</code>s (and lifetimes) and uses them to good effect.</p>
<h2 id="addendum">Addendum</h2>
<p>I wanted to address a few points that people have raised in comments
since I posted this.</p>
<p>Some people have pointed out that C++ has <code>insert_or_assign</code>,
but in spite of the promising name, it just unconditionally sets a key
to be associated with a value, whether or not it previously
was. This is not the same as behaving differently based on
whether a value previously existed, and it is therefore not
relevant to our discussion.</p>
<p>More interestingly, it has been pointed out to me that with
the return value of <code>insert</code>, you can tell whether the <code>insert</code>
actually <code>insert</code>ed anything, and also get an iterator to the entry
that existed before if it didn’t. This allows implementing some, but not
all, of the patterns of <code>Entry</code> without traversing the map twice.</p>
<p>For example, counting:</p>
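<pre tabindex="0"><code>#include <iostream>
#include <map>
#include <string>
#include <utility>
#include <vector>

int main(int argc, char **argv) {
    std::vector<std::string> args{argv, argv + argc};
    std::map<std::string, int> counts;
    for (const auto &arg : args) {
        // insert does nothing if the key exists; .first points at the entry either way
        counts.insert(std::pair{arg, 0}).first->second += 1;
    }
    for (const auto &pair : counts) {
        std::cout << pair.first << ": " << pair.second << std::endl;
    }
    return 0;
}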
</code></pre><p>This works, but is much less clear and ergonomic than the <code>Entry</code>-based
API. But perhaps more importantly, this functionality is much more
constrained than <code>Entry</code>, and is equivalent to using <code>Entry</code> with just
<code>or_insert</code>, and never using any of the other methods. As another
commentator pointed out, counting is possible with just <code>or_insert</code>:</p>
<pre tabindex="0"><code>*map.entry(key).or_insert(0) += 1
</code></pre><p>But counting is just one example. C++’s <code>insert</code> is still deeply
limited. Using C++’s <code>insert</code> means you have to know <em>a priori</em> what
value you would be inserting. You can’t use it to notice that a key is
missing and then go off and do other work to figure out what the value
should be. So you can’t do my <code>load_from_file</code> example.</p>
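<p>For contrast, here is a self-contained Rust sketch of that
<code>load_from_file</code> pattern. The loader here is a stand-in I made up (it
just returns the key’s length), and the extra <code>loads</code> counter exists
only to show that the expensive call happens once per key.</p>

```rust
use std::collections::BTreeMap;

// Stand-in for the expensive lookup; a real one might read a file or
// query a database. The counter only records how often it runs.
fn load_from_file(key: &str, loads: &mut u32) -> i32 {
    *loads += 1;
    key.len() as i32
}

fn cached_get(cache: &mut BTreeMap<String, i32>, key: &str, loads: &mut u32) -> i32 {
    *cache
        .entry(key.to_string())
        // The closure runs only when the key is absent.
        .or_insert_with(|| load_from_file(key, loads))
}

fn main() {
    let mut cache = BTreeMap::new();
    let mut loads = 0;

    cached_get(&mut cache, "foo", &mut loads);
    cached_get(&mut cache, "foo", &mut loads);

    // The second call was served from the map.
    assert_eq!(loads, 1);
    println!("loaded {loads} time(s)");
}
```

<p>No sentinel value ever lands in the map: the entry is either vacant or
holds a real result.</p>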
<p>In order to do the <code>load_from_file</code> example in C++, even with this use of
<code>insert</code>, you would have to temporarily insert some sentinel value in the
map – and that goes against how strongly typed languages ought to work,
in addition to breaking the C++ concept of exception safety.</p>
<p>This is, as was pointed out in another comment, exactly what C++
programmers sometimes have to do, to meet performance goals, at the
expense of clarity and simplicity, and therefore, especially in C++,
at the expense of confidence in safety and correctness.</p>
Biking to Philly
https://www.thecodedmessage.com/posts/biking-to-philly/
2022-03-07T00:00:00+00:00
<p>I am out of biking shape. I know I am out of biking shape. The pandemic
has not been good to my physical fitness. (For the record, this isn’t
a proper edited and <a href="https://www.thecodedmessage.com/posts/crank-em-out/">outlined and triaged</a> essay,
just some notes on my past weekend.)</p>
<p>But as out of shape as I am, I also know it’s only
25 miles from here to Philly on the <a href="https://schuylkillriver.org/schuylkill-river-trail/">Schuylkill River
Trail</a>, and so I
figured maybe I could do it without any additional prep. When I found
out that it was less hilly than the longer bike rides I used to do,
I was sold, and I did it.</p>
<p>And I got there, with many breaks, in much more time than it should have
taken. Now I know how out of biking shape I am, and I can work on it. And
I was fortunately able to cancel the trip back and take my bike on
the train back with me (which a woman asked me if she could take a picture
of to prove to her husband it was possible).</p>
<p>First off, this trail is perhaps some of the safest cycling – especially
on a per mile basis – that I’ve ever done. It’s fully protected, almost
entirely paved (and the unpaved bits are fine), and only interacts with
cars for a very safe 3 miles in Manayunk.</p>
<p>In fact, the safety and the lack of interaction with traffic presented
an unexpected problem to me: parts of the ride let me zone out enough
that I would get a little… bored. This meant that I would focus less,
which meant I would slow down, which meant I would get even more bored.
I should have a podcast with me, which means I’ll need to
take a battery pack with me so my phone doesn’t die as I listen to it.
Lesson learned.</p>
<p>But with all this safety, I was so confused to see so many people
biking in full gear, with reflector shirts and their lights on
during the daytime. Like, are you worried about getting hit by a deer?</p>
<p>I’m used to biking in NYC, interacting with the traffic all the
time, and this helps my anxiety, as you’re forced to pay very
close attention to actual potential dangers. This was more just…
exercise.</p>
<p>There were some fun surprises! Somewhere around the halfway point,
I saw a sign for “The Tricycle,” and much to my pleasant surprise,
there it was! A bicycle shop and cafe! I had a coffee and a delightful
conversation with some of the other patrons. It was super refreshing.</p>
<p>When I got to my destination, I sounded like I was dying. I was coughing
like an elderly smoker with the flu, and I don’t even smoke anymore.
What is even the point of not smoking if you’re still going to cough like
that? (I’m kidding! Kidding!)</p>
<p>Biking in Philly itself was a different story. Cars drive faster
in Philly than in NYC, there’s fewer traffic lights and more
all-way stops, which is terrifying, and I had a minor crash, a “trolley
track spill,” because it was raining and my wheel got caught in
the trolley tracks.</p>
<p>I was fine, and it was only an $11 repair (with tax) to re-true the wheel and
readjust the brakes, but oy gevalt! (und “<a href="https://en.wiktionary.org/wiki/%D7%92%D7%A2%D7%95%D7%95%D7%90%D6%B7%D7%9C%D7%93#Yiddish">Gewalt</a>” meine ich buchstäblich…)</p>
<p>So in conclusion: I should do this <em>way</em> more, because this is a great
way to get some exercise and some sociality in on a weekend! Hopefully
next time I can bike back too.</p>
Crank-'em Out
https://www.thecodedmessage.com/posts/crank-em-out/
2022-03-04T00:00:00+00:00
<p>For a time, I tried to cultivate an interest
in Go. Not <a href="https://go.dev/">this Go</a>, but <a href="https://en.wikipedia.org/wiki/Go_%28game%29">this
Go</a>. The interest didn’t
last long – like chess, I had a hard time getting up to even a fairly
basic level of competence. And I quickly developed another enthusiastic
interest to replace it – sometimes, an interest just doesn’t work out,
and it’s nobody’s fault, and you have to just move on and not get too
sad, because there’s plenty of fish in the sea.</p>
<p>But one memory from this particular interest sticks out. I was
reading a list somewhere on the Internet about Go etiquette,
specifically the etiquette of playing with a more advanced Go
player. Some of it was obvious – or at least obvious if you
thought about it – like showing gratitude that they’re playing
with you and remembering that they’re doing you a favor.</p>
<p>But there was one piece of advice that was less obvious: Don’t
spend too much time making a move. Basically, this came out to
“don’t waste their time,” but what it also came down to was,
“your thoughts won’t help you.” They gave an example of a move
someone had thought for 5 minutes before making, as if it was
an obviously horrible move. It was not at all obvious to me why it was
horrible, and I realized it was a
<a href="https://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect">Dunning-Kruger</a>
type of situation.</p>
<p>Basically, beginners at Go will often think very hard about their moves,
trying to make the most out of their time with an expert, trying to not
lose the game. But they don’t know how to think very hard about their
move – their thoughts are unsupervised, and likely to miss something
an expert would find very obvious. Their thinking time will be spent trying
to avoid an unlikely outcome, while missing an easily avoidable doom.</p>
<p>Rather than thinking beyond their actual skill level, their time is
better spent getting through the game, and spending a more proportionate
amount of time thinking. They will learn more this way, certainly more
per minute of time spent – and in the meantime, they will avoid annoying
a potential future mentor.</p>
<p>When I read this advice – and this was over 10 years ago – a light-bulb went
on in my head. I realized this wasn’t just about Go, but about any skill.
Perfectionism in early stages can be a self-defeating game.</p>
<p>A sign in my high school orchestra room boldly proclaimed: “Practice
doesn’t make perfect; perfect practice makes perfect,” calling out
people who just blew through exercise after exercise or song
after song without actually
trying to improve or find problems to fix. And that is truly a problem. But
the opposite problem is also possible, where you work too hard on
fixing all the problems in something, where you try to make your
practice too perfect.</p>
<p>For another example: A friend of mine was once trying to learn German from
scratch, using DuoLingo. I wanted to help her, saying, “try pronouncing
these words aloud” or “read along with the lyrics to this song,” just
trying to get her used to how letters corresponded to sound, and pick
up a few words, and gain some familiarity with how they might be used,
but she wasn’t interested in that, because she wouldn’t be able to
perfectly understand everything.</p>
<p>Instead, she was stuck, completely stuck, on repeatedly saying very
simple phrases into DuoLingo, trying to make her accent 100% perfect,
saying, over and over again, for hours, “Das Frühstück hier ist lecker,
oder?” until the phrase, with her particular pronunciation of it, was
entirely drilled into my brain. She couldn’t hear what was wrong with
her pronunciation, and she didn’t believe me that she was focusing on
the wrong things. She didn’t believe that her pronunciation would get
better with time, that her time was better spent moving on and learning
more things to say. Besides, I said, it’s a far more practical skill to
speak German with an accent than to say one or two phrases flawlessly.</p>
<p>But more importantly, you can’t get flawless pronunciation without
learning more German. Needless to say, my friend’s approach made the process
completely unengaging, and she soon gave up. But even if she had persisted,
she would never have gotten it. Accents need to “click,” and that simply
was not going to happen with only the one sentence she kept repeating in
her repertoire. If it did, it would be imitation, and she would have no
understanding of how those sounds changed in other contexts. She was insisting
on starting with an imaginary, useless skill.</p>
<p>So how does this apply to me?</p>
<p>Well, this blog is finally moving again. And one of the (many) things
I had to overcome in getting it moving is spending too much time trying
to make each post “just right.” It’s not so much quantity over quality
<em>per se</em>, but recognizing where there were diminishing returns, at my
current skill level, to the work I was doing, and where posting (or not)
and moving on to the next post would get me much better bang for my buck.
They say that you need to write a million words of garbage before you’re
a good writer, and so far my blog only clocks in at around 80K, which I
think means I’m not quite there yet. But even when I’m further along,
I’ll need to use my time where it’s most productive, rather than just
spinning my wheels or trying to play 4D chess with my flawed mental
model of my reader.</p>
<p>It’s all about the time management.</p>
<p>Which brings me to a related time management principle for me: Having lots
of projects in flight at once. This one probably is more specific to me
and people whose brain works more like mine (not just ADHD
but perhaps even my particular flavor of ADHD), and it comes with the caveat
that it only works for me when I have a well-maintained organizational
system, but it’s a principle I’ve found super useful personally.</p>
<p>The reason to have multiple projects in flight at once, to be clear,
is so that you can choose a project based on where you are at the time,
and therefore always be making progress.</p>
<p>This has led to conflict in work environments. I remember when I was working
as a web (frontend!) developer between junior and senior years of college,
and there were dozens of tickets. I would sometimes pick off easy tickets
instead of the higher-priority harder tickets, especially early in the
week, early in the day, or when switching to a new part of the codebase.
I would do this fundamentally as a warm-up, a way to still be productive
while I was gearing up and building up momentum.</p>
<p>My supervisor at the time called me out, told me I was supposed to
focus on the higher-priority tickets. I told him that I saw them, had
looked at them, didn’t know immediately how to do them, and instead
of actively wracking my brain trying to figure it out, I sort of let
myself process them in the background while doing some easy tickets. I
told him that the alternative was taking just as long (maybe slightly
shorter but honestly maybe even slightly longer) with the harder tickets,
but not getting any easy tickets done in the meantime. Additionally,
wasn’t I getting the harder tickets done at a reasonable enough rate?</p>
<p>I don’t know if he really believed me, but at the end of the summer they
wanted me to take a break from college to work with them during the year,
so it couldn’t have gone too poorly. I’ve since gotten better at managing
my focus in a work setting, designing plans and TODO lists that work
well for how I focus, and communicating proactively with supervisors.</p>
<p>But regardless of how well this principle applied (or didn’t) at that
job, or at jobs since then, it applies super well to this blog.</p>
<p>I currently have, at the time that I am writing this paragraph, 4
on-going drafts for blog posts (3 non-technical, 1 technical). I have notes
for 18 essays, notes and/or additional drafts for 7 fiction projects,
and notes (ranging from spattered ideas to relatively complete outlines)
for 23 tech blog posts.</p>
<p>When I have an idea – sometimes a new one, sometimes a thought that
fits into an existing one – I can write it down in the appropriate place
in my notes. When it’s time for me to write – I have goals on how many
posts to finish per month – I can go look through the notes for one that
is ready to work on turning into actual prose. And when I do, I can be
picky: I can find something that I’m in the mood to work on right now.</p>
<p>If none of them really are ready to start cranking out prose, I can
start developing one of the outlines a little more, which is often
necessary. Every once in a while, even if I don’t plan on writing prose
yet, I find myself thinking of good ideas, and start scrolling through
the lot of them, tweaking notes, adding in detail, until they’re
ready to mature.</p>
<p>This has been a really good system for me, and has allowed me to
split up what can seem like super monumental tasks into actual
achievable chunks. If I can’t stay focused on a single piece long enough
to write it, I can put it on the back burner, come back to it, and still
be producing content in the meantime, sometimes from newer ideas,
but often from pieces that I previously put on the back burner.</p>
<p>I’m looking forward to seeing how well it works for fiction. I’ve
still not fully dusted off my fiction projects, though I hope to soon.
But I hope that the non-fiction, essay writing type of stuff – and
maybe even the tech writing – still count towards that 1 million words,
and still help me become a better fiction writer when I get back to it.</p>
<p>In the meantime, I’ll keep crankin’ out the posts.</p>
The Good Ol' Days of QBasic Nibbles
https://www.thecodedmessage.com/posts/qbasic-nostalgia/
2022-02-28T00:00:00+00:00
<p><img src="https://www.thecodedmessage.com/images/nibbles1.png" alt="QBasic Nibbles Splash Screen"></p>
<p>Let’s talk about an ancient programming language! I think we
can all learn things from history, and it gives us grounding
to realize that our time is just one time among many, to see
what people in the past did differently, what they got wrong
that we would never do now, and also to see what they got right.</p>
<p>Do you remember MS-DOS? Do you remember that it came with
an interpreted programming language? From MS-DOS 5 onwards,
it came with not Python, not JavaScript or R or MATLAB,
but a <a href="https://en.wikipedia.org/wiki/QBasic">dialect</a>
of <a href="https://en.wikipedia.org/wiki/BASIC">BASIC</a>. But
I think most people, especially most people my age who
were children at the height of the MS-DOS era, remember
it for the games, the two sample programs that came with it, namely
<a href="https://www.howtogeek.com/779956/gorilla.bas-how-to-play-the-secret-ms-dos-game-from-your-childhood/">Gorillas</a>
and Nibbles (their name for Snake).</p>
<p>Nibbles is extra near and dear to my heart, not only because it’s the
game that I enjoyed more, but more interestingly because it’s the
first “large program” that I ever worked on
(for me as a child, “large” meant multiple subroutines), and the first
existing program I ever modified.</p>
<p>So recently, I tried to see if I could find it. And indeed, I could.
To run it, I just needed DosBox and the <a href="https://www.qbasic.net/en/top-ten-downloads/">QBasic
interpreter</a> (you want
QBasic EN 1.1). After that, you just need the <a href="https://github.com/tangentstorm/tangentlabs/blob/master/qbasic/NIBBLES.BAS">program
itself</a>. Then you can throw both in a directory, “mount” it from
inside DosBox, run <code>QBASIC.EXE</code>, and use its very discoverable
interface (by ’90s standards).</p>
<p><img src="https://www.thecodedmessage.com/images/nibbles4.png" alt="QBasic Nibbles in Action"></p>
<p>It looks a little less impressive in such a small little emulation
window, but of course at the time it took the entire screen of an
entire CRT monitor, and was the best technology available for me
to interact with.</p>
<p>Nibbles was a sample game designed both for you to learn programming from and
for you to have fun with. True to its time, it had a little set-up interface
where you answered questions in a very basic prompt-and-respond TUI
before you could
start playing:</p>
<p><img src="https://www.thecodedmessage.com/images/nibbles2.png" alt="Prompt and Response TUI"></p>
<p>You ate numbers going from 1-9, which were easy to
display – the program, though a video game, runs in text mode! –
but at the time I just appreciated that it helped you keep
track of how far along in the level you were.
So I decided to take a look at the
<a href="https://github.com/tangentstorm/tangentlabs/blob/master/qbasic/NIBBLES.BAS">code</a> and discuss it a little bit.</p>
<p>The first thing that struck me was how short it was – at 721
lines, it would be rather short even for a single source file, a “simple” module or
class, let alone for a whole program! I suppose things do
<a href="https://xkcd.com/255/">seem bigger</a> when you’re a kid.</p>
<p>But also, I didn’t view it as one block of size-12 text on a
high-resolution monitor. I read it in QBasic’s built-in
code browser where it showed up as 14 different logically separate parts,
at the time an overwhelming number:</p>
<p><img src="https://www.thecodedmessage.com/images/nibbles6.png" alt="Subroutine/Function Selection"></p>
<p>And this is then what the subroutine would look like:
<img src="https://www.thecodedmessage.com/images/nibbles7.png" alt="Subroutine Definition"></p>
<p>Code browsers are great, and this interface is a solid reminder that
subroutines are a very early form of modules, especially given that
in QBasic, these subroutines could contain their own
<a href="https://github.com/tangentstorm/tangentlabs/blob/master/qbasic/NIBBLES.BAS#L103-L108">sub-subroutines</a>
using the more traditional
<a href="https://github.com/tangentstorm/tangentlabs/blob/master/qbasic/NIBBLES.BAS#L88"><code>GOSUB</code> command</a>.</p>
<p>So let’s talk about this programming language and program that
people once used to get real work (and real play) done.</p>
<p>First off, we
see some <a href="https://github.com/tangentstorm/tangentlabs/blob/master/qbasic/NIBBLES.BAS#L83-L85">mutable global variables</a>, a big no-no by modern standards,
but can you really blame them when their scope is no larger than that
of a small modern class, where the fields would be effectively global
within the context of an instance?</p>
<p>But also, to my pleasant surprise, there were also some <a href="https://github.com/tangentstorm/tangentlabs/blob/master/qbasic/NIBBLES.BAS#L75-L81">global
constants</a>,
and they are marked as such, with
the <code>CONST</code> keyword. In fact, as we see in
<a href="https://github.com/tangentstorm/tangentlabs/blob/master/qbasic/NIBBLES.BAS#L418">multiple</a>
<a href="https://github.com/tangentstorm/tangentlabs/blob/master/qbasic/NIBBLES.BAS#L51-L55">places</a>,
QBasic is actually strongly typed, sometimes even using
the
<a href="https://github.com/tangentstorm/tangentlabs/blob/master/qbasic/NIBBLES.BAS#L117">sigils</a>,
for which the BASIC family is <a href="https://xkcd.com/1306/">infamous</a>.</p>
<p>The “B” in BASIC stands for “beginner,” and that is exactly the
target audience QBasic was designed for. So it’s really refreshing
that in the past they didn’t have this notion that types were
too advanced for novices, or perhaps too tedious, that an
<a href="https://www.python.org/">easy-to-learn programming language</a>
wouldn’t have you declare them.</p>
<p>Or, of course, maybe duck-typing was seen as too difficult or
inefficient to implement. But in that case, why did they have
what I imagine would be an equally difficult compromise measure,
<a href="https://github.com/tangentstorm/tangentlabs/blob/master/qbasic/NIBBLES.BAS#L20">alphabetically-based type
defaulting</a>?</p>
<p>To be fair, for a long time I had no idea what <code>DEFINT</code> even
meant, but <code>DEFINT A-Z</code> certainly seemed like an appropriately mysterious
and even badass way to start a subroutine, a magical invocation, covering
the ends of the alphabet to start off each page of code.</p>
<p>Obviously, QBasic is not object oriented. Its fundamental notion of
module isn’t a class, but rather a subroutine or function. These two
notions were distinct: functions returned values (like in math) –
though they could also have side effects – and subroutines did not.
(Both had <a href="https://github.com/tangentstorm/tangentlabs/blob/master/qbasic/NIBBLES.BAS#L58-L73">strongly-typed
arguments</a>,
however).</p>
<p>This might seem an odd distinction to make, but it makes sense at
a certain level. Especially syntactically, subroutine calls definitionally
must be the top-level construct of a statement. And lo and behold! –
they do not require parentheses
<a href="https://github.com/tangentstorm/tangentlabs/blob/master/qbasic/NIBBLES.BAS#L150">around their arguments</a> whereas
functions <a href="https://github.com/tangentstorm/tangentlabs/blob/master/qbasic/NIBBLES.BAS#L456">do</a>.</p>
<p>There’s really no reason not to do something like that in Rust, come
to think of it. And come to think of it, Haskell makes a vaguely similar
distinction, where if what others would call a “function” does IO
and does not take arguments, it’s not a function at all, but a
special value known as an “action,” which can then only be called
in certain contexts.</p>
<p>So what did I do with this? I added more action keys. I added
keys to speed up and slow down gameplay on command, so that if
you pressed the arrow in the direction the snake was currently
going, instead of doing nothing, it sped up the snake. Pressing
the opposite direction of where you were going would then slow
it down. And then, I wrote new levels, using the <a href="https://github.com/tangentstorm/tangentlabs/blob/master/qbasic/NIBBLES.BAS#L299-L411">existing levels
code</a>
as a baseline.</p>
<p>And then, after that, I began to attack the
<a href="https://github.com/tangentstorm/tangentlabs/blob/master/qbasic/NIBBLES.BAS#L414-L568">main subroutine</a>’s <a href="https://github.com/tangentstorm/tangentlabs/blob/master/qbasic/NIBBLES.BAS#L449-L541">main loop</a>. I thought it would be cool
if multiple numbers could be on the screen at the same time,
but this required modifying how the locations of the numbers were
stored, replacing the
<a href="https://github.com/tangentstorm/tangentlabs/blob/master/qbasic/NIBBLES.BAS#L493">two
variables</a>
indicating their current location with a two-dimensional array of boolean
values (represented by <code>0</code> and <code>-1</code> – integer/boolean distinctions were
not yet well-established).</p>
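<p>I no longer have the original, but the shape of that change can be sketched
in Rust. This is only an illustration under assumptions: the names
<code>NumberGrid</code>, <code>place</code>, <code>eat</code> and <code>remaining</code> are
made up, and the 50-by-80 arena size is my guess at the text-mode playing field.
The one thing I am sure of is the boolean convention: in QBasic, <code>NOT</code> is
bitwise, so “true” is <code>NOT 0</code>, which is <code>-1</code>.</p>

```rust
// Assumed arena size for illustration: 50 rows of 80 text-mode columns.
const ROWS: usize = 50;
const COLS: usize = 80;

// QBasic has no boolean type; false is 0 and true is -1 (bitwise NOT of 0).
const QB_FALSE: i16 = 0;
const QB_TRUE: i16 = -1;

/// Hypothetical replacement for the two "current number location" variables:
/// every cell records whether a number is sitting there.
struct NumberGrid {
    cells: [[i16; COLS]; ROWS],
}

impl NumberGrid {
    fn new() -> Self {
        NumberGrid { cells: [[QB_FALSE; COLS]; ROWS] }
    }

    /// Put a number on the given cell.
    fn place(&mut self, row: usize, col: usize) {
        self.cells[row][col] = QB_TRUE;
    }

    /// Clear the cell and report whether a number was there to be eaten.
    fn eat(&mut self, row: usize, col: usize) -> bool {
        let was_present = self.cells[row][col] != QB_FALSE;
        self.cells[row][col] = QB_FALSE;
        was_present
    }

    /// How many numbers are still on screen.
    fn remaining(&self) -> usize {
        self.cells
            .iter()
            .flatten()
            .filter(|&&cell| cell != QB_FALSE)
            .count()
    }
}
```

<p>With a grid instead of a single coordinate pair, the main loop only has to ask
“is there a number where the snake’s head just moved?” rather than compare against
one stored location.</p>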
<p>I wish I still had the code. But more importantly, I’m grateful that
the Microsoft of the ’90s, as evil and monopolistic as it was, saw
the need to include a programming language, a little IDE, and some sample
programs with their operating system. Bill Gates was
my hero when I was a small child – before I knew what anti-trust was –
and the fact that Microsoft made sure that computers came with plenty
of fun corridors for me to explore was a huge part of why.</p>
<p>But also, there was no particular reason why the stuff I was doing
couldn’t be done by any other elementary schooler, if there were
interest in the schools in teaching it. Variables in programming
are far more concrete in their meaning than variables in algebra –
for one thing, their values actually vary with time, which made
me think the variables in algebra were a bit of a misnomer.</p>
<p>And yet, programming isn’t even a required course in most American
high schools. And that, I think, is a real shame. I understand that
most schools don’t have the resources to do a good job of it, and
that also, honestly, is a real shame.</p>
<p>Warnings and Linter Errors: The Awkward Middle Children (https://www.thecodedmessage.com/posts/warnings/, 2022-02-25)</p>
<h2 id="what-is-bad-rust">What is “bad” Rust?</h2>
<p>When we say that a snippet of code is “bad” Rust, it’s ambiguous.</p>
<p>We might on the one hand mean that it is “invalid” Rust, like the
following function (standing on its own in a module):</p>
<pre tabindex="0"><code>fn foo(bar: u32) -> u32 {
bar + baz // but baz is never declared...
}
</code></pre><p>In this situation, a rule of the programming language has been violated.
The compiler stops compiling and does not output a binary. In fact, it
has to stop compiling, because this is not a Rust program. It might resemble
one, but it in fact does not make any sense, because it is violating
one of the extra-syntactic constraints that text has to have to be a Rust
program.</p>
<p>What would it even mean, to access a variable that’s not
declared? When you write a variable access, the compiler issues an
access to the corresponding register or location in memory. When
a variable is undeclared, no such location exists. The compiler
couldn’t compile this code if it wanted to!</p>
<p>On the other hand, there’s this sort of “bad Rust” as well:</p>
<pre tabindex="0"><code>fn foo(bar: bool) -> &'static str {
match bar == false {
true => "false",
false => "true",
}
}
</code></pre><p>This code is – as the kids say – cringe. Whatever this code is trying
to do, it should not be done this way. But for all its flaws, it’s definitely
“good” Rust in a validity sense: the compiler knows exactly what to do
to output a binary from it, and will do so with no complaints. Whatever
is “bad” about this code – and it’s a lot – is bad from a human
perspective only; the computer doesn’t even notice. It’s bad idiomatic Rust,
not erroneous invalid Rust, and it’s bad because humans prefer not to
structure their concepts this way.</p>
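<p>To make the contrast concrete, here is the same function next to one more
conventional spelling. The name <code>foo_idiomatic</code> is mine, and this is just one
reasonable rewrite among several. The compiler treats the two identically; as far
as I know it takes a separate linter (Clippy’s <code>bool_comparison</code> lint) to object
to the <code>bar == false</code> comparison.</p>

```rust
/// The version from the text: valid Rust, but needlessly convoluted.
fn foo(bar: bool) -> &'static str {
    match bar == false {
        true => "false",
        false => "true",
    }
}

/// A more conventional way to express the same behavior.
fn foo_idiomatic(bar: bool) -> &'static str {
    if bar {
        "true"
    } else {
        "false"
    }
}
```

<p>Both functions return the same string for every input, which is exactly why
the difference between them is invisible to a model that only looks at validity
and semantics.</p>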
<p>So now we have a nice little dichotomy of problems with a Rust program.
On the one hand, we have errors, where the compiler will not – cannot,
even – produce an output. And on the other hand, we have idiomatic
failures. It’s a nice neat tidy distinction that a lot of people make,
but in the context of Rust – and with most programming languages –
it’s actually problematic, because problems with programs, like gender
or political views, don’t actually quite form a tidy binary. And as
with gender and politics, oversimplifying types of “bad Rust” into a binary,
even conceptually, can lead to practical problems.</p>
<p>I am, of course, talking about warnings and linter errors – those
rules that if you violate them, it won’t necessarily cause the compiler
to reject the program, but it may, depending on its settings. I’m also
talking about things like safety rules, where if you dereference raw pointers
the compiler will normally reject your program, but it can be told not
to on a block-by-block basis.</p>
<p>Here’s an example of that, for Rust:</p>
<pre tabindex="0"><code>fn foo() -> u32 {
return 3;
println!("Not reached!");
}
</code></pre><p>The compiler knows that the <code>println!</code> can’t be called,
and it makes a point to tell the user about it:</p>
<pre tabindex="0"><code>warning: unreachable statement
</code></pre><p>But more on those later. For right now, we’ll continue to try and brush
these warnings under the rug.</p>
<h2 id="the-binary-error-model">The Binary Error Model</h2>
<p>I call the philosophical framework I am criticizing the “binary error”
model, and before I start picking it apart and denouncing it, I’d like
to spend some time explaining what I mean by it, and why it’s appealing.</p>
<p>So to talk about “the binary error model,” as I’ve termed it, we’ll
start by talking about why it exists, what problem it’s trying to solve.
It’s trying to distinguish between a notion of the programming language
in itself, as a platonic ideal almost, versus the other things that
surround it – like a reference implementation, or a set of community
norms. What would belong in a formal specification, and what not?
What would have to be the same for another compiler to also be a
Rust compiler?</p>
<p>In the “binary error model,” Rust, or any programming language, is a
set of valid programs and their semantics. You could look at it as being
analogous to a Rust function with this signature:</p>
<pre tabindex="0"><code>fn rust_programming_language(program: SourceTree) -> Option<Semantics>;
</code></pre><p><code>SourceTree</code> in this context is a directory hierarchy of properly
organized Rust code at some level of organization, maybe a crate.
<code>Semantics</code> is a little harder to define – it’s an abstract notion
of what the program “does,” a representation of the platonic essence
of what the program should output (meaning, in this context, any observable
behavior) given a set of inputs (meaning, in this context, any information
the program can observe).</p>
<p>So this definition is to say, the Rust programming language, in general,
can be thought of, philosophically, as a function from source trees to
specifications of concrete behavior. Since this isn’t an actual
Rust function, we can handwave those specifications a bit, and
discuss them in English or a formal model of our choice.</p>
<p>And this is a coherent way to talk about Rust, a philosophical abstraction
with practical applications. For example, if we were comparing two Rust
compilers, trying to find out if they implemented “the same programming
language,” we could use this model as our criterion.</p>
<p>So, to find out whether two Rust implementations both implement the same
programming language, we use this function signature as our guide: Given
the same source tree, do they output programs with the same semantics,
the same concrete interaction with the outside world?</p>
<p>There are a lot of things that can be different between implementations:</p>
<ul>
<li>Do the programs, as compiled by these two different implementations, print out the same values when given the same inputs?</li>
<li>Do the programs write the same data to the disk?</li>
<li>Do they panic in the same situations?</li>
<li>Do they have the same FFI characteristics to interact with a C library?</li>
<li>Do they have the same asymptotic complexity? (For a systems programming
language, we definitely want to include this under “semantics”)</li>
<li>Do they have the same memory model for internal inter-thread interactions?</li>
<li>Do they make the same safety guarantees?</li>
<li>Do they accept and reject the same set of programs?</li>
<li>Do they print the same exact error messages?</li>
<li>Do they issue warnings on the same set of programs?</li>
<li>Are the two compilers invoked by the same command?</li>
<li>Is one of the compilers actually an interpreter?</li>
<li>Do they target the same processor architecture?</li>
<li>Do they output the exact same binaries?</li>
<li>Do they run with exactly equal performance?</li>
</ul>
<p>Obviously, different implementations will differ in some of these ways.
But we do need some way of defining whether two compilers both
implement Rust, rather than one implementing Rust and one implementing Go,
or one implementing Rust and the other one not quite succeeding at
implementing Rust.</p>
<p>In the model, as we’ve defined it, the question comes down
to whether accepted programs have the same semantics (but not form)
and whether the set of accepted programs is the same. This means that
the above questions stop mattering after “do they accept and
reject the same set of programs?” That is where the binary error
model draws the line.</p>
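<p>Here is a toy sketch of that comparison. Everything in it is an assumption for
illustration: the “language” is just <code>+</code>-separated integers, a source tree is
a plain string, <code>Semantics</code> is the resulting sum, and the three implementations
and <code>same_language</code> are hypothetical names. The point is only that the comparison
looks at two things: whether a program is accepted, and what it means if it is.</p>

```rust
/// Toy stand-in for the abstract `Semantics` in the text: the program's value.
type Semantics = i64;

/// Reference implementation: a program is a `+`-separated list of integers.
/// Anything else is "not a program" and maps to `None`.
fn reference_impl(program: &str) -> Option<Semantics> {
    program
        .split('+')
        .map(|term| term.trim().parse::<Semantics>().ok())
        .sum()
}

/// A differently written implementation of the same toy language.
fn alternative_impl(program: &str) -> Option<Semantics> {
    let mut total = 0;
    for term in program.split('+') {
        total += term.trim().parse::<Semantics>().ok()?;
    }
    Some(total)
}

/// An implementation that is too lenient: it skips empty terms, so it
/// accepts programs such as "1++2" that the reference rejects.
fn lenient_impl(program: &str) -> Option<Semantics> {
    let mut total = 0;
    for term in program.split('+') {
        let term = term.trim();
        if term.is_empty() {
            continue;
        }
        total += term.parse::<Semantics>().ok()?;
    }
    Some(total)
}

/// The binary error model's criterion, checked over a finite corpus:
/// same accept/reject decision, and same semantics when accepted.
fn same_language(
    a: impl Fn(&str) -> Option<Semantics>,
    b: impl Fn(&str) -> Option<Semantics>,
    corpus: &[&str],
) -> bool {
    corpus.iter().all(|&program| a(program) == b(program))
}
```

<p>A finite corpus can only ever show that two implementations differ, never
prove that they agree, but it captures what the model cares about: error messages,
warnings, and performance never enter into the comparison.</p>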
<p>To apply this model, the relevant part of a compiler is that it implements
something like this:</p>
<pre tabindex="0"><code>fn rust_compiler(program: SourceTree) -> Option<CompiledProgram>;
</code></pre><p>And then, you could compare two compiled programs based on their
semantics.</p>
<p>This model could also be useful for writing a formal specification
of the Rust programming language (no, “the compiler itself” doesn’t count as
a specification), and for programming languages that have a formal,
written specification, it is couched in terms of something similar
to this model – but not necessarily exactly.</p>
<h2 id="warnings-and-errors">Warnings and Errors</h2>
<p>Let’s take another look at our abstract “function signature” for the
Rust programming language:</p>
<pre tabindex="0"><code>fn rust_programming_language(program: SourceTree) -> Option<Semantics>;
</code></pre><p>We have so far been glossing over a feature of the return type,
<code>Option</code>. But that is what makes this particular model
the “binary error” model, and that’s what I’m going to
be criticizing, so let’s discuss it now.</p>
<p>Some source trees are not Rust programs. Some are, in fact, Go programs,
or directories full of plain text files, or random binary data. Some,
on the other hand, are almost Rust programs, like the example from above:</p>
<pre tabindex="0"><code>fn foo(bar: u32) -> u32 {
bar + baz // but baz is never declared...
}
</code></pre><p>This model treats all of these programs equally. From the perspective
of this abstract function, these all return the same value, <code>None</code>.
Which means, from this philosophical perspective,
all of these are the same: not a valid Rust program.</p>
<p>If we’re comparing two implementations of Rust, this model therefore
considers these questions to be irrelevant:</p>
<ul>
<li>Do they generate the same error messages?</li>
<li>Are their error messages equally relevant to the problem?</li>
<li>Are their error messages equally comprehensible to a beginner programmer?</li>
</ul>
<p>These things, however, are still relevant:</p>
<ul>
<li>Do they reject the same source trees?</li>
</ul>
<p>In fact, a single program accepted by one and not by the other would
make these two compilers implementations of different programming languages.</p>
<p>And what about warnings? This abstract function signature barely has
room for errors, flattening them all to <code>None</code>. The complexities of
the ways in which a Rust program might be bad are simplified to a binary:
it is or is not a valid Rust program. Warnings are rounded to “it is valid.”</p>
<p>So in the “binary error” model, where the “return value” of the abstract
function for the programming language is just <code>Option<Semantics></code>,
this function falls into the “valid Rust” side of the binary:</p>
<pre tabindex="0"><code>fn foo() -> i32 {
let Foo = 3;
Foo
}
</code></pre><p>This is considered to be the case, even though the standard Rust
compiler outputs a warning for it:</p>
<pre tabindex="0"><code>warning: variable `Foo` should have a snake case name
--> test.rs:2:9
|
2 | let Foo = 3;
| ^^^ help: convert the identifier to snake case (notice the capitalization): `foo`
|
= note: `#[warn(non_snake_case)]` on by default
warning: 1 warning emitted
</code></pre><p>So what’s going on here?</p>
<p>Well, in point of fact, our compiler implementation does not implement
<code>Option<CompiledProgram></code> as its conceptual return value. Its contract
looks more like this:</p>
<pre tabindex="0"><code>fn rust_compiler(program: SourceTree) ->
(Result<CompiledProgram, Vec<ErrorMessage>>, Vec<Warning>);
</code></pre><p>But when we compare the compiler to other compilers in the “binary
error” model, we pretend instead the compiler was wrapped in this wrapper:</p>
<pre tabindex="0"><code>fn rust_compiler_for_comparison(program: SourceTree) -> Option<CompiledProgram> {
let res = rust_compiler(program);
let (res, _) = res; // strip warnings
res.ok() // flatten errors, did it compile or not?
}
</code></pre><p>In this model, only the parts that are part of our original
<code>rust_programming_language</code> function truly are part of the Rust programming
language. Only the rules that would cause every hypothetical
compiler to reject the program are part of the programming language.
This warning is “just the compiler’s opinion, man.”</p>
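<p>To see the flattening happen, here is a runnable toy version of that wrapper.
It is a sketch under heavy assumptions: the “compiler” only checks a single
identifier, and <code>toy_compiler</code>, <code>toy_compiler_for_comparison</code> and the
three wrapper types are hypothetical stand-ins, not anything from a real compiler.</p>

```rust
#[derive(Debug, Clone, PartialEq)]
struct CompiledProgram(String);

#[derive(Debug, Clone, PartialEq)]
struct ErrorMessage(String);

#[derive(Debug, Clone, PartialEq)]
struct Warning(String);

/// Toy "compiler": the whole program is one identifier to declare.
/// Invalid identifiers are errors; capital letters are only a warning.
fn toy_compiler(program: &str) -> (Result<CompiledProgram, Vec<ErrorMessage>>, Vec<Warning>) {
    let name = program.trim();
    let mut warnings = Vec::new();

    let starts_ok = name
        .chars()
        .next()
        .map_or(false, |c| c.is_ascii_alphabetic() || c == '_');
    let rest_ok = name.chars().all(|c| c.is_ascii_alphanumeric() || c == '_');
    if !(starts_ok && rest_ok) {
        let error = ErrorMessage(format!("`{}` is not a valid identifier", name));
        return (Err(vec![error]), warnings);
    }

    if name.chars().any(|c| c.is_ascii_uppercase()) {
        warnings.push(Warning(format!(
            "variable `{}` should have a snake case name",
            name
        )));
    }
    (Ok(CompiledProgram(format!("declare {}", name))), warnings)
}

/// The binary error view of the same compiler: warnings are stripped and
/// all errors are flattened into "did it compile or not?"
fn toy_compiler_for_comparison(program: &str) -> Option<CompiledProgram> {
    let (result, _warnings) = toy_compiler(program);
    result.ok()
}
```

<p>Through the wrapper, <code>Foo</code> and <code>foo</code> are equally “valid,” and an empty
program is indistinguishable from one that starts with a digit.</p>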
<p>It’s as if the compiler had two jobs: compiling the Rust programming
language (defined as including a binary distinction between valid and
invalid programs) and separately a linter, which tells you the
compiler-writer’s opinions about what might be considered wrong with
the code.</p>
<p>And this is a self-consistent way to think about Rust and about
programming languages. It has practical applications: It gives you a
definition of when two compilers implement the “same” programming
language, and it allows you to define a formal specification for
Rust – or to imagine an abstract formal specification, if you so choose,
and use this notion to think about how your Rust code might fare under
alternative implementations of the programming language.</p>
<h2 id="alternatives-to-the-binary-error-model">Alternatives to the “Binary Error Model”</h2>
<p>There is no coherent way to say that this way of thinking about Rust is
wrong, per se. It is a philosophical perspective, a definition of what
concepts (like type safety) are part of the “programming language” and
the “programming language specification” (even if none has been written)
and what concepts are not, what concepts (like using snake case) are just
opinions and conventions outside of the scope of the programming language.</p>
<p>But on the other hand, we are not forced to assume this model. As it is
a definition of what is part of the “programming language,” we are free
to use a different operating definition. As it is a scope for what
goes in the “programming language specification,” the Rust community
is free to write a formal specification with different scope.</p>
<p>And I think we should, when that time comes, use a different scope.
I think that when the people in charge of writing the spec come to it, they
will use a different scope rather than strictly following the definitions
explained here. Because even though the “binary error model” isn’t wrong,
per se, I think it is, nevertheless, harmful.</p>
<p>Not only do I think that a formal Rust specification, if one is written, should
not use this model; I also think people should not assume this model. I
think it will lead to mistakes in your thinking. I also think that,
if you do assume this model specifically in Rust, you have to do a lot
more mental work that can be saved by adopting a different model.</p>
<p>So what’s the alternative? Well, our original definition of a programming
language did two things. It determined if the program was valid (a binary
up-down decision), and it mapped each valid program to its semantics.</p>
<p>An alternative model would not make validity so binary. If we do this
in the most straight-forward way, we get something like this:</p>
<pre tabindex="0"><code>fn rust_programming_language(program: SourceTree) ->
(Result<Semantics, Vec<ErrorMessage>>, Vec<Warning>);
</code></pre><p>This loses a few of the nice properties that we had in the previous
definition. “Valid Rust programs” is no longer a straight-forward set.
Instead, we have a potential multiplicity of sets distinguished by this
definition:</p>
<ul>
<li>Programs that compile</li>
<li>Programs that compile without warnings</li>
<li>Programs that compile without a specific warning we may care about</li>
<li>Programs that don’t compile but only have one error</li>
<li>Programs that don’t compile but only have one category of error</li>
</ul>
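<p>Each of those sets can be read as a predicate over the richer result. The
sketch below assumes a simplified, hypothetical <code>Outcome</code> type in which the
semantics are elided and errors and warnings are reduced to small enums; none of
these names come from a real compiler.</p>

```rust
/// Hypothetical warning categories, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Lint {
    NonSnakeCase,
    UnreachableCode,
}

/// Hypothetical error categories, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ErrorKind {
    UndeclaredVariable,
    TypeMismatch,
}

/// What the non-binary model reports for one source tree.
/// The semantics of a successful compile are elided as `()`.
struct Outcome {
    result: Result<(), Vec<ErrorKind>>,
    warnings: Vec<Lint>,
}

impl Outcome {
    /// Programs that compile.
    fn compiles(&self) -> bool {
        self.result.is_ok()
    }

    /// Programs that compile without warnings.
    fn compiles_cleanly(&self) -> bool {
        self.compiles() && self.warnings.is_empty()
    }

    /// Programs that compile without a specific warning we care about.
    fn compiles_without(&self, lint: Lint) -> bool {
        self.compiles() && !self.warnings.contains(&lint)
    }

    /// Programs that don't compile but only have one error.
    fn fails_with_single_error(&self) -> bool {
        matches!(&self.result, Err(errors) if errors.len() == 1)
    }

    /// Programs that don't compile but only have one category of error.
    fn fails_with_single_category(&self) -> bool {
        match &self.result {
            Err(errors) => errors
                .first()
                .map_or(false, |first| errors.iter().all(|e| e == first)),
            Ok(()) => false,
        }
    }
}
```

<p>None of these sets is privileged the way “valid Rust programs” is in the binary
model; which one matters depends on the question being asked.</p>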
<p>Also, this definition imposes more on the writers of alternative
implementations. Suddenly, a compiler is only a valid Rust compiler
if it outputs the exact same list of errors and warnings, given an
input program.</p>
<p>This seems to me a little too strict. I don’t think the exact wording
of an error or warning should necessarily matter or be part of a programming
language spec. And compilers regularly stop compiling after experiencing
too many errors (where too many can sometimes be one), and implementations
would reasonably differ about <em>which</em> errors they would output before
giving up.</p>
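<p>One way to loosen the requirement, sketched here purely as an assumption about
what a spec <em>could</em> do, is to compare warnings by which rule fired rather than
by their text. The <code>Warning</code> struct, <code>lint_set</code> and
<code>agree_modulo_wording</code> are hypothetical names.</p>

```rust
use std::collections::BTreeSet;

/// A warning split into the rule that fired and the human-readable text.
struct Warning {
    lint: &'static str,
    message: String,
}

/// Render a warning roughly the way a compiler might print it.
fn render(warning: &Warning) -> String {
    format!("warning: {} [{}]", warning.message, warning.lint)
}

/// The set of rules that fired, ignoring wording, order, and repetition.
fn lint_set(warnings: &[Warning]) -> BTreeSet<&'static str> {
    warnings.iter().map(|w| w.lint).collect()
}

/// Two implementations "agree" on a program if the same rules fired,
/// however each of them chose to phrase the message.
fn agree_modulo_wording(a: &[Warning], b: &[Warning]) -> bool {
    lint_set(a) == lint_set(b)
}
```

<p>Under this looser criterion, two compilers can phrase the snake case complaint
however they like, as long as both of them make it.</p>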
<p>But I think it’s a good starting point, and in any case much better than
the binary-error <code>Option<Semantics></code> model. Part of the benefit of Rust
as a programming language is how much work has gone into its warnings.
For an alternative implementation to claim to be Rust without having the
same warning system would strike me as extremely misleading. Warnings –
obligatory warnings – should be included in any language spec.</p>
<p>Many important Rust safety features are actually warnings. Ignoring a
<code>#[must_use]</code> value is technically only a warning (the <code>unused_must_use</code> lint), though it can be
raised to <code>#[deny]</code>. A function that has dead code after a <code>return</code> statement:
this is a warning, but also a serious correctness issue.</p>
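<p>As a small illustration of the <code>#[must_use]</code> case, here is a sketch with a
made-up checksum function; <code>checksum_ok</code>, <code>handle</code> and
<code>deliberately_ignored</code> are hypothetical names. The code as written produces
no warning; the comment marks where one would appear.</p>

```rust
/// Hypothetical check whose result should not be silently dropped.
#[must_use = "the checksum result says whether the data is intact"]
fn checksum_ok(data: &[u8], expected: u8) -> bool {
    data.iter().fold(0u8, |acc, byte| acc.wrapping_add(*byte)) == expected
}

fn handle(data: &[u8], expected: u8) -> &'static str {
    // Writing `checksum_ok(data, expected);` as a bare statement here would
    // still compile, but rustc would report the `unused_must_use` lint.
    if checksum_ok(data, expected) {
        "intact"
    } else {
        "corrupted"
    }
}

fn deliberately_ignored(data: &[u8]) {
    // `let _ =` is the conventional way to say "discarded on purpose".
    let _ = checksum_ok(data, 0);
}
```

<p>The program means the same thing whether or not the result is checked, which
is why a model that only tracks validity and semantics has nothing to say about it.</p>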
<h2 id="rust-warnings-are-complicated">Rust Warnings are Complicated</h2>
<p>And of course any Rust implementation would have to include warnings.
Just as C (in practice) has a <code>#warning</code> directive,
which causes the compiler to issue warnings, Rust has a number of annotations
that control the issuance of warnings.</p>
<p>For example, if we add an annotation to our function from before:</p>
<pre tabindex="0"><code>#[deny(non_snake_case)]
fn foo() -> i32 {
let Foo = 3;
Foo
}
</code></pre><p>… the warning becomes an error:</p>
<pre tabindex="0"><code>error: variable `Foo` should have a snake case name
--> test.rs:3:9
|
3 | let Foo = 3;
| ^^^ help: convert the identifier to snake case (notice the capitalization): `foo`
|
note: the lint level is defined here
--> test.rs:1:8
|
1 | #[deny(non_snake_case)]
| ^^^^^^^^^^^^^^
error: aborting due to previous error
</code></pre><p>Any Rust specification, even one with the binary error model,
would therefore have to include:</p>
<ul>
<li>The rules about snake case (variables should have snake case)</li>
<li>The rules about annotations (so that <code>#[deny(...)]</code> triggers an error)</li>
</ul>
<p>This means that, even if we did imagine a specification where only
errors were in scope, rules for warnings would have to also be in that
specification, because they can be configured to become errors. And at
that point, why not also specify in the specification that the warnings
are obligatory?</p>
<p>Especially because we can also say <code>#[warn(...)]</code> as a tolerance level
for these configurable rules. What do we say about <code>#[warn(...)]</code>
in the spec if warnings are out of scope?</p>
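<p>For completeness, here is the third lint level, <code>#[allow(...)]</code>, applied to
the two warning examples from earlier. The names <code>foo_snake</code> and <code>bar</code> are
mine. As far as I can tell this compiles without any diagnostics, and the functions
behave exactly as they would at any other lint level.</p>

```rust
/// The snake case example, with the lint explicitly tolerated.
#[allow(non_snake_case)]
fn foo() -> i32 {
    let Foo = 3;
    Foo
}

/// The same function written the way the lint would prefer.
fn foo_snake() -> i32 {
    let foo = 3;
    foo
}

/// The unreachable statement example, with its lint explicitly tolerated.
#[allow(unreachable_code)]
fn bar() -> u32 {
    return 3;
    println!("Not reached!");
}
```

<p>So the same source text moves between silent, warned about, and rejected
depending only on an annotation, and a specification has to describe all three
outcomes to describe the annotation at all.</p>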
<h2 id="the-other-side-of-the-binary">The Other Side of the Binary</h2>
<p>Now that I’ve criticized the “binary error” model from the warnings side,
I also want to address the notion that all errors are created equal.
Errors are different from each other.</p>
<p>First off, there’s an obvious distinction between syntax errors and
semantic errors. This is kind of boring and obvious, but it adds some
nuance into the idea that invalid Rust is simply “not Rust,” and
it comes up in practice sometimes.</p>
<p>As I write my code, I sometimes run <code>cargo fmt</code> as part of my editing
workflow. Usually, this helps me read my own code better for further
editing, and usually, this works even if my code is full of errors –
it might even help me find and understand the errors. But sometimes,
my code has a relatively superficial, syntactic error, like a missing
<code>}</code>, and <code>cargo fmt</code> can’t even help me. This sends me into a little
bit of a panic, but I’m usually also glad I didn’t keep working longer
with such a problem.</p>
<p>If a Rust specification wanted to include formatting tools in its scope,
it could conceivably make a formal distinction between syntax and semantic
errors.</p>
<p>More interesting, however, are errors that don’t have to be errors,
where the compiler could keep compiling, but it chooses not to.</p>
<p>We have the obvious example, where the error is configurable, where
it’s actually a warning that’s just been set to <code>#[deny(...)]</code> as a
lint level.</p>
<p>But we also have things like lifetime errors, which cannot be disabled. Or
the rule against dereferencing a pointer outside an <code>unsafe</code> block. The Rust
compiler could, if it wanted to, simply allow those things. We could
do something like:</p>
<pre tabindex="0"><code>#[unsafe_allow(lifetime_mismatches)]
</code></pre><p>The compiler would then output a program, which would then exhibit
undefined behavior – or not. It would then be potentially unsound –
or not.</p>
<p>This is not included in Rust, but it’s theoretically possible, unlike
referring to a variable that doesn’t exist, where there is no reasonable
interpretation of what the code should do.</p>
<p>On the border are things that C++ allows but that are arguably as nonsensical
as referring to a variable that doesn’t exist. If a function returns
<code>u32</code>, and you reach the end of the function without returning a value, that’s nonsensical, right?</p>
<pre tabindex="0"><code>fn foo() -> u32 { }
</code></pre><p>But depending on the ABI, you can just not output the code that sets
the return value, and perhaps not even output the code that returns from
the function. This is definitely undefined behavior, but C++ will often
allow it, sometimes without even a warning.</p>
<h2 id="unsafety-as-always-on-warnings">Unsafety as Always-On Warnings</h2>
<p>As an aside, the <code>#[allow]</code> and <code>#[deny]</code> annotations are very
similar to how <code>unsafe</code> works. We could imagine an alternative world
where there was no <code>unsafe</code> keyword for blocks. Instead of writing:</p>
<pre tabindex="0"><code>unsafe { *ptr }
</code></pre><p>… we could instead imagine a Rust where this is written as:</p>
<pre tabindex="0"><code>#[allow(unsafe)]
*ptr
</code></pre><p>Basically, operations like dereferencing a raw pointer (<code>*ptr</code>) are
disallowed in Rust by default, but can be allowed. They are
disallowed because, like code that triggers a warning, they are
indications that the programmer likely made a mistake. But as with
code that triggers a warning, the programmer can make explicit
that they are using the construct on purpose.</p>
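<p>In real Rust, that explicit opt-in looks like the following minimal sketch.
The function name <code>read_through_raw_pointer</code> is hypothetical. Without the
<code>unsafe</code> block, the dereference is rejected outright (error E0133) rather than
warned about.</p>

```rust
/// Reads a value back through a raw pointer derived from a reference.
fn read_through_raw_pointer(value: &u32) -> u32 {
    // Creating a raw pointer is allowed in safe code; only the
    // dereference below needs the explicit opt-in.
    let ptr: *const u32 = value;

    // SAFETY: `ptr` was just derived from a live shared reference, so it is
    // non-null, properly aligned, and points to an initialized `u32`.
    unsafe { *ptr }
}
```

<p>The parallel with lint levels goes one step further: rustc also has an
<code>unsafe_code</code> lint, allowed by default, which a crate can raise with
<code>#![forbid(unsafe_code)]</code> to reject <code>unsafe</code> blocks entirely.</p>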
<p>The fact that <code>unsafe</code>/safety, one of Rust’s core features, works
in a way very similar to warnings should make us take seriously
the importance of warnings. It would have been just as valid
from a safety point of view to use literally the same mechanism
with <code>#[allow]</code> and <code>#[deny]</code>, but I
think safety is such an important category of possible mistakes that
it’s probably for the best that it has its own special syntax.</p>
<h2 id="take-aways">Take-Aways</h2>
<p>So why am I writing all of this, besides thinking it’s all an
interesting mental exercise?</p>
<p>I don’t think the authors of any future Rust spec would actually err in
such a way as to not discuss warnings at all. But I think it’s important
to understand the theoretical implications.</p>
<p>But I also know that people do think in terms of the hypothetical Rust
specification which only accepts or rejects programs. I recently saw
someone write that capitalization conventions, such as snake case, are not
part of the Rust programming language. They meant that according to the
“binary error” model that we discussed above, which they implicitly
subscribed to, using snake case or not will never change whether your
program is a valid Rust program, and therefore, the entire convention
is not part of Rust.</p>
<p>But even if we ignore the fact that a Rust compiler needs to know about this
convention in case <code>#[deny]</code> is used, this assumes a definition of
Rust programming language and Rust specification that uses the “binary
error” model.</p>
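<p>As a small sketch of how the compiler treats the case convention as a configurable lint, the snippet below raises <code>non_snake_case</code> to an error for one module and then explicitly allows a single exception. The module and function names are invented for illustration.</p>

```rust
// Inside this module, violating the snake case convention is a hard error.
#[deny(non_snake_case)]
mod strict {
    pub fn well_named() -> u32 {
        1
    }

    // An explicit opt-out: without this attribute, the function below would
    // fail to compile because of the `deny` on the enclosing module.
    #[allow(non_snake_case)]
    pub fn legacyName() -> u32 {
        2
    }
}

fn main() {
    println!("{}", strict::well_named() + strict::legacyName());
}
```

<p>The point is only that the compiler has to understand the convention in order to honor either attribute.</p>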
<p>And while that is one way to think about Rust, it’s not
a very good one, and I would say it’s not a very useful one.
And more fundamentally, you don’t have to. You don’t have to use
this philosophical framework where only rules that cause compilation
failures are part of the programming language.</p>
<p>So I don’t think it’s fair to say “in Haskell, variable name case is
significant, and in Rust, it is not, it is only a convention and not part
of the programming language.” I think it’s more fair to say “in Haskell,
case conventions are mandatory, violations are errors, and they are used
to disambiguate the syntax. In Rust, they are non-fatal warnings by
default, and the compiler can still process Rust with incorrect case,
and in some situations has to.” Or, more simply, “in Haskell, case
convention violations are errors, but in Rust, they’re just warnings.”
But, in both Haskell and Rust, capitalization conventions are part of
the programming language. In both, the compiler has to know about them,
and enforces them in at least some situations.</p>
<p>This may seem like a nitpick, but I think using definitions of
“programming language” vs “convention” can make the “convention” stuff
seem less important than it should be. I think that if you think that
way, and were writing an alternative implementation of Rust, you might
give yourself permission to not care about the warnings. You might be
less likely to add a policy to use <code>-Werror</code>, or require <code>clippy</code> to
pass in your CI.</p>
<p>If someone with that attitude were writing the language spec – which I
don’t think they would be, but if they were – they might underspecify
the things that make Rust the useful tool that it is. Programming is
about contracts, and as far as I’m concerned, the warnings are part of
the Rust compiler’s contract. And a compiler should not be allowed to
call itself a Rust compiler if it doesn’t follow it.</p>
<p>Snake case for variables is part of the Rust programming language, as I
define the Rust programming language, and – I think – as most of the
community defines it. Certainly it is part of “the Rust programming
language” as that phrase is used in common parlance, and it is one
of many features that make Rust special.
If there is to be a specification, it should
be part of the Rust specification. I understand that if you use the
“binary error” model to define what a programming language is, and what
a programming specification should be, you don’t get this result. But
I just don’t think you should be using that model, and I think it does
matter whether you do, even though it is a philosophical perspective
that cannot be disproven.</p>
<p>Of course, Rust will probably not ever have a single monolithic
specification mediated by ISO or an equivalent. It will certainly
continue to be a community organization, with many different standards
and specifications, perhaps one for a compiler with basic features,
another for a compiler that fully supports errors, another for formatters
like <code>cargo fmt</code>. Each of these specifications will delineate
different sets of source trees: source trees with the syntax of Rust,
source trees without errors, source trees without warnings, etc.</p>
<p>Just like the notion of a “programming language” doesn’t have to
be a single set, a single binary between valid and invalid, the notion
of a specification also needn’t be so monolithic.</p>
Haskell Error Messages: Come on! https://www.thecodedmessage.com/posts/haskell-gripe/ 2022-02-16T00:00:00+00:00
<p>I am a big fan of strongly typed languages, and my favorite GC’d language
is Haskell. And I want you, the reader, to keep that in mind today.
What I am writing is some commentary about a language I deeply love,
some loving criticism.</p>
<p>So here’s what happened: A few days ago, I was showing off some Haskell
for a friend who primarily programs in Python. The stakes were high
– could I demonstrate that this strange language was worth some
investigation?</p>
<p>My primary focus was on infinite lists, and defining <code>fibonacci</code> as a
<a href="https://wiki.haskell.org/The_Fibonacci_sequence#Using_the_infinite_list_of_Fibonacci_numbers">recursive data structure</a>
– all fun things to show off Haskell’s laziness.
But at some point, we wrote an expression by accident that had a type
error in it, and so we got to see how the compiler treated such things.
I don’t remember the exact expression – it was deep in context – but
the problem was I was trying to add an integer to a list. Something
analogous to <code>1+[2,3]</code>.</p>
<p>Now, in some <a href="https://en.wikipedia.org/wiki/JavaScript">“weakly typed” languages</a>,
<a href="https://xkcd.com/1537/">this sort of thing</a> is actually allowed, as
a colleague of mine recently pointed out:</p>
<pre tabindex="0"><code>[jim@palatinate:~]$ node
> 1+[2,3]
'12,3'
</code></pre><p>This is, of course, hilarious. But! We shouldn’t paint “weakly typed”
languages with such a broad brush. In my friend’s native Python, it
would have been an error, as it should be. It is a run-time error, but
what does that matter when you’re working in an interpreted language,
writing ad hoc scripts? The important thing is that failure is
recognized as failure, and it doesn’t try to
<a href="https://en.wikipedia.org/wiki/Fail-fast">continue with nonsense</a>:</p>
<pre tabindex="0"><code>[jim@palatinate:~]$ python3
Python 3.8.10 (default, Nov 26 2021, 20:14:08)
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 1+[2,3]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'list'
</code></pre><p>This is an error message. It’s even a pretty decent error message.
There are many things you can pass to the <code>+</code> operator in Python,
but an <code>int</code> and a <code>list</code> together are not among them.</p>
<p>So now, what did Haskell do, this language that I’m trying to show off?
Well, unfortunately, my friend didn’t see the actual problem in the code,
but was first made aware of it from the compiler’s error message. And
if you’ve ever done this before in Haskell, you’re probably wincing right
now, because you know what this error message is:</p>
<pre tabindex="0"><code>[jim@palatinate:~]$ ghci
GHCi, version 8.6.5: http://www.haskell.org/ghc/ :? for help
Prelude> 1+[2,3]
<interactive>:1:1: error:
• Non type-variable argument in the constraint: Num [a]
(Use FlexibleContexts to permit this)
• When checking the inferred type
it :: forall a. (Num a, Num [a]) => [a]
</code></pre><p>Now, my friend didn’t understand this error message at all.
Since I was in Demonstration Mode, my instinct was to explain it to him,
but after a few false starts, I realized that this would simply not
help, and pointed out that you couldn’t add integers to lists,
and showed him where this was happening (it was a little more
subtle than this example).</p>
<p>Since then, my colleagues and I have been discussing error messages in
Slack, specifically how good Rust’s error messages are, and
how much better they are than Haskell’s. So I had an opportunity to
paste that very bad Haskell error message my friend and I discovered
into the Slack. There, it served as a case study, so we could discuss how
problematically incomprehensible it is, sparking a lot of discussion, from
which I shall try to extract the most interesting parts into this post.</p>
<p>For one, this error message has little to do with the concrete
problem. The problem is – and the error message should say this – that
you can’t add lists. Specifically, in Haskell, you can only add things that
implement the <code>Num</code> typeclass (which lists don’t), and so you’d think the
compiler would be smart enough to mention <em>anywhere</em> in this error message
something along the lines of “expecting <code>[a]</code> to have <code>Num</code> instance,
but it does not.” That’s the <em>actual</em> problem, even if not well-explained.</p>
<p>But instead, <code>ghc</code> tries to assume you meant what you wrote, and figure
out a way in which <code>[a]</code> <em>can</em> have the <code>Num</code> instance. This is where
it fails, and then it gives advice on how to make <em>that</em> succeed.
As my professor-colleague points out, this is dangerous advice, especially
for beginners, because there’s no way that using <code>FlexibleContexts</code>
will actually help in that situation. The problem isn’t that these
lists aren’t numbers in particular, and that you need to only accept
lists that are numbers in your function. The problem is that no lists
are (or at least should be) numbers! But a beginner might just follow
the advice, try to figure out what the hell <code>FlexibleContexts</code> are,
and find themselves in a world of pain, and no closer to solving the
actual problem.</p>
<p>Part of what causes this is the type of <code>1</code> itself. Haskell, unlike
Rust, allows literals like <code>1</code> to be interpreted in any number type.
Given that Haskell (like Rust) has return-type polymorphism, it can directly
express this in the type system:</p>
<pre tabindex="0"><code>Prelude> :type 1
1 :: Num p => p
</code></pre><p>In Rust, this would be something like <code>impl Num</code>. It means that <code>1</code> can
be any type that is <code>Num</code>. Combine that with the fact that <code>+</code> requires
its arguments to be <code>Num</code> and to match (<code>(+) :: Num a => a -> a -> a</code>),
and when we see <code>1+[2,3]</code>, we’re simply left trying to figure out how
<code>[2,3]</code> is <code>Num</code>.</p>
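<p>To make the comparison with Rust a little more concrete, here is a rough sketch of what a polymorphic literal amounts to. The <code>Num</code> trait below is a hypothetical stand-in defined in the snippet itself, not a standard library trait, and the function <code>one</code> plays the role of the literal <code>1</code>.</p>

```rust
// Hypothetical stand-in for Haskell's `Num`, reduced to `fromInteger`.
trait Num {
    fn from_integer(i: i64) -> Self;
}

impl Num for i32 {
    fn from_integer(i: i64) -> Self {
        i as i32
    }
}

impl Num for f64 {
    fn from_integer(i: i64) -> Self {
        i as f64
    }
}

// Plays the role of the literal `1 :: Num p => p`: the type the caller
// expects decides which implementation runs (return-type polymorphism).
fn one<T: Num>() -> T {
    T::from_integer(1)
}

fn main() {
    let a: i32 = one();
    let b: f64 = one();
    println!("{} {}", a, b);
}
```

<p>Any type with a <code>Num</code> implementation can be the result of <code>one()</code>, so the compiler cannot rule a type out just because it looks nothing like a number.</p>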
<p>If we did not have this <em>polymorphic literal</em>, this notion that the
meaning of <code>1</code> is flexible, we would have seen a much more comprehensible
error message. If <code>1</code> meant the same thing as <code>(1::Integer)</code> (or any
arbitrary choice), we’d have this beautiful explanation:</p>
<pre tabindex="0"><code>Prelude> (1::Integer) + [2,3]
<interactive>:4:16: error:
• Couldn't match expected type ‘Integer’
with actual type ‘[Integer]’
• In the second argument of ‘(+)’, namely ‘[2, 3]’
In the expression: (1 :: Integer) + [2, 3]
In an equation for ‘it’: it = (1 :: Integer) + [2, 3]
</code></pre><p>Or even if we just had non-numbers on both sides, we’d similarly
have a better error message:</p>
<pre tabindex="0"><code>[jim@palatinate:~]$ ghci
GHCi, version 8.6.5: http://www.haskell.org/ghc/ :? for help
Prelude> () + [1,2]
<interactive>:1:6: error:
• Couldn't match expected type ‘()’ with actual type ‘[Integer]’
• In the second argument of ‘(+)’, namely ‘[1, 2]’
In the expression: () + [1, 2]
In an equation for ‘it’: it = () + [1, 2]
Prelude>
</code></pre><p>What is my take-away here? I don’t think the compiler has been sufficiently
tweaked when it comes to error messages, or that the Haskell community
cares sufficiently about beginners. Rust as a community
puts a lot of energy into good error messages, so that even though
Rust also has a trait you could add to arrays to make <code>+</code> work,
it still has a better error message:</p>
<pre tabindex="0"><code>error[E0277]: cannot add `[{integer}; 2]` to `{integer}`
--> test.rs:2:7
|
2 | 1 + [2,3];
| ^ no implementation for `{integer} + [{integer}; 2]`
|
= help: the trait `Add<[{integer}; 2]>` is not implemented for `{integer}`
</code></pre><p>But I also think the semantics of <code>1</code> are too liberal, leaving the compiler
in an awkward place. See, the weird thing is, you can declare <code>[2,3]</code>
a number, making <code>1+[2,3]</code> an expression that adds two lists:</p>
<pre tabindex="0"><code>instance Num [a] where
(+) = (<>)
(-) = (<>) -- Eh, why not?
(*) = (<>)
negate = reverse
abs = id
signum = const []
fromInteger i = take (fromInteger i) $ repeat undefined
main = do
print $ signum $ 1 + [2,3]
</code></pre><p>Once you’ve defined lists as a number, <code>1</code> is suddenly a list if
it wants to be. And this contributes to the difficulty of finding
the right error message: what you asked for is possible after all.</p>
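<p>For comparison, Rust lets you do something similar for <code>+</code>, though only through a type you own. The sketch below uses a hypothetical newtype <code>List</code> and an arbitrary choice of meaning (prepending the integer); it is an illustration, not a recommendation.</p>

```rust
use std::ops::Add;

// Hypothetical wrapper type; Rust's coherence rules require at least one
// local type in the impl, so a bare `Vec<i32>` or array would not work here.
#[derive(Debug, PartialEq)]
struct List(Vec<i32>);

// Gives `integer + List` a meaning: prepend the integer (an arbitrary choice).
impl Add<List> for i32 {
    type Output = List;

    fn add(self, rhs: List) -> List {
        let mut items = vec![self];
        items.extend(rhs.0);
        List(items)
    }
}

fn main() {
    let sum = 1 + List(vec![2, 3]);
    println!("{:?}", sum);
}
```

<p>The difference from the Haskell instance is that the literal stays an integer here; only the meaning of <code>+</code> for this pair of types is new.</p>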
<p>And in the end, this leaves me with the feeling that Haskell has
this in common with Javascript, and that makes me sad. A polymorphic
enough strongly typed language is no longer strongly typed.</p>
Mortgages are Interesting https://www.thecodedmessage.com/posts/mortgage_interest/ 2022-02-08T00:00:00+00:00
<p><img src="https://www.thecodedmessage.com/images/new_house.jpeg" alt="Mortgage Ceremony"></p>
<p>I just bought a house, and it came with a mortgage. I bought the
house and committed to the mortgage all in one ceremony, in a cute
little office where I signed enough papers that the sellers were
able to solemnly hand me the keys to my new castle. In the lead-up to
this, I was told how early payments, mortgage insurance, and refinancing
work, and it’s – I think very reasonably – been on my mind since.</p>
<p>It was in a university economics class – macroeconomics – where I
first encountered the concept that “paying down debt” was equivalent to
“saving.”</p>
<p>And of course, when economists – and especially undergraduate
econ professors – use words like “equivalent” you have to take it with
a heap of salt. But there’s a lot of truth to it!</p>
<p>Let’s say you get $1,000 from your job. If you pay down a debt with it,
your net worth has just increased by $1,000. If you save it into a savings
account, your net worth has also just increased by $1,000. But if instead
you <em>spend</em> it, say, on rent, your net worth does not increase at all.</p>
<p>It also interacts with interest in a similar way. Let’s say your debt
and your savings account are both with Easy Round Numbers Credit Union
(ERNCU), and both have 10% yearly interest as an annual percentage yield
(i.e. after compounding is already taken into account). Well, if you
save the $1000 now, you will have $1100 in a year. If you pay off $1000
in debt now, you’ll have $1100 less to pay in a year. If you were, for
whatever reason, planning on paying off the entire debt, whatever it was,
in a year’s time, this is an exactly identical situation. Even if not,
though, it’s still the same net worth, and macroeconomically “equivalent.”</p>
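<p>The equivalence can be checked with a few lines of arithmetic. This is a sketch under assumptions of my own choosing: amounts are whole cents, the rate is in basis points, and the starting debt of $5,000 is a made-up figure that does not appear in the example above.</p>

```rust
// Grows an amount (in cents) by one year of interest, with the rate given in
// basis points (1000 basis points = 10%).
fn after_one_year(cents: i64, rate_bps: i64) -> i64 {
    cents * (10_000 + rate_bps) / 10_000
}

// Net worth after one year: savings minus debt, both growing at the same rate.
fn net_worth_after_one_year(savings: i64, debt: i64, rate_bps: i64) -> i64 {
    after_one_year(savings, rate_bps) - after_one_year(debt, rate_bps)
}

fn main() {
    let rate_bps = 1_000; // 10% APY, as in the ERNCU example
    let starting_debt = 500_000; // hypothetical $5,000 debt, in cents
    let windfall = 100_000; // the $1,000 from the example, in cents

    // Option 1: put the $1,000 in savings and leave the debt alone.
    let if_saved = net_worth_after_one_year(windfall, starting_debt, rate_bps);
    // Option 2: use the $1,000 to pay down the debt instead.
    let if_paid_down = net_worth_after_one_year(0, starting_debt - windfall, rate_bps);

    println!("saved: {} cents, paid down: {} cents", if_saved, if_paid_down);
}
```

<p>Both options end the year at the same net worth, which is all the “equivalence” claims.</p>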
<p>Interest rate is important. To maximize long-term net worth, you would
want to put excess money in the place with the maximum interest rate.
This is common financial advice for loans: After paying all your minimum
payments, spend the extra money on the loan with the most interest.
And, though it’s less commonly brought up, it works for saving too: If
you can save money in a place where it grows faster than the loan does, put it
there. Keep that loan going, and pay it off only when you have to. So
if you have a savings account that pays more interest than the loan charges, use it.</p>
<p>So considering this “equivalency,” I perhaps have a valuable opportunity.
Instead of thinking of this as a loan, a burden to pay back, I should
think of this as a very special savings account, with 3.125% guaranteed
interest, more than the best CDs out there (which are currently hovering
around 1%ish). If I have some excess cash, I can use it to increase my
equity in the house, and I’ll save that much, plus 3.125% compounded,
in future payments.</p>
<p>“Wow,” I said to myself (and to a friend), “Maybe I could just take
advantage of this, and sell CD’s. I could have people lend me money –
sorry, <em>deposit</em> their money with me – at 3%, better than they’d be
able to get at a bank, and use it to pay down the mortgage faster.
After all, saving and paying down debt is equivalent, right?”</p>
<p>And then I thought to myself, wait, how would I pay them back?
Let’s say I get a customer, John Doe. Mr Doe wants to leave his $1,000
with me, and get paid $1,030 in a year. That’s 3% interest, and less
than my 3.125% interest rate, so I should be able to make this worth
my while. And it’s better than what Mr Doe can get anywhere else, so
that should make it worth his while!</p>
<p>So I take his $1,000, and I pre-pay my mortgage with an extra $1,000
payment. My principal goes down by $1,000. The amount of interest I
have to pay in the following year, therefore, goes down by that times
my interest rate. I then pay $31.25 less in interest that year, take
the extra money I would have paid in interest, give $30 of it back to
Mr Doe, and keep the extra $1.25 as profit. Woo!</p>
<p>But already there’s a problem. It’s true that I would have to pay $31.25
less in interest that year, yes. But I can’t take that money and give it
back to Mr Doe. I have to keep paying the same total monthly payments,
and whatever now doesn’t go to interest, must go to paying the principal
down faster. I simply don’t have the flexibility to pay it to Mr Doe
(and myself) instead; I have to wait until the end of the mortgage term,
in 30 years, when I’ve paid it off early, to see the money.</p>
<p>But even setting that problem aside, I’m assuming that once I give
Mr Doe the interest, he’ll want to leave the $1,000 in for another year.
But what if he wants his money back? What if inflation continues or the
Fed just feels like raising interest rates, and now there’s CD’s at 4%
available elsewhere? Well, in order to get the principal back (and
this would help with the interest too), I’d have to increase the loan
amount on my house back up again, and take-backsies the pre-payment.
Given that this is a fixed-rate mortgage and not a line of credit, this
would require refinancing – which, given that in this scenario interest
rates are higher, is not going to be fun or affordable for me.</p>
<p>So what financial product can I sell Mr Doe and his friends? I can sell
a very long-term CD, that matures in 30 years. Once his money gets used
as my early repayment, he has to wait until I see it again – when my
mortgage ends early because of it. Then, instead of paying the bank back
for the last however-many payments, I pay him the amount I would have
paid the bank, and he should get his money back with 3.125% interest –
or rather, with 3% interest after I also take out my cut. Mr Doe pays
$1000 now, and I pay him $2427.26 after 30 years, for $89.94 profit on
my part.</p>
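<p>Those two figures come from ordinary compound interest. Here is a minimal sketch of the calculation, assuming simple annual compounding at each rate (a real mortgage amortizes monthly, so treat this as an approximation); the function name <code>future_value</code> is my own.</p>

```rust
// Future value of `principal` after `years` of annual compounding at `rate`
// (for example, 0.03 for 3%).
fn future_value(principal: f64, rate: f64, years: i32) -> f64 {
    principal * (1.0 + rate).powi(years)
}

fn main() {
    let deposit = 1000.0;
    // What the pre-payment is worth to me after 30 years, at the mortgage rate.
    let saved_on_mortgage = future_value(deposit, 0.03125, 30);
    // What I owe Mr Doe after 30 years, at the CD rate.
    let owed_to_doe = future_value(deposit, 0.03, 30);

    println!("owed to Mr Doe: {:.2}", owed_to_doe);
    println!("my cut: {:.2}", saved_on_mortgage - owed_to_doe);
}
```

<p>The eighth of a percentage point between the two rates is the whole margin, which is why the cut is so small relative to the 30-year payout.</p>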
<p>So, it looks like having a loan at 3.125% is indeed like having a savings
opportunity at 3.125% – as long as it’s a 30-year CD. Problem is,
who wants a 30-year CD? Would I buy a 30-year CD at 3.125%? I mean,
sure, the interest is enticing, and much better than I can get in a
savings account.</p>
<p>But at 30 years, the only thing I could practically use this CD for is
retirement – and I already have a retirement account, which I expect to
do better than 3.125% a year. The 3.125% is a guaranteed return, that is
true, but 30 years is a long way away. On average, the stock market,
while not guaranteed, has
<a href="https://www.nerdwallet.com/article/investing/average-stock-market-return">historically tended to pay 10%
returns</a>.
While it might result in bad years, 30 years is plenty of time for the
bad years to be beaten out by the good ones.</p>
<p>And it’s not like my monthly payments would go down if I created such a
CD – that would require a refinancing. The bank is actually creating CDs
(and other financial assets) out of this mortgage, and in exchange for
predictability, they don’t want to lower the monthly payments, which is in
my mind fair enough, as I get to lock in this low interest rate. But even
if I pay half the mortgage in a single payment, unless I refinance, my
monthly minimum payment remains the same until the mortgage is paid off.</p>
<p>And honestly, some borrowers – those with certain subprime mortgages –
have it worse. Some mortgages come with a pre-payment penalty, where
paying the principal ahead of time actually accrues fees. For me, it
does decrease the principal – I just have to wait until the end of the
mortgage to reap the practical benefit.</p>
<p>Again, unless I refinance. In a refinance, I pay the entire loan off, all at once, and
replace it with another loan. This makes sense if the interest rate is
lower, or even if it’s the same and I want to lower the monthly payment
or take money out of my equity. If I could know I could refinance at any
time at the same interest rate, then I could sell (and provide myself
with) shorter-term savings opportunities.</p>
<p>Refinancing is similar to selling a mortgage-sized group of CDs to a bunch
of interested John Does… except it’s to a bank, the only organization
actually interested in such a product, and therefore at the market rate
for mortgages, rather than at the lower market rate for CDs. Part of what
makes a bank a viable business is that they provide longer-term loans than
any normal individual would be interested in, in exchange for an upcharge
on interest rates. They then take on some amount of risk and some amount
of overhead to turn that loan into more reasonable (i.e. shorter-term
than 30 year) CDs, or even checking and savings accounts.</p>
<p>So if a mortgage doesn’t give me a super-special savings opportunity
for my excess money, why am I getting a mortgage? Well, I have to live
somewhere, and I can’t afford a house outright, and the alternative
is renting. Monthly mortgage payments are comparable, in their size,
with rent payments, but even with the default monthly mortgage payment,
the principal’s still going down.</p>
<p>Put another way: Even though I’m not interested in putting my disposable
income into a 30-year 3.125% CD, I’d much rather put my monthly housing
expenses into such a CD than pay them to some landlord, never to be seen
again. This adds up – in 30 years, I will definitely have the entire
principal of my mortgage paid off, which means I will no longer have to
pay the mortgage payments. If things go well, my house will have also
appreciated, in case I want to borrow money against home equity. If I’d
rented this entire time, I’d still have to pay rent, which the landlord
could increase at any time. The “free rent” from my mortgage will be a
valuable asset in my later years, and one that I basically don’t have
to pay any extra for above getting a rental.</p>
<p>And mortgages are honestly also convenient. If I owned the property
outright, I’d have to pay all the different forms of property tax myself,
and also my homeowner’s insurance. With a mortgage, I have one monthly
payment, that indirectly pays those things through an escrow account.
And this is honestly much more convenient, especially for someone who
struggles to juggle all the demands of modern life.</p>
Burying the Lede https://www.thecodedmessage.com/posts/buried-lede/ 2022-02-02T00:00:00+00:00
<p><img src="https://www.thecodedmessage.com/images/napoleon.jpeg" alt="The Emperor Napoleon in His Study at the Tuileries - Jacques-Louis David"></p>
<p>Imagine you don’t know who Napoleon was. You know he’s a figure from
history, but you don’t even know he has to do with France. And imagine,
when you read the Wikipedia article, for some reason you skip the opening
paragraphs above the fold, and you’re reading about his upbringing
in Corsica as a petty Italian noble under French rule. And you just
want to know, why’s this guy important, what’s his deal, why do people
keep talking about him (something military, it seems?) but you have
to read two-thirds of the way through the article to find out, oh, he
became <em>Emperor of the French</em>. Finally, you have context to understand
everything else, and you now know the first thing about Napoleon.</p>
<p>This is how I feel reading technical documentation, not all the time,
but a fair amount of the time. I hear about a new project, and I get
documentation about it, and it immediately goes into the weeds.
What is a Foo? Is it a library, an application, a command-line utility?
It’s a “framework” – that word means several things, but I’m guessing
it means it’s a library that dictates how an application is written?
Who would use this framework? What problems does it solve for them?
Oh, I see it uses a bunch of technologies to solve the problem,
but what problem is it trying to solve, exactly?</p>
<p>Tech writing isn’t the only time this comes up, of course. The phrase
“burying the lede” is famous from journalism, the canonical example being
an article about a fire, where the article goes into details about how
it might have been started, it talks about how this is the third fire
this year, does a brief profile about the firefighter who put it out,
and then finally, at the end of the article, mentions with gravity and
somber respect, the two children who were so unfortunately lost, burnt to
a crisp. Finally, by the end of the article, the studious reader knows
why it’s the talk of the town, and the other reader, who only read the
first two paragraphs, will say something insensitive at the local pub
that night and look like quite the asshole.</p>
<p>For another example: I remember being a high schooler in the early aughts,
and hearing, as was often discussed then, about Halliburton. What,
exactly, was Halliburton, and what did they do as a business? Democratic
Congressmen complained about them so much, but what, actually, was the
company, and what business did they have in Iraq? Literally, what were
they doing there? As in, what were they doing while they were there?</p>
<p>And so, a friend and I decided we’d look at the Halliburton website.
And besides an option to sign up to pay $300 to be some sort of member
of something, and a careers page that shed no light on our question,
there simply wasn’t very much content to go off of. Lots of testimonials
and statements about how they were simply the best in the business – and
clearly it had something to do with oil, though this had to be gleaned
as even that was not explicitly stated – and nothing, absolutely nothing,
about what exactly the business was. My friend said they did “consulting”
on oil fields, but I didn’t know what that meant then, and even now,
“consulting” is one of my least favorite terms on account of how vague
it is.</p>
<p>Of course, what we should have done is gone to Wikipedia. I don’t
know what it said then, but <a href="https://en.wikipedia.org/wiki/Halliburton">now</a>
it has a pretty darn helpful second sentence, opening with:</p>
<blockquote>
<p>Halliburton Company is an American multinational corporation. In 2009,
it was the world’s second largest oil field service company. It has
operations in more than 70 countries</p>
</blockquote>
<p>“Oil field service” is much more evocative and specific than anything
it said on the Halliburton website at the time (or even now). And it
also is a link, to a
<a href="https://en.wikipedia.org/wiki/List_of_oilfield_service_companies">list</a>
of such companies, which doesn’t explain much about them but at least
tells us the first thing:</p>
<blockquote>
<p>This is a list of oilfield service companies – notable companies that
provide services to the petroleum exploration and production industry
but do not typically produce petroleum.</p>
</blockquote>
<p>Of course, it’s still unclear what those services actually <em>are</em>. Do
they build equipment, install equipment, run trainings for the
people who actually drill it? Do they help staff the oil fields?
Lots of actual questions. But I’m a lot further ahead than I would’ve
been just looking at the Halliburton website. Not very much, but a
lot further.</p>
<p>Actually, if anyone in my readership can give me any insight on what
Halliburton actually, specifically does, please let me know in the
comments. Thank you for indulging me in this anecdote about the petty
frustrations of my youth.</p>
<p>Now back to the topic.</p>
<p>Let’s look at a better, cleaner example of a lede, what
Wikipedia <a href="https://en.wikipedia.org/wiki/Napoleon">actually has to say</a>
about Emperor Napoleon in the very first paragraph:</p>
<blockquote>
<p>Napoleon Bonaparte (born Napoleone di Buonaparte; 15 August 1769 – 5 May
1821) was a French military and political leader who rose to prominence
during the French Revolution and led several successful campaigns during
the Revolutionary Wars. He was the de facto leader of the French Republic
as First Consul from 1799 to 1804. As Napoleon I, he was Emperor of
the French from 1804 until 1814 and again in 1815. Napoleon dominated
European and global affairs for more than a decade while leading France
against a series of coalitions in the Napoleonic Wars. He won most of
these wars and the vast majority of his battles, building a large empire
that ruled over continental Europe before its final collapse in 1815. He
was one of the greatest military commanders in history, and his wars
and campaigns are studied in military schools worldwide. Napoleon’s
political and cultural legacy has endured, and he has been one of the
most celebrated and controversial leaders in world history.</p>
</blockquote>
<p>That’s a lot to unpack, but I think it’s a fairly solid intro
paragraph! For the record, I did not choose it because I knew
ahead of time that it was going to be a solid intro paragraph.
I just happened to be thinking about Napoleon when I started this
essay, and I trusted Wikipedia to provide a solid introduction to
such a historically important figure, and Wikipedia did not
disappoint.</p>
<p>Which is all to say, this solid quality is pretty standard on
Wikipedia. How do they do it?</p>
<p>It turns out they have some <a href="https://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style/Lead_section">pretty nice
policies</a>,
specifically about the <a href="https://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style/Lead_section#First_sentence">first sentence</a>.</p>
<p>I always appreciate how Wikipedia starts out biographies by saying
the person’s nationality and what jobs or activities they were
known for. Napoleon was a French military and political leader.
<a href="https://en.wikipedia.org/wiki/Antonio_Gramsci">Gramsci</a> was “an Italian
Marxist philosopher, journalist, linguist, writer, and politician.”
This is especially important, because other sources are likely to
assume I know who a famous person is, especially actors and athletes,
neither of whom I’m likely to recognize the name of. You can’t assume
the audience already knows Napoleon was Emperor of the French – or that
Robin Williams was an actor – especially if you’re an encyclopedia! If
you can’t learn such things reading an encyclopedia, where can you?</p>
<p>I also appreciate that Wikipedia explicitly highlights the level of
importance of the figure. It’s not hype when Wikipedia says that Kafka
is “widely regarded as one of the major figures of 20th-century literature”
or that “Napoleon dominated European and global affairs for more than
a decade while leading France against a series of coalitions in the
Napoleonic Wars”; in both cases, it’s something that readers really
ought to know.</p>
<p>All in all, though, this comes down to having a taste for saying
the otherwise obvious. The editors who wrote these lead paragraphs
knew who Kafka and Napoleon and Gramsci were, had a lot more expertise
than the people who’d need these basic facts. But with effort, with the
aid of explicit lists and examples of other articles, they were able
to communicate to the less knowledgeable effectively.</p>
<p>Go and do likewise! Do likewise in your documentation. Explain to me
what your programming project does, who it’s for, what its role is
within your company. In the comments/developer docs: Tell me what the
most important files are, where the main loop is, where you go to
modify various things.</p>
<p>And for goodness’ sake, tell me what problem the darn thing is
trying to solve, and what type of thing it is (e.g. program, library,
framework, specification, protocol).</p>
Being Fair about Memory Safety and Performance
https://www.thecodedmessage.com/posts/unsafe/
2022-01-20T00:00:00+00:00
<p>For this next iteration in my <a href="https://www.thecodedmessage.com/tags/rust-vs-c++/">series</a> comparing
Rust to C++, I want to talk about something I’ve been avoiding so far:
memory safety. I’ve been avoiding this topic so far because I think it
is the most discussed difference between C++ and Rust, and therefore I
felt I’d have relatively little to add to the conversation. I’ve also
been avoiding it because I wanted to draw attention to all the other
little ways in which Rust is a better-designed programming language,
to say that even if you concede to the C++ people that Rust isn’t “truly
memory safe” or “memory safe enough,” Rust still wins.</p>
<h2 id="array-indexing">Array Indexing</h2>
<p>But there is a persistent and persnickety little argument that I wanted to
talk specifically about. This argument is really persuasive on its face,
and so I think it deserves some attention – especially since I am guilty
of having used this argument myself, many years ago when I still worked
at an HFT firm, to claim that C++ had a niche that Rust wasn’t ready for.
I’ve also seen it a few times in a row in the wild, and it’s made me
so emotional that I simply had to write this, and as a result, it’s
a little more emotional than some of the other posts.</p>
<p>In this argument, array indexing stands in for a number of little
features. But – I’ve seen array indexing cited so often as a canonical
example that I feel compelled to address it directly!</p>
<p>The argument goes like this: In Rust, array accesses are checked. Every
time you write <code>arr[i]</code>, there is an extra prepended
<code>if i >= arr.len() { panic!(..) }</code>. As you can see, that is more code,
and worse, a run-time check. And while the optimizer might eliminate
it, or the branch predictor may well predict it right every time,
the extra code bloat and possible run-time check are just
unacceptable in [insert field here (I used HFT)], where every
nanosecond matters. And until some acceptable solution is found to this,
I just don’t see Rust making it in [insert field].</p>
<p>When I made this argument, to a group of programming-language academics,
the defenders of Rust countered with a number of points, all of which
accepted the basic premise:</p>
<ul>
<li>Do I really need those extra nanoseconds? Yes.</li>
<li>Is it really too much of a price to pay for all that extra
safety? Yes.</li>
<li>Do I really distrust the optimizer that much? Yes. If only
Rust had a way to do optimizer assertions, a way to
statically verify that the <a href="https://github.com/dtolnay/no-panic">panic had been optimized
out</a>.</li>
<li>Would dependent typing on integer values help? Yes. That sounds
very promising. I think Rust will get there someday, but for right
now we must use C++.</li>
</ul>
<p>Now that I know more about Rust I’m happy to tell you that I was
completely off base. I wasn’t off base about the performance considerations,
or the unacceptability of even the slightest risk of a run-time check.
I was off base about an even more basic premise: that Rust uses checked
array indexing, whereas C++ uses unchecked array indexing.</p>
<p>But wait! Isn’t that the whole point? Doesn’t C++ avoid checking everything,
to make sure all abstractions are zero-cost, to be blazing fast? Doesn’t
Rust, while trying for performance, in the end always concede to the
demands of safety?</p>
<p>Well, let’s look at the APIs in question. C++ apologists are always
saying to use the modern C++ features from C++11 and later,
rather than the more C-like “old style” C++ features, so on the
C++ side let’s take a look at the
<a href="https://en.cppreference.com/w/cpp/container/array">documentation</a>
for <code>std::array</code>, introduced in C++11.</p>
<p>Here we see two indexing methods. The first one, <code>at</code>, is bounds
checked and will throw an exception if the index is out of bounds,
whereas the second one, <code>operator[]</code>, is not, and will instead exhibit
undefined behavior of a very difficult-to-debug nature. It looks like C++
actually believes in free choice here, leaving the choice of method up
to the user. Not quite what we supposed, but the important part is that
unchecked indexing is available, so the argument can still stand so far.</p>
<p>Now let’s look at Rust. Rust arrays and vectors can also be used with
methods from <a href="https://doc.rust-lang.org/std/primitive.slice.html">slice</a>,
as can slices, so the slice documentation is the best place to look.
And looking there, we immediately see – drum roll please – 4 methods. We
see <code>get</code> and <code>get_mut</code>, which are checked, and right underneath them,
in alphabetical order, <code>get_unchecked</code> and <code>get_unchecked_mut</code>, which
are not.</p>
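<p>To make the comparison concrete, here is a minimal sketch of the slice methods named above next to plain bracket indexing. The function name <code>demo</code> and the sample array are mine, chosen only for illustration.</p>

```rust
/// Reads index 1 (and a bogus index 7) of a three-element array
/// using each of the access styles discussed above.
fn demo() -> (Option<u32>, Option<u32>, u32, u32) {
    let arr = [10u32, 20, 30];

    // Checked: `get` returns an Option instead of panicking.
    let in_bounds = arr.get(1).copied();
    let out_of_bounds = arr.get(7).copied();

    // Checked: brackets panic if the index is out of bounds.
    let bracket = arr[1];

    // Unchecked: no bounds check at all, so it needs `unsafe`.
    // SAFETY: 1 < arr.len(), which is 3.
    let unchecked = unsafe { *arr.get_unchecked(1) };

    (in_bounds, out_of_bounds, bracket, unchecked)
}
```

<p>All three styles return the same element for a valid index; they differ only in what happens when the index is wrong.</p>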
<p>To review, where do Rust and C++, these programming languages with
their vastly different philosophies, Rust for the cautious, C++
for the fast and bold, stand? In the exact same place. Both programming
languages have both checked and unchecked indexing.</p>
<p>Let me say that again. This is the talking point form, what to say if you
need something quick to say, if you’re ever debating programming languages
on a political-style talk show (or at a party or even a job interview):</p>
<blockquote>
<p>In both Rust and C++, there is a method for checked array indexing,
and a method for unchecked array indexing. The languages actually
agree on this issue. They only disagree about which version gets to
be spelled with brackets.</p>
</blockquote>
<p>The difference is simply in the default, which one gets
that old fashioned <code>arr[index]</code> syntax. And even that <a href="https://docs.rs/unchecked-index/latest/unchecked_index/">can be
changed</a>.
Even if the C++ default were superior – and, as I will argue later,
it is not – this is surely a minor issue. After all, don’t we normally
use our fancy <code>for x in arr</code> syntax in Rust? This issue is just so small
as to be unlikely to be a deciding factor in what programming language
is better, even if we’re in a special application domain where every
nanosecond matters.</p>
<h2 id="the-unsafe-keyword">The Unsafe Keyword</h2>
<p>So that’s a wrap, folks. We can all go home, and none of
us will ever see this extremely silly argument on the
Internet or in person again. It’s just a misunderstanding,
the person making it was simply misinformed, and all it will
take is a link to this blog post – or the <a href="https://doc.rust-lang.org/std/primitive.slice.html#method.get_unchecked">relevant method in the
docs</a> –
to set them straight.</p>
<p>But wait! The C++ apologists are still talking! What are they saying?
How have they not been completely flummoxed? They’re pointing
at that method, chanting a word like a slogan at a protest march.
I can’t quite make it out – what is it?</p>
<p>Oh. They’re chanting <code>unsafe</code>. And credit where credit is due:
it’s very difficult to chant in a monospace font.</p>
<p>Well, that is easy to respond to! The nerve, that C++ programmers would
call our unchecked array indexing method unsafe. For one, all unchecked
array indexing methods are unsafe: that’s what unchecked means. If it
were safe, it would be at least statically checked. For another, isn’t
this the pot calling the kettle black? Isn’t C++ all about unsafety,
so much that C++ programmers don’t even mark their unsafe code regions
because it all is, or their unsafe functions because they all are?</p>
<p>“But isn’t that the whole point of Rust?” they cry. “If you have to
use <code>unsafe</code> to write good Rust, then Rust isn’t a safe language
after all! It’s a cute effort, but it’s failing at its purpose!
Might as well use C++ like a <a href="http://www.catb.org/jargon/html/R/Real-Programmer.html">Real Programmer</a>!”</p>
<p>This, my friends, is a
<a href="https://www.smbc-comics.com/comic/logical-fallacies">straw</a>
<a href="https://www.smbc-comics.com/comic/straw-men">man</a>. No, the point
of Rust and specifically Rust’s memory safety features is not
to create an entirely safe programming language that can’t be
circumvented in any circumstance; you must be thinking of Sing#,
the programming language for Microsoft’s defunct <a href="https://www.microsoft.com/en-us/research/project/singularity/">research
OS</a>.</p>
<p>Let me be abundantly clear: The point of memory safety, the unsafe
keyword, and friends in Rust is not to completely enforce memory safety,
to make it impossible for the programmer to do anything they want to
with the computer, even if they can’t prove to the compiler that it’s OK.
In fact, the point of memory safety isn’t to make it <em>impossible</em> to do
anything at all – it’s to make it <em>possible</em> to reason about the program.</p>
<p>The premise of Rust is that the vast majority of code in a systems program
doesn’t need to be unsafe, and so it might as well be safe. People used
to believe that you needed garbage collection for safety, but Rust
proved that you could use lifetimes to still get safety without that
performance cost. Now that we’re there, why worry about null pointers?
Why not tell the compiler which things can be null, and which things
can’t, so the compiler can check for you whether you’re handling nulls
correctly? I’ve programmed C++ professionally for years without such a
feature. You’d better believe I would have totally annotated the crap
out of the code so the compiler could’ve caught them ahead of time.</p>
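<p>For readers who haven’t seen it, this is roughly what that annotation looks like in Rust. It is only a sketch: the <code>Order</code> type and both function names are hypothetical, invented for this example.</p>

```rust
/// Hypothetical type standing in for whatever the pointer points at.
struct Order {
    quantity: u32,
}

/// `&Order` can never be null, so there is nothing to check.
fn quantity(order: &Order) -> u32 {
    order.quantity
}

/// `Option<&Order>` says "this may be absent", and the compiler
/// refuses to let us read the order without handling `None`.
fn quantity_or_zero(order: Option<&Order>) -> u32 {
    match order {
        Some(o) => o.quantity,
        None => 0,
    }
}
```

<p>Forgetting the <code>None</code> arm is a compile error, not a crash in production.</p>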
<p>Sometimes, C++ apologists cite valgrind. I’ve had codebases where
I tried to use <code>valgrind</code>. Unfortunately, there was so much undefined
behavior and memory leaks already caked into this project that new
ones were simply impossible to see among all the noise. An army
of junior engineers was at some point required to clean this up
when finally the hierarchy decided that “valgrind” was something we
might want to be able to use in the future.</p>
<p>And a lot of those undefined behaviors were ticking time bombs.
Certainly, this codebase had its issues. A friend of mine took days to
find a bug where a pointer had a value of 7. I don’t mean 7 elements into
some array, not 7 of the relatively wide pointer type, not a convenient,
testable-for <code>NULL</code>, value. No, none of that: The pointer’s value was
exactly <code>0x7</code>.</p>
<blockquote>
<p><strong>Update</strong>: My friend had a very similar incident to that described
in <a href="https://scholar.harvard.edu/files/mickens/files/thenightwatch.pdf">this piece</a>,
but it was not the same incident. Some time after, I read that
piece and shared it with this friend … and I must have conflated
the numbers from the piece and from what happened to my friend.
It was some null-page number, some “low integer,” however, even
if not <code>0x7</code>.</p>
</blockquote>
<p>I’ve had memory corruption issues where I pored over every line of code
that I wrote, over and over again, finding nothing. Ultimately, I learned
that the issue was in framework code – code written by my boss’s boss.
The code was untested, and written extremely poorly, and had rotted, so
that it didn’t work at all. In Rust, I might have had some idea that
my code – which in Rust would have all been able to be “safe” –
couldn’t possibly be the source of the problem. Maybe my humble assumption
that my code was to blame would be a little less tenable.</p>
<p>If I wanted a language that was always safe, at the time I knew Java
or Python existed. Some companies even do finance in Java, for exactly
that reason. But sometimes you still need that extra bit of performance.
<code>unsafe</code> is sometimes necessary.</p>
<p>But given what gains safe Rust has made in predictable performance,
it’s not as necessary as it used to be. The majority of the code I
wrote then could’ve been written in safe Rust, and not lost a single
clock cycle. The parts that needed to be unsafe could have been
isolated, delegated to specific sections, wrapped in
abstract data types, perhaps entrusted to a specific team.</p>
<p>And even then, I’m sure we would have been debugging memory
corruption issues. But we’d know where to look. We’d know where to
throw the tests. And we’d have saved programmer-years of time,
days if not months of my life.</p>
<p>Now, I’m proud of my C++ skills. There is some part of me that wishes
that C++ was better than Rust, that all that time getting better at
debugging memory corruption wasn’t dedicated to a skill that is
becoming obsolescent through better technology. And to be honest, that’s
part of why I dismissed Rust as a candidate for HFT programming
languages.</p>
<p>But it’s possible to be proud of a skill that is also becoming obsolete.
And I am trying to replace it with a new skill to be proud of – writing
Rust as performant as idiomatic C++, or even more performant, while
reaching for the <code>unsafe</code> keyword rarely and modularly. I think it’s truly
possible, for where it’s relevant.</p>
<p>Now I must turn to a subset of C++ apologists, who write using “modern
C++” which is “very safe now” and experience therefore no memory corruption
issues. To them I say, you are not doing high performance programming.
If you were, you’d have to do some wonky things with pointers to spell
the bespoke high-performance constructs you’d need.</p>
<p>There is indeed a safe subset of C++ heavy with modern features. If
you are disciplined and keep your programming in that realm, you can
avoid memory corruption mostly. But first, this safe subset covers fewer
high-performance features than Rust. I’ve read some of this code and its
idioms: It’s full of <code>shared_ptr</code>s not to share ownership but simply to
avoid types that might be invalidated. It ironically leans on reference
counting more than idiomatic Rust. This is among other, similar problems.</p>
<p>Let me be clear: First off, instead of keeping in your brain which
features are “modern” and which are “edgy,” why not have a distinction where
it’s well-marked? Second off, if you are writing entirely in this safe
subset of C++, you can get much better performance instead out of the
safe subset of Rust. You have no right to complain about Rust’s safety
trade-offs, as you’re using a worse set, where you get no safety
promises from the compiler and none of Rust’s surprising safe performance.</p>
<p>Rust’s safe and “slow” subset is faster than C++’s while still being,
obviously, safer. Rust’s unsafe subset is better factored and better
distinguished. Comparing apples to apples, Rust is a better programming
language for extracting performance out of LLVM, because you’ll be able
to code more often without fear, and with very focussed fear when you
do feel it.</p>
<p>A tool is even more useful if you can adjust it. The defenders
of C++ talk about choosing trade-offs, but really, Rust offers both
trade-offs. Mark your code as <code>unsafe</code> and convince yourself of its
safety manually, or rely on programming language features. It’s up to
you, on a function-by-function, even block-by-block, basis. In C++,
if you have a problem, every line of code is suspect; you simply
can’t opt in to safety, but in Rust, for where you don’t need the
performance of unchecked indexing and other unsafe features, you can
relax about the possibility of going <a href="https://en.wikipedia.org/wiki/Knight_Capital_Group">bankrupt due to inadvertent memory
reinterpretation</a> –
and how I wish my NDA permitted me to talk about consequences at my own
previous jobs!</p>
<p>And for where you do need to use <code>unsafe</code>, you can make sure your
debugging and overthinking efforts are well-directed, for the few places
in a large project you need it.</p>
<h2 id="unchecked-indices">Unchecked Indices</h2>
<p>This has gotten a little far from the original question. Should array
indices be checked? Well, let me be clear about two facts that are both true,
but in tension with each other:</p>
<ul>
<li>Unchecked array indexing is sometimes absolutely necessary</li>
<li>Unchecked array indexing is an edge-case feature, which you
normally don’t want.</li>
</ul>
<p>If unchecked array indexing was unavailable in Rust, that would be a bug.
What is not a bug is making it inconvenient. C++ programmers probably
should be using <code>at</code> instead of <code>operator[]</code> more often. But in C++,
what would it gain? There are so many unsafe features; what’s the cost
of one more?</p>
<p>But in Rust, where so much code can be written that’s completely safe,
defaulting to the safe version makes more sense. Lack of safety is a cost
too, and Rust makes that cost explicit. Isn’t that the goal of C++, making
costs explicit?</p>
<p>Let’s look at situations where you are indexing memory. First off, most
of them I saw were in old C-style <code>for</code>-loops, where you loop over an
index rather than using iterators directly with a collection. Both Rust
and C++ have safe versions of <code>for</code> that loop over collections with
iterators, and those use the same check for the loop as they do for
bounds, so those are easy enough to address. Nevertheless, I think that
a lot of the noise about checked vs. unchecked array accesses comes from
people who use indexing for their <code>for</code>-loops instead of iterators,
and therefore mistakenly think that array indexing in general is a
far more common operation than it is.</p>
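<p>As a small illustration of the two loop styles, here is the same sum written both ways. The function names are mine, and this is a sketch rather than a benchmark.</p>

```rust
/// C-style loop: every `values[i]` is a bounds-checked index
/// (though the optimizer can often prove the check away here).
fn sum_indexed(values: &[u64]) -> u64 {
    let mut total = 0;
    for i in 0..values.len() {
        total += values[i];
    }
    total
}

/// Iterator loop: no index exists, so there is no index to check.
/// The loop's own termination test is the only bounds test.
fn sum_iter(values: &[u64]) -> u64 {
    let mut total = 0;
    for v in values {
        total += *v;
    }
    total
}
```

<p>Once the loop is written the second way, the question of checked versus unchecked indexing never comes up.</p>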
<p>For the remaining situations, most are implementing either gnarly
business logic, or a subtle, fast algorithm.</p>
<p>If it’s gnarly business logic, in my experience, it’s usually at config
time – along with a good third to half to even more of the code in a
complicated production system.</p>
<p>What do I mean by config time? A running high-performance system, whether
optimized for latency or throughput, has a bunch of data structures
organized just so, a lot of threads set up just right to move data
between them in the perfect rhythm, and a lot of the work is in arranging
them. That work is generally not performance-sensitive, but often has
to be in the same programming language as the performance-intensive stuff.</p>
<p>Config-time is, depending on how you look at it, less of a thing or the
entire thing in a programming language like Python. Python basically
exists to do config-time programming for performance-intensive code put
in very comprehensive “libraries” written in C or C++. But in C++, where
you have a constructor that runs only once or a few times at first,
and other methods related to it, in the same programming language as the
money-making do-it part, you have to really adjust programming style
between them.</p>
<p>Config-time is obviously when you read the configuration files.
It’s where you open the relevant files. It’s where you call <code>socket</code>
and <code>bind</code> and <code>listen</code> on your listening port. It’s where you spin up
your worker threads, and make computations on how many worker threads
there are. It’s where you construct your objects and your object pools.
It’s where you memory map your log file. It’s where you set your process
priorities. It’s where you recursively call the constructors and <code>init</code>
functions of every object in your overwrought OOP hierarchy.</p>
<p>There is no need to sacrifice safety for performance at
config time – especially since undefined behavior might lie latent and
destabilize the system once it’s actually up and running. If you do
an unchecked array access at config time, you might put garbage data in
an important field, maybe one that determines how much money you’re willing
to risk that day or how many of a thing to buy. And for what? To save a few
nanoseconds before your process has even “gone live”?</p>
<p>So, when do you truly need unchecked array accesses? If it’s a subtle
fast algorithm, probably deep in an inner loop, you should probably be
wrapping it in an abstraction anyway. The code that actually executes the
algorithm should be separate from the business logic, so that programmers
trying to maintain the business logic don’t accidentally break it. And
that’s exactly where it makes the most sense to use <code>unsafe</code> – when
implementing a special algorithm. Maybe the proof that the index is
within bounds relies upon some number theory the compiler was never going
to understand without its own proof engine: great! You should probably
be explaining that in a comment in C++ anyway, and so the conventional
comment that goes with the <code>unsafe</code> block in Rust is a perfect place to
explain it.</p>
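<p>Here is a minimal sketch of that pattern, assuming a lookup table whose length is a power of two so that masking the key always yields a valid index. The <code>Pow2Table</code> type and its methods are hypothetical, invented for this example; the point is that the unchecked access lives behind a safe interface, with the proof in the comment.</p>

```rust
/// Table with a power-of-two number of slots, indexed by `key & mask`.
pub struct Pow2Table {
    slots: Vec<u64>,
    mask: usize, // always slots.len() - 1; both fields are private
}

impl Pow2Table {
    /// Creates a table with `1 << bits` zeroed slots.
    pub fn new(bits: u32) -> Self {
        assert!(bits < usize::BITS, "table size would overflow usize");
        let len = 1usize << bits;
        Pow2Table { slots: vec![0; len], mask: len - 1 }
    }

    pub fn get(&self, key: usize) -> u64 {
        let i = key & self.mask;
        // SAFETY: `mask == slots.len() - 1`, so `key & mask <= mask`
        // and therefore `i < slots.len()`. Neither field is modified
        // after `new`, and neither is visible outside this module.
        unsafe { *self.slots.get_unchecked(i) }
    }

    pub fn set(&mut self, key: usize, value: u64) {
        let i = key & self.mask;
        // SAFETY: same argument as in `get`.
        unsafe { *self.slots.get_unchecked_mut(i) = value }
    }
}
```

<p>Callers in the business logic only ever see <code>get</code> and <code>set</code>, both safe, so they cannot break the invariant that the <code>unsafe</code> blocks rely on.</p>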
<p>But maybe I’m wrong about all of this. Maybe your experience hasn’t
matched mine. Maybe your particular application needs to make unchecked
array accesses a lot, needs them to be unchecked, and needs them littered
all over the codebase. I raise my eyebrows at you, suspect you need more
iterators and perhaps other abstractions, and wonder what problem you’re
trying to solve. But even if you’re absolutely right, I think it’s still
a better idea to write Rust littered with <code>unsafe</code> every time you index
an array, than to write C++.</p>
<p>Because, as I keep emphasizing, Rust is still a better unsafe programming
language than C++. It would be better than C++ even if safety weren’t
a feature.</p>
<h2 id="post-script-some-perspective-for-the-new-rustacean">Post-Script: Some Perspective for the New Rustacean</h2>
<p>I understand where this straw man argument comes from. The word
<code>unsafe</code> is scary, and advice, especially aimed at people coming
from safe languages like Python and Javascript, is to avoid <code>unsafe</code>
features while learning. And while I think adding <code>unsafe</code> to production
code should only be done once you’ve exhausted safe possibilities – which
requires full understanding of safe possibilities – this advice can
feel overbearing for a transitioning C++ programmer, especially when
it is immediately obvious that the safe features are very constrained
and can’t literally do everything.</p>
<p>For that good-faith recovering C++ programmer, new to Rust: You’re
right. The safe subset isn’t enough to do everything you want to
do. And when it isn’t, that doesn’t mean it failed. Its goal is to
make unsafe code rare, not non-existent. But it might surprise you
how rarely you truly <em>need</em> <code>unsafe</code>. And a good resource for you
might be, as it was for me, the excellent <a href="http://cliffle.com/p/dangerust/">Learn Rust the Dangerous
Way</a> by Cliff L. Biffle.</p>
<p>For what it’s worth, however, this criticism of Rust in general is often
levelled either in bad faith, or from a misunderstanding of what the
<code>unsafe</code> keyword is for. For all the philosophical discussion of what
<code>unsafe</code> truly means – and how it interacts with the surrounding
module and encapsulation/privacy boundaries – as well as principled
conventions for using it, please see the
<a href="https://doc.rust-lang.org/nomicon/">Rustonomicon</a>, the canonical
book on unsafe Rust, the same way <a href="https://doc.rust-lang.org/book/">the book</a>
is canonical for introducing Rust.</p>
<p>Other criticisms of Rust from an HFT or low-latency point of view
are more relevant. Most specifically, <code>gcc</code> and <code>icc</code> are much better
compilers for those use cases – empirically – than is LLVM. Also,
the large codebases existing in C++ are often tested and contain
thousands upon thousands of programmer-years of optimizations and
bugfixes, where even small compiler upgrades are scrutinized closely
for performance regressions. Migrating to another programming language
from that starting point would be prohibitively expensive.</p>
<p>None of which is to say that if Rust gradually replaced C++ altogether,
eventually such ultra-optimizing compilers and ultra-optimized codebases
wouldn’t start appearing in Rust. I hope to see that day within my
lifetime.</p>
In Defense of Async: Function Colors Are Rusty
https://www.thecodedmessage.com/posts/async-colors/
2022-01-03T00:00:00+00:00
<p>Finally in 2019, Rust stabilized the <code>async</code> feature, which supports
asynchronous operations in a way that doesn’t require multiple operating
system threads. This feature was so anticipated and hyped and in
demand that there was a <a href="https://areweasyncyet.rs/">website</a> whose sole
purpose was to announce its stabilization.</p>
<p><code>async</code> was controversial from its inception; it’s still controversial
today; and in this post I am throwing my own 2 cents into this
controversy, in defense of the feature. I am only going to try to
counter one particular line of criticism here, and I don’t anticipate
I’ll cover all the nuance of it – this is a multifaceted issue, and
I have a day job. I am also going to assume for this
post that you have some understanding of how <code>async</code> works, but if
you don’t, or just want a refresher I heartily recommend the <a href="https://tokio.rs/tokio/tutorial">Tokio
tutorial</a>.</p>
<h2 id="the-questionable-feature-colored-functions">The Questionable Feature: Colored Functions</h2>
<p>In any discussion of a programming language feature, the first thing
to ask is what problem the feature is trying to solve. In the case of
<code>async</code>, it’s trying to deal with asynchronous operations – operations
that don’t require more work from the CPU to make progress, and where
several might be in flight at any given time. For example, a single process
might be writing some data to a file, reading data from another file,
waiting for new incoming connections, and servicing an existing connection.</p>
<p>So how does Rust solve this?
The easiest way to address this problem would be to have a thread for each
operation, and to let the thread block at the asynchronous operation,
essentially pretending that the operation is a long-running function
like any other the CPU has to do, rather than something taking
place elsewhere. But operating system threads are expensive. And
rather than using green threads as <a href="https://gobyexample.com/goroutines">some other programming languages
do</a>, Rust decided to create a
syntactic sugar for futures, meaning that Rust’s <code>async</code> feature now
suffers from the dreaded function coloring effect first explained by
Bob Nystrom in a Javascript context in 2015.</p>
<p>In Bob Nystrom’s
<a href="https://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/">now-famous essay</a>
he complains that an analogous feature in Javascript is harmful, because
asynchronous functions – which he refers to as “red” functions – can only
be called from other red functions. Once a red function is needed, the
function that calls it must also be red, and same with the function that
calls that, the whole way up the call chain. And the syntax and semantics
of calling a red function is more complicated than that of calling blue
functions – especially in Javascript, where the next thing to do had to
be enclosed in a lambda, resulting in <a href="http://callbackhell.com/">callback hell</a>
(I do not endorse the suggestions in that post).</p>
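<p>The same coloring can be sketched in Rust using only the standard library. Everything named here is an assumption for illustration: <code>block_on</code> is a toy busy-polling executor written for this example (a real program would use a runtime such as Tokio), and <code>red_leaf</code>, <code>red_caller</code> and <code>blue</code> are made-up function names.</p>

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough for futures that never actually wait.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor: polls the future in a loop until it is ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// A "red" function.
async fn red_leaf() -> u32 {
    20
}

/// Calling a red function with `.await` is only allowed inside
/// another red function, so this one must be `async` as well.
async fn red_caller() -> u32 {
    red_leaf().await + 1
}

/// A "blue" function cannot `.await`; it has to hand the future
/// to an executor explicitly.
fn blue() -> u32 {
    block_on(red_caller()) * 2
}
```

<p>Writing <code>red_leaf().await</code> directly inside <code>blue</code> is a compile error, which is exactly the propagation Nystrom describes.</p>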
<h2 id="colored-is-good-actually-and-rusty">Colored Is Good, Actually, and Rusty</h2>
<p>My position is close to that of <a href="https://morestina.net/blog/1686/rust-async-is-colored">this article</a>,
but with enough nuance that I wanted to write my own blog post to explain
it in more detail. Fundamentally, I agree that Rust does indeed have
colored functions, and that it’s not a bad thing. But I would go further.
I say that function coloring has always existed in Rust, even before it
manifested in the <code>async</code> world, that it is the Rustiest way to solve
this problem, and furthermore, that Rust needs more function coloring than
it has.</p>
<p>Rust, unlike the Javascript of the original colored functions article, is
strongly typed, and influenced heavily by Haskell. This means
that it has lots of type distinction on its values: “colored” values, if
you will.</p>
<p>This type information includes basic ideas of type (string vs number
vs widget), but also shades of distinction that a Javascript programmer
won’t even be aware of. Let’s say you want to take a parameter to your
function, a “widget.” In Javascript, you just take a parameter <code>widget</code>
and do widget things with it, and hope that it works out. The name is
just a comment: it’s up to the caller to know what exactly is expected,
hopefully some sort of widget that works. In Rust, on the other hand, you
have to annotate the parameter with a type, which not only ensures it’s
actually a widget, but distinguishes between these potential requirements:</p>
<ul>
<li>Exclusive reference to a widget: <code>&mut Widget</code></li>
<li>Ownership of a widget: <code>Widget</code></li>
<li>Reference to a widget that lives forever: <code>&'static Widget</code></li>
<li>Optional widget (in Javascript this is very unclear): <code>Option<Widget></code></li>
</ul>
<p>If <code>Widget</code> is a trait, you have even more options:</p>
<ul>
<li>Owned run-time generic widget: <code>Box<dyn Widget></code></li>
<li>Non-owned reference to compile-time generic widget: <code>&impl Widget</code></li>
</ul>
<p>The list goes on. For each of these options, also, the caller often
has to do something different. If the parameter is optional with <code>Option</code>,
and the caller in fact has a widget, the caller still has to add <code>Some</code>
to the parameter:</p>
<pre tabindex="0"><code>fn foo(widget: &Widget) { ... }
fn foo2(widget: Option<&Widget>) { ... }
fn foo3(widget: Widget) { ... }
let baz = Widget::new();
foo(&baz);
foo2(Some(&baz));
foo3(baz);
</code></pre><p>All of these, in my
<a href="https://en.wikipedia.org/wiki/Synesthesia">synaesthetic</a> mind, are
expressed by different colors and textures on the parameter. For all
of these, Rust has made a value judgment that the programmer should be
explicitly aware of these <em>shades</em> of distinction, if you will (pun
intended). If a parameter is to be optional, the function is called
differently than if it is mandatory. If a borrow happens, that requires a
<code>&</code> from the caller, to make clear to the programmer what is going on,
to make sure the writer of the caller and the writer of the callee are
on the same page. Parameters in Rust are, in general, colored.</p>
<p>And this value coloring, like the async/sync function coloring, propagates. If
a function requires a parameter to be <code>'static</code> in lifetime,
that requirement propagates to the caller of that function, then to that
caller’s caller, and so on up to the originator of the value in question.</p>
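<p>A minimal sketch of that propagation, using hypothetical function names.
<code>std::thread::spawn</code> requires everything its closure captures to be
<code>'static</code>, so each function between the spawn and the origin of the value
has to repeat the annotation:</p>
<pre tabindex="0"><code>use std::thread;

// `thread::spawn` only accepts closures whose captures are `'static`,
// so the parameter has to carry that lifetime.
fn measure_in_background(label: &'static str) -> thread::JoinHandle<usize> {
    thread::spawn(move || label.len())
}

// The requirement propagates: this caller has to demand `'static` too,
// even though it never touches a thread itself.
fn announce(label: &'static str) -> usize {
    measure_in_background(label)
        .join()
        .expect("worker thread panicked")
}
</code></pre><p>A string literal works as the argument because it is a <code>&'static str</code>;
a reference to a local <code>String</code> would be rejected at the outermost call site.</p>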
<p>Similarly with return values – I disagree with “More Stina” about
<code>Result</code>. I say <code>Result</code>-returning fallible functions are colored. In
many programming languages, including Javascript, any and all functions
can throw recoverable exceptions. In Rust, functions that might fail
(in a recoverable fashion) must have a different return type than those
that do not – they must return a <code>Result<...></code>. Functions that return
<code>Result<T, E></code> are, as with async functions, harder to call than functions
that just return <code>T</code>. If you don’t want to use the syntactic sugar <code>?</code>
to propagate the error, you have to grapple with <code>Result</code> as a literal
return type, which means unpacking it and doing something else in the
<code>Err</code> case. This is more straightforward than dealing with a raw <code>impl Future</code>, but fundamentally the same concept: either propagate the “color”
with <code>?</code> or <code>async</code>, or else deal with all the implications of <code>Result</code>
or <code>Future</code> on the spot.</p>
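<p>To make the two choices concrete, here is a small sketch with hypothetical
function names. The first caller propagates the “color” with <code>?</code> and so
becomes fallible itself; the second stays infallible, but has to decide on the
spot what the <code>Err</code> case means:</p>
<pre tabindex="0"><code>use std::num::ParseIntError;

// A fallible, `Result`-colored function.
fn parse_number(text: &str) -> Result<i32, ParseIntError> {
    text.trim().parse::<i32>()
}

// Option 1: propagate the color with `?`. The caller's own return
// type has to become a `Result` as well.
fn double_propagating(text: &str) -> Result<i32, ParseIntError> {
    let number = parse_number(text)?;
    Ok(number * 2)
}

// Option 2: stop the color here by handling the `Err` case directly.
// Falling back to zero is an arbitrary choice for this illustration.
fn double_or_zero(text: &str) -> i32 {
    match parse_number(text) {
        Ok(number) => number * 2,
        Err(_) => 0,
    }
}
</code></pre>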
<p>And all of these distinctions mean something. Passing by shared reference,
mutable reference, or value are different, and put different safety
requirements on the calling code, safety requirements that allow Rust
to make more safety guarantees than Javascript ever could. Passing
by reference is literally different at the ABI level from by-value,
so each can implement the exact contract as efficiently as possible,
unlike Javascript which leans on an expensive garbage collector for
cleaning up the difference between these notions. That is to say, where
Javascript (and Python) use garbage collectors, Rust uses distinctions –
color distinctions, one might say – between types to achieve the same
result, benefitting in performance but requiring more exactness from
the programmer.</p>
<p>And in Rust, a statically typed programming language, we believe this to
be a good thing. Rust is not for every project – it’s a steeper learning
curve than Python or Javascript, and not every project needs to be
maintainable long-term – but it has a distinct, consistent philosophy,
which says that different things should be treated differently.</p>
<h2 id="async-functions-are-just-different">Async Functions Are Just Different</h2>
<p>A blocking or asynchronous function is not the same thing as a
non-blocking function. A non-blocking function fundamentally does some
CPU tasks, taking control of the processor, using it, and giving it back.
An asynchronous function does the set-up necessary for work to happen
elsewhere. That work doesn’t need control of the CPU, and can be dealt
with through a handle – a future – rather than just waiting for completion.
These are fundamentally different notions, and while it might (or might not)
make sense in Go or Javascript to lump them together into one notion
of “calling a function,” Rust doesn’t do lumping.</p>
<p>When you call a normal function – without <code>async</code>/<code>await</code> – you build up
a stack. When you use <code>async</code>/<code>await</code>, you build up a complex nested
state object. If you use <code>async</code>/<code>await</code> with an executor to spawn a new
task, that complex state object ends up on the heap in a data structure
next to other task objects.</p>
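<p>As a rough illustration of what such a state object looks like, here is a
hand-written future. All names are hypothetical, the state machine the compiler
generates for real <code>async</code> code is more involved, and the busy-polling
loop is a toy stand-in for an executor:</p>
<pre tabindex="0"><code>use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// The "stack" of this computation is an ordinary value: each variant
// records where we are and which intermediate results we hold.
enum TwoStep {
    Start(u32),
    Waiting(u32),
    Done,
}

impl Future for TwoStep {
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        let state = std::mem::replace(&mut *self, TwoStep::Done);
        match state {
            TwoStep::Start(x) => {
                // Save progress in the object and give control back.
                *self = TwoStep::Waiting(x + 1);
                cx.waker().wake_by_ref();
                Poll::Pending
            }
            TwoStep::Waiting(x) => Poll::Ready(x * 2),
            TwoStep::Done => panic!("polled after completion"),
        }
    }
}

// A waker that does nothing; enough for a loop that polls eagerly.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Toy executor: polls until ready and reports how many polls it took.
fn run_to_completion<F: Future + Unpin>(mut future: F) -> (F::Output, u32) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut context = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(value) = Pin::new(&mut future).poll(&mut context) {
            return (value, polls);
        }
    }
}
</code></pre><p>Nothing happens when <code>TwoStep::Start(20)</code> is constructed; the work only
advances when something polls it, and between polls the whole computation lives
in the enum rather than on a call stack.</p>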
<p>Both “call stack” (for synchronous code) and “task state object” (for
asynchronous code) are reasonable ways of managing memory. Honestly, the
miracle is that Rust, through <code>async</code> and <code>await</code>, manages to make these
two vastly different paradigms look as similar as they do. Having to
annotate the difference is a small price to pay for high-performance
reactive programming.</p>
<p>It’s not 100% perfect. Even with the <code>must_use</code> warnings, people forget
to call <code>await</code> on their futures sometimes. And writing reactive, async
code is harder – which makes sense, because the resulting code is a
more difficult but more performant use of memory. Writing code that
passes the borrow checker is harder, but considered worth it because
we can remove indirections and avoid garbage collection. <code>async</code> offers
us the same deal for reactive programming.</p>
<h2 id="alternatives-to-async">Alternatives to Async</h2>
<p>But let’s say we did want to remove the coloring here. Let’s say
we did want to pretend that blocking functions were just like CPU-based
ones, but just taking a long time. What would we have to do?</p>
<p>Well, we’d still have to wait for multiple things simultaneously. Our
servers have many connections they have to service at once, and when a
message comes in on socket B, it can’t be ignored just because the code
happens to be on socket A. If asynchronous operations are implemented
by blocking, we have to handle this with multithreading.</p>
<p>Kernel multithreading is expensive, but even Go-style “green threads”
have to have a separate stack for each green thread. Stacks are gnarly,
because it’s unclear how much space should be reserved for them ahead
of time. They have to dynamically adjust to the run-time demand, and
when the original allocation is used up, you get a pause as you try
to allocate more. The advantage is, you have a simpler mental model
with fewer distinctions. Basically, you trade performance for simplicity
– like in garbage collection.</p>
<p>If you want to do this trade, Rust doesn’t stop you from implementing
it yourself. OS threads and blocking system calls are perfectly reasonable
solutions to many problems. But Rust isn’t going to encourage the trade
by creating a new compromise point of “green threads.” You have to do async
the whole way, and if you think of what async code actually de-sugars to,
you wouldn’t complain about how hard async functions are, but be impressed
it’s so darn easy to write them!</p>
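<p>For comparison, a sketch of the thread-per-job alternative using only the
standard library. The function name is hypothetical and the summing stands in
for a blocking I/O call; each job gets its own OS thread and stack, and the
results come back over a channel whose <code>recv</code> also blocks:</p>
<pre tabindex="0"><code>use std::sync::mpsc;
use std::thread;

// One OS thread per job. Every thread may block freely, because it
// owns its stack and the kernel schedules around it.
fn sum_on_threads(jobs: Vec<Vec<u32>>) -> u32 {
    let (sender, receiver) = mpsc::channel();
    let mut handles = Vec::new();

    for job in jobs {
        let sender = sender.clone();
        handles.push(thread::spawn(move || {
            // Stand-in for a blocking operation such as a socket read.
            let partial: u32 = job.iter().sum();
            sender.send(partial).expect("receiver dropped");
        }));
    }
    // Drop the original sender so the receiver sees the channel close.
    drop(sender);

    // Blocks until every worker has sent its result and hung up.
    let total: u32 = receiver.iter().sum();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    total
}
</code></pre>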
<p>Rust is a systems programming language at heart. I understand and respect
that, because of its type system and guarantees, it has found use outside
of the old domains of C and C++, but those C and C++ systems programmers
are Rust’s ideal “base,” in a political sense of the word. Rust should
not sacrifice performance for ease of programmability.</p>
<h2 id="blocking-vs-non-blocking">Blocking vs Non-Blocking</h2>
<p>Rust has two ways of doing off-CPU “IO” operations,
blocking and non-blocking. Blocking takes over the thread, and non-blocking
works through <code>async</code>. This mirrors a distinction in the system calls
that most kernels provide. The operating system API has this distinction
built into it, and it makes sense for Rust to propagate that to the user.</p>
<p>But fundamentally, one of these constructs is more honest than the
other. When we call a blocking kernel system call, rather than the
kernel taking over the CPU, running on it, and then returning the
thread of execution to us, what actually happens internally is
more of a mirage. The kernel deschedules the current process,
and using an internal mechanism more like <code>async</code> than like blocking,
schedules it again when the IO is done, recovering its previous state as if
nothing had happened.</p>
<p>This means that we can pretend the I/O operation was just an operation
like any other, but it comes at a risk – the operation might not
return anytime soon. It might in fact wait for a situation that’s
not going to happen anytime soon.</p>
<p>If such a blocking function is called from a non-async Rust thread,
we assume that the caller is using threads to juggle multiple I/O
events – or else that they simply don’t have anything else going on.
But it is very dangerous to call a blocking function from an <code>async</code>
function. It can starve threads in a thread pool, and cause knock-on
effects in other places. Maybe an async task is waiting for a message
from a channel, and even though the message was sent, the task doesn’t
resume because the thread it’s scheduled on is busy on this blocking
function. The effects are unpredictable and non-local – similar to
the dreaded “undefined behavior” – and debugging is similarly difficult
– ask me how I know!</p>
<p>Functions that block but are not async are referred to in the “More Stina”
<a href="https://morestina.net/blog/1686/rust-async-is-colored">blog post</a> (also
linked above) as “purple functions.” They are not true async “red functions”
that you can call with <code>await</code>, but they are also not safe to simply call
from an async function like a truly CPU-based “blue function” would be.
Calling a blocking function from an async function is extremely unsafe,
and there is simply no warning generated by <code>rustc</code>, normally so helpful
about such things, to let you know how deep and undebuggable a mistake
you’re making.</p>
<p>These purple functions ought to be a different color in Rust, just like
they are in practice. It should be an error to call a blocking system call
from an <code>async</code> function. I don’t know how this would work – I imagine
a generalization of <code>unsafe</code> that includes things like <code>blocks</code>, perhaps
as well as <code>panics</code>. That would fundamentally be an “effects system,”
as is regularly proposed, but that’s not the only solution. But I do
fundamentally think that something ought to be done about this deficit
in Rust’s otherwise quite rigorous function-coloring system.</p>
<p>So, in conclusion, I say: yes, Rust <code>async</code> functions are colored.
This is the same as saying they are strongly typed, and this is a good
thing. And instead of trying to fix it, we should have more of it.</p>
<h2 id="postscript-monads">Postscript: Monads</h2>
<p>As I mentioned before, calling an async function does something
fundamentally different under the hood from calling a vanilla “blue”
function. Similarly, calling a fallible function with <code>Result</code> does
something different from calling a function with a normal return
value. In both cases, the control flow is different – either it
contains short-circuits to error code (<code>Result</code>); or regular
hops back and forth between the task, other tasks, and the
executor (<code>async</code>/<code>Future</code>).</p>
<p>In both these cases, it’s like the meaning of having one statement
come after another has changed: <code>;</code> itself has been overridden.
And it would be nice if generic collection methods, like <code>map</code>
and <code>filter</code>, supported this, so that you could fail, or <code>await</code>,
in the closures.</p>
<p>This is possible in Haskell, because Haskell has a typeclass (equivalent
to Rust traits) for abstracting over different styles of control flow.
That is what Haskell’s infamous monads are for, and why Haskell persists
in using this technology even though it’s so famously confusing for
beginners.</p>
<p>Fundamentally, every Haskell monad is a function color. And often, they
can be stacked together (via “monad transformers”) so that you can say
something like “this function can do IO, fail, and be asynchronous.”
You can also create functions that are polymorphic on “color”: the control
flow is rewritten based on which monad you actually end up in.</p>
<p>Why is this useful? As <a href="https://morestina.net/blog/1686/rust-async-is-colored">“More Stina”’s
post</a> already
mentions, there is a <a href="https://blog.yoshuawuyts.com/fallible-iterator-adapters/">proposal</a> to add <code>try_</code> versions of iterator adapter methods: <code>try_filter</code>, etc.,
to enable them to work smoothly with <code>Result</code>-“colored” functions.
A method like <code>filter</code> or <code>map</code> also would need an adapter to work well with
<code>async</code>. If there were an abstract concept of monad, we could write
code with <code>filter</code>-like methods that could short-circuit on failure and
do the right thing with <code>await</code>:</p>
<pre tabindex="0"><code>vec![2, 3, 4]
    .iter()
    .filter_monad(|x| fallible_thing.contains(x)?)
    .filter_monad(|x| network_file_thing.contains(x).await?)
    .for_each_monad(|x| network_other_thing.send(x).await);
</code></pre><p>Perhaps Rust will someday gain this abstraction as well. I actually think
that would be good for Rust. Monads are hard to deal with conceptually,
and I’m not sure how to make them more user-accessible, but I think if
anyone could do it, it’s the Rust people, who’ve already done such a
good job so far at programming language design and maintenance.</p>
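<p>For contrast with the hypothetical <code>filter_monad</code> above, here is what the
<code>Result</code> half of this looks like in stable Rust today, as a small sketch
with made-up function names. Collecting into a <code>Result</code> stops at the first
error, but the filtering step has to pass errors through by hand:</p>
<pre tabindex="0"><code>use std::num::ParseIntError;

// `collect` into `Result<Vec<_>, _>` stops at the first `Err`.
fn parse_all(inputs: &[&str]) -> Result<Vec<i32>, ParseIntError> {
    inputs.iter().map(|text| text.parse::<i32>()).collect()
}

// Emulating a fallible filter: keep even numbers, and also keep every
// error so that `collect` can surface it afterwards.
fn parse_evens(inputs: &[&str]) -> Result<Vec<i32>, ParseIntError> {
    inputs
        .iter()
        .map(|text| text.parse::<i32>())
        .filter(|result| match result {
            Ok(number) => number % 2 == 0,
            Err(_) => true,
        })
        .collect()
}
</code></pre><p>This works, but the pattern has to be rebuilt for each adapter and each
“color,” which is the duplication a shared abstraction would remove.</p>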
Endianness, API Design, and Polymorphism in Rust https://www.thecodedmessage.com/posts/endian_polymorphism/ 2021-11-21T00:00:00+00:00
<p>I have been working on a serialization project recently that involves
endianness (also known as byte order). It led me to explore the parts
of the Rust standard library that deal with endianness, and I want to share my
thoughts about how endianness should be represented in a programming
language and its standard library. I think this is something
that Rust does better than C++, and it also makes for a good case study to
talk about API design and polymorphism in Rust.</p>
<p>To start with, let’s discuss endianness a little. I assume most of
my audience has some familiarity with endianness; nevertheless, I’d
like to explain it from first principles. That way, we can subsequently
apply the insights from the explanation to API design. That, and I want
practice explaining concepts, even if they are basic. I’ll try to keep
it interesting, but also feel free to skim the next section.</p>
<h2 id="big-end-little-end">Big End, Little End</h2>
<p>I first encountered the concept of endianness when I was first learning
to program using the <code>DEBUG.EXE</code> program on DOS. When a 16-bit value was
displayed as a 16-bit value, it was just normal hexadecimal, but when
it was displayed as two 8-bit bytes, something weird happened with the
display.</p>
<p>Here’s a <a href="endianness/display_bytes.cpp">C++ snippet</a> that demonstrates
the effect:</p>
<pre tabindex="0"><code>#include <cstdio>
#include <cstring>

template<typename T>
void display_bytes(const T &val) {
    // unsigned char avoids sign extension for bytes of 0x80 and above
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &val, sizeof(T));
    for (auto byte : bytes) {
        std::printf("%02x ", byte);
    }
    std::printf("\n");
}

int main() {
    int value = 0x12345678;
    std::printf("%x\n", value);
    display_bytes(value);
}
</code></pre><p>When run on any little-endian processor (the vast majority of processors),
we get:</p>
<pre tabindex="0"><code>12345678
78 56 34 12
</code></pre><p>The least significant byte is first, so if you print out the individual
bytes in order, you have to read it backwards – though each individual
byte is still forwards.</p>
<p>If you were to run the same code on a big-endian processor, you would get:</p>
<pre tabindex="0"><code>12345678
12 34 56 78
</code></pre><p>At this point, little endianness as I understood it was a weird thing
that Intel processors did for reasons I didn’t understand, that made me
do a little extra work when reading hex dumps. At the time, this was
fine: I thought that having to apply this extra arcane knowledge was
cool for its own sake. But also, I thought of little endianness as the
weird way to do things that required extra work, and big endianness as
the more natural design. It wasn’t until much later that I got some
nuance on that opinion.</p>
<p>See, the writing system we use for numbers is big endian. Instead of
dividing a number into bytes (base 256), we divide it into digits.
We consider the left of the page to come before the right of the page,
and we write the most significant digit first. This is all taught
explicitly in grade school:</p>
<pre tabindex="0"><code>1234 = 1*10^3 + 2*10^2 + 3*10^1 + 4*10^0
</code></pre><p>There’s a certain, very human logic to this system: the more important
information comes first, then the details. Mathematically, though,
we see decreasing numbers for the powers of 10: first we specify
a factor for 10 to the <em>third</em>, then to the <em>second</em>, then to the
<em>first</em>, then to the <em>zeroth</em>. We could instead imagine a world where
we wrote our decimal numbers little endian, where the same
number would be written <code>4321</code>, and still mean “one thousand
two hundred thirty-four,” where we’d count like this:</p>
<pre tabindex="0"><code>0
1
2
3
...
9
01
11
21
...
91
02
12
...
99
001
</code></pre><p>This would have the advantage that the first digit, digit zero,
would be multiplied by <code>10^0</code>, digit one by <code>10^1</code>. Not what
humans would normally decide to do, but it has a certain logic.
And if you think about languages like Hebrew or Arabic, which
are written from right to left, but which write numbers the same
direction we do, the least significant digit is actually reached
first in the normal direction of reading: when they see “100”
in the midst of the text, the zeroes are “before” the “1”. (I am
told that this is not how most people think of it; that they instead
just think of numbers as going the other way from other text, but
it just goes to demonstrate how based on convention all of this
stuff is).</p>
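<p>The imagined little endian decimal notation is easy to sketch in code. The
function names below are hypothetical. Producing the digits needs no reversal
at all, because repeated division by ten yields the least significant digit
first:</p>
<pre tabindex="0"><code>// Writes a number with the least significant decimal digit first,
// so 1234 becomes "4321".
fn to_little_endian_decimal(mut number: u32) -> String {
    if number == 0 {
        return String::from("0");
    }
    let mut digits = String::new();
    while number > 0 {
        // `number % 10` is the digit that multiplies 10^0, then 10^1, ...
        let digit = (number % 10) as u8;
        digits.push(char::from(b'0' + digit));
        number /= 10;
    }
    digits
}

// Reads the notation back. Returns `None` for empty input, for
// non-digits, or if the value does not fit in a `u32`.
fn from_little_endian_decimal(text: &str) -> Option<u32> {
    if text.is_empty() {
        return None;
    }
    let mut value: u32 = 0;
    // Walk from the most significant digit (the last character) down.
    for character in text.chars().rev() {
        let digit = character.to_digit(10)?;
        value = value.checked_mul(10)?.checked_add(digit)?;
    }
    Some(value)
}
</code></pre>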
<p>So all of this is to say that, the weird effect we had before
with big endian looking “normal” and little endian looking “weird”
has nothing to do with the intrinsic logic of big vs little endian,
but rather with the fact that we’re mixing a little endian processor
with a big endian writing notation. If we
<a href="https://www.thecodedmessage.com/endianness/backwards_hex.cpp">instead</a> were to print
the digits of each number in increasing significance – that is,
if we were to use little endian as our printing convention – we’d get:</p>
<pre tabindex="0"><code># Little Endian Machine
87654321
87 65 43 21
# Big Endian Machine
87654321
21 43 65 87
</code></pre><p>The mismatch between writing the whole word in hex and writing the
individual bytes in order in hex is caused by a mismatch between
the endianness of the system (normally little in practice) and
the endianness of the writing system (normally big in practice).
When the writing system isn’t a factor, little endian makes more
mathematical sense, is easier to reason about in circuitry and
code, and therefore has won out over big endian in every major processor
architecture.</p>
<p>The only real exception is network byte order, which uses big endian.
This is convenient for manually reading hex dumps of packets, but
probably has more to do with the fact that the Internet developed when
this question was much less settled. Due to the presence of network byte
order, however, and the fact that the endianness of Intel and modern
ARM is opposite of the endianness of most human writing systems, the
concept remains with us.</p>
<h2 id="when-is-endianness-relevant">When is endianness relevant?</h2>
<p>In writing numbers, a digit has no endianness: as a single-digit number,
<code>8</code> means the same thing under either convention. Similarly, a byte is indivisible in a processor.
Bytes are made up of bits, but outside of special instructions, the
ordering of those bits is not relevant. One of them is most significant,
one of them is least, but unless we’re indexing them for a special
instruction, or sending them over a wire one by one, there is no way to
say which such bit comes “first.”</p>
<p>Indeed, if we want to display a byte as a series of bits,
we as the programmer get to choose the endianness, and the
<a href="https://www.thecodedmessage.com/endianness/bits.rs">program</a> runs identically on either a big endian
or a little endian platform. The little endian version is a little
more intuitive, as “2 to the N” is an operation that’s easy to write
on computers, and in little endian the N increases as the index increases:</p>
<pre tabindex="0"><code>fn byte_as_bits_le(byte: u8) -> [u8; 8] {
    let mut res = [0u8; 8];
    for i in 0..8 {
        let mask = 1 << i;
        if byte & mask == 0 {
            res[i] = 0;
        } else {
            res[i] = 1;
        }
    }
    res
}
</code></pre><p>Nevertheless, at no point is this function relying on the endianness
of the hardware, and it does the same thing on either type of hardware.</p>
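<p>The big endian counterpart is the same loop with the shift amount flipped.
This is a sketch with a hypothetical name, mirroring <code>byte_as_bits_le</code>;
again, nothing in it depends on the endianness of the hardware:</p>
<pre tabindex="0"><code>fn byte_as_bits_be(byte: u8) -> [u8; 8] {
    let mut res = [0u8; 8];
    for i in 0..8 {
        // Index 0 holds the most significant bit, so the shift
        // decreases as the index increases.
        let mask = 1u8 << (7 - i);
        res[i] = if byte & mask == 0 { 0 } else { 1 };
    }
    res
}
</code></pre>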
<p>Why do I bring this up? Well, I don’t think it makes any sense to speak
of the endianness of a (multi-byte) word <em>per se</em>. The endianness of the
word only comes into play when it is stored as – and accessible as –
a series of bytes.</p>
<p>So from that point of view, what are the operations where endianness is
relevant? Given a word, what series of bytes comprises it? And then,
given a series of bytes, what word is it?</p>
<p>In Rust terms, these are <code>to_be_bytes</code> (for big endian)/<code>to_le_bytes</code> (for
little) in the one direction, and <code>from_be_bytes</code>/<code>from_le_bytes</code> in the
other. These methods are all bundled together in the Rust
<a href="https://doc.rust-lang.org/std/primitive.u32.html#method.to_be_bytes">documentation</a>
for – in this case – the primitive <code>u32</code> type, along with <code>ne</code> variants, which
use whatever the native endianness of the processor is.</p>
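<p>A short sketch of these methods in use. The helper <code>hex_bytes</code> is a
hypothetical name; the conversions themselves are the standard <code>u32</code>
methods, and their results do not depend on the machine running the code:</p>
<pre tabindex="0"><code>// Formats bytes as two-digit hex, separated by spaces.
fn hex_bytes(bytes: &[u8]) -> String {
    bytes
        .iter()
        .map(|byte| format!("{:02x}", byte))
        .collect::<Vec<String>>()
        .join(" ")
}

// Word to bytes: the same word yields different byte sequences.
fn little_endian_dump(value: u32) -> String {
    hex_bytes(&value.to_le_bytes())
}

fn big_endian_dump(value: u32) -> String {
    hex_bytes(&value.to_be_bytes())
}

// Bytes to word: the same bytes yield different words.
fn read_both_ways(bytes: [u8; 4]) -> (u32, u32) {
    (u32::from_le_bytes(bytes), u32::from_be_bytes(bytes))
}
</code></pre><p>Only the <code>ne</code> variants vary by platform: <code>to_ne_bytes</code> matches
<code>to_le_bytes</code> on a little endian target and <code>to_be_bytes</code> on a big endian one.</p>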
<p>These are the APIs I’m going to be discussing. But before discussing
how they might be improved, I’m going to point out an API that I think
doesn’t make as much sense:
<a href="https://doc.rust-lang.org/std/primitive.u32.html#method.to_be"><code>to_be</code></a>. This method takes in a word, a <code>u32</code>, and outputs
a <code>u32</code>, and yet claims to change the endianness of that word, which as
I mentioned, does not have endianness per se, only in that it’s represented
by bytes.</p>
<p>I know what they mean by it. On a little endian platform, it will replace
<code>0x12345678</code> with <code>0x78563412</code>. But what does that actually mean? In
its form as a <code>u32</code>, as I have argued above, a number has no endianness.
So what is this number <code>0x78563412</code>? It is the number that, if stored
in bytes in the native endianness, will store the original number in big
endian.</p>
<p>That’s a mouthful, I know, because it’s actually a complicated concept.
That is to say, it’s a hack. We want to write a number – say, <code>2000</code> –
in big endian, but we don’t want to think of it as bytes, yet. We want to
be able to load the whole number into a register, and when we write it,
we want it to be <code>2000</code> in big endian. So we byte swap the number, and
instead of storing <code>2000</code>, we store <code>3490119680</code>, so that if we
write it using the processor’s normal mechanism for writing, it comes out
to 2000 in big endian.</p>
<p>Basically, on a little endian platform, <code>to_be</code> does the equivalent of <code>u32::from_le_bytes(input.to_be_bytes())</code>, and using it looks like this:</p>
<pre tabindex="0"><code>let input: u32 = 2000;
// These two invocations do the same thing
let be = input.to_be();
let be2: u32 = u32::from_le_bytes(input.to_be_bytes());
println!("{} {}", be, be2); // 3490119680 for both
// The result can be written using native (little endian) byte
// order, and it will give 2000 in big endian byte order.
assert_eq!(be.to_le_bytes(), input.to_be_bytes());
</code></pre><p>This is arguably a useful hack – though I’m not fully convinced – but
it is definitely a hack. I do not think the description is sufficiently
rigorous. The output of <code>to_be</code> is not a number “in big endian,” it
is a different number that resembles the big endian representation of
the original number. The description is a simplification, and I think
a conceptually incoherent one – which is understandable because the
concept at play here is so hackish.</p>
<p>It appears that <code>to_be</code> was in Rust 1.0, and <code>to_be_bytes</code> was introduced
later. This to me is a good sign, as <code>to_be_bytes</code>, I think, makes
much more sense as an interface. And as to why we started out with the
<code>to_be</code> type of interface in Rust, that makes sense as well, because in
C the traditional (POSIX but not ANSI C) functions for these conversions
have similar semantics, such as <code>htonl</code> (host to network long), where
we have this conceit of storing a “big endian” or “network byte order”
value in a <code>uint32_t</code> (C for <code>u32</code>). This always struck me as the wrong
abstraction, but it is justified – or at least more understandable –
for C as we simply can’t pass around things like <code>char[4]</code> (C for <code>[u8; 4]</code>)
by value in C.</p>
<p>There are other technical and historical reasons why <code>htonl</code> and <code>to_be</code>
and friends exist, even if conceptually messy, but in any case, since I’m
talking about API design, and <code>to_be_bytes</code> and friends are a better match for
the concepts at hand, I am now going to pretend <code>to_be</code> is deprecated
(it is not), and move on to discussing the design of <code>to_be_bytes</code> and
<code>to_le_bytes</code>.</p>
<h2 id="policies">Policies</h2>
<p>So the first thing I notice is that there are six methods that deal
with fundamentally one topic:</p>
<pre tabindex="0"><code>from_be_bytes
from_le_bytes
from_ne_bytes
to_be_bytes
to_le_bytes
to_ne_bytes
</code></pre><p>But really they vary in two ways, namely:</p>
<ul>
<li>which operation is performed (<code>from_X_bytes</code> vs <code>to_X_bytes</code>)</li>
<li>which endianness is required (<code>le</code>, <code>be</code>, and <code>ne</code>)</li>
</ul>
<p>For us humans, this is clear from the names, but to the compiler,
these names do not form a pattern that it is capable of recognizing.
There are simply 6 separate functions named with 6 separate combinations
of characters.</p>
<p>Now, having separate functions for separate operations makes sense;
that’s what functions are for. But for the same operation but with
different endianness, it might make more sense to indicate that
it is one operation with several possible endiannesses by making
the endianness into a parameter.</p>
<p>The obvious way to do this would be via run-time parameter. A fairly
literal translation of this API would be something like:</p>
<pre tabindex="0"><code>enum Endian {
    Little,
    Big,
    Native,
}

// Hypothetical standard library code: user code cannot add an
// inherent `impl` block to `u32`.
impl u32 {
    fn to_endian_bytes(self, endianness: Endian) -> [u8; 4] {
        match endianness {
            // Assumes a little endian target, where native means little.
            Endian::Little | Endian::Native => { ... }
            Endian::Big => { ... }
        }
    }
    fn from_endian_bytes(bytes: [u8; 4], endianness: Endian) -> Self {
        match endianness {
            Endian::Little | Endian::Native => { ... }
            Endian::Big => { ... }
        }
    }
}
</code></pre><p>This would also allow us to implement the concept of “native” byte
order a little differently, and create more names for byte orders:</p>
<pre tabindex="0"><code>enum Endian {
    Little,
    Big,
}
static NATIVE_ENDIAN: Endian = Endian::Little;
static NETWORK_ENDIAN: Endian = Endian::Big;
</code></pre><p>So, besides simplifying away the need for a separate implementation
for the <code>ne</code> functions, and making the code more in sync with what’s
happening, what other positive things have we accomplished? Well,
given that we now have a parameter, we can now make more complicated
code parametric on it. Imagine we have an entire structure to write out,
and we want to write the entire structure as big-endian or little-endian,
perhaps because the protocol in question changed endianness at some version.
Or perhaps we just want to make clear to the reader that one endianness
is used for the entire structure. We can now do something like this:</p>
<pre tabindex="0"><code>struct Structure {
    a: u32,
    b: u32,
    c: u32,
}

impl Structure {
    // Passing `endianness` three times assumes `Endian` is `Copy`.
    fn serialize(&self, endianness: Endian) -> [u8; 12] {
        let mut res = [0u8; 12];
        res[0..4].copy_from_slice(&self.a.to_endian_bytes(endianness));
        res[4..8].copy_from_slice(&self.b.to_endian_bytes(endianness));
        res[8..12].copy_from_slice(&self.c.to_endian_bytes(endianness));
        res
    }
    pub fn serialize_old_version(&self) -> [u8; 12] {
        self.serialize(Endian::Big)
    }
    pub fn serialize_new_version(&self) -> [u8; 12] {
        self.serialize(Endian::Little)
    }
}
</code></pre><p>The alternative would be to write two separate serializers, and duplicate
all the logic of how to arrange the layout. Duplication is bad, because
bug fixes don’t necessarily get to all the duplicate copies. So, to
save on duplication, we’d have to basically wrap <code>to_be_bytes</code> and
<code>to_le_bytes</code> in a version of this; it would be more convenient
if the standard library had done this for us.</p>
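<p>Such a wrapper can be written today as free functions over the existing
methods. The following is a minimal sketch: <code>Endian</code>,
<code>u32_to_endian_bytes</code> and <code>u32_from_endian_bytes</code> are hypothetical names
and not part of the standard library:</p>
<pre tabindex="0"><code>#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Endian {
    Little,
    Big,
}

// Run-time dispatch over the two existing standard library methods.
fn u32_to_endian_bytes(value: u32, endianness: Endian) -> [u8; 4] {
    match endianness {
        Endian::Little => value.to_le_bytes(),
        Endian::Big => value.to_be_bytes(),
    }
}

fn u32_from_endian_bytes(bytes: [u8; 4], endianness: Endian) -> u32 {
    match endianness {
        Endian::Little => u32::from_le_bytes(bytes),
        Endian::Big => u32::from_be_bytes(bytes),
    }
}

struct Structure {
    a: u32,
    b: u32,
    c: u32,
}

impl Structure {
    // The layout logic exists once; only the byte order varies.
    fn serialize(&self, endianness: Endian) -> [u8; 12] {
        let mut res = [0u8; 12];
        res[0..4].copy_from_slice(&u32_to_endian_bytes(self.a, endianness));
        res[4..8].copy_from_slice(&u32_to_endian_bytes(self.b, endianness));
        res[8..12].copy_from_slice(&u32_to_endian_bytes(self.c, endianness));
        res
    }
}
</code></pre>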
<p>What is the downside of this? Well, the implementation didn’t really
get any simpler. Actually, in the normal case, where you don’t change
your mind about endianness, the implementation got more complicated.
We now have a <code>match</code> expression in our two simplified functions,
which theoretically indicates a run-time decision. We could trust
the optimizer to fold the decision in through inlining and
constant-propagation, but trusting the optimizer is suspicious
and unnecessary.</p>
<p>Nothing we’ve done so far requires this decision to be made at
run-time, and so we can instead make the decision at compile-time.
Where we had a run-time parameter, we can now have a compile-time
parameter.</p>
<p>Now, although Rust has rudimentary support for other kinds of
compile-time parameters, the archetypical compile-time parameter
is a type, bound by a trait. Our <code>enum</code> from before would then
have to be lifted into the type space, as a <code>trait</code> and a few
types:</p>
<pre tabindex="0"><code>trait Endianness { }
struct BigEndian;
impl Endianness for BigEndian { }
struct LittleEndian;
impl Endianness for LittleEndian { }
</code></pre><p>Now, we need a compile-time equivalent for <code>match</code>. This is a
little harder, as at the time of this writing stable Rust does
not have the most direct equivalent of <code>match</code> for implementors
of <code>trait</code>s, that is, “specialization.” But Rust does allow something
similar: the code for each branch of the <code>match</code> must go in each type’s
implementation of that trait, and the fact of the <code>match</code> must be
provided in the trait itself.</p>
<p>This will also help us simplify the implementation. This <code>to_be_bytes</code>/
<code>to_le_bytes</code> API is not just implemented for <code>u32</code>, but for all primitive
types. Currently, these mostly-similar implementations are stamped out
by a macro, along with other methods for primitive types. But we might
imagine that there are two things going on in the implementation:</p>
<ul>
<li>write out the type into an array of bytes</li>
<li>either swap the bytes, or not, based on whether we’re using the hardware
endianness</li>
</ul>
<p>We could then make the trait come into play – with the decision made at
compile time – for the swapping part.</p>
<pre tabindex="0"><code>trait Endianness {
    fn possibly_swap(bytes: &mut [u8]);
}
struct BigEndian;
impl Endianness for BigEndian {
    fn possibly_swap(bytes: &mut [u8]) {
        // actually swap here (assuming little-endian hardware)
        bytes.reverse();
    }
}
struct LittleEndian;
impl Endianness for LittleEndian {
    fn possibly_swap(_: &mut [u8]) {
        // no need to do anything here
    }
}
</code></pre><p>We have now moved some of the implementation into a trait,
where the specifics of the implementation are determined by which type
implements that trait. This is an example of the policy pattern, where
a portion of the code is abstracted out into a policy, and the policy
and the main body of the function are sewn together – in this case,
at compile-time – into many variations of a function that execute
similarly to what an implementor might have written by hand.</p>
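<p>To make the two steps concrete, here is a minimal, self-contained sketch of the policy being sewn into a conversion function. The free function <code>u32_to_endian_bytes</code> is a hypothetical stand-in for the <code>to_endian_bytes</code> method used in this post, and the <code>cfg!</code> checks are my own addition so that the sketch behaves the same on either kind of hardware.</p>

```rust
/// The policy: how to turn hardware-order bytes into the wanted order.
trait Endianness {
    fn possibly_swap(bytes: &mut [u8]);
}

struct BigEndian;
struct LittleEndian;

impl Endianness for BigEndian {
    fn possibly_swap(bytes: &mut [u8]) {
        // Swap only if the hardware order is not already big-endian.
        if cfg!(target_endian = "little") {
            bytes.reverse();
        }
    }
}

impl Endianness for LittleEndian {
    fn possibly_swap(bytes: &mut [u8]) {
        // Swap only if the hardware order is not already little-endian.
        if cfg!(target_endian = "big") {
            bytes.reverse();
        }
    }
}

/// Hypothetical free-function version of `to_endian_bytes::<T>()` for `u32`.
fn u32_to_endian_bytes<T: Endianness>(val: u32) -> [u8; 4] {
    let mut bytes = val.to_ne_bytes(); // step 1: write out in hardware order
    T::possibly_swap(&mut bytes); // step 2: the policy decides whether to swap
    bytes
}
```

<p>Calling <code>u32_to_endian_bytes::&lt;BigEndian&gt;(0x12345678)</code> should give the same bytes as the standard <code>to_be_bytes</code>, and the <code>LittleEndian</code> instantiation the same bytes as <code>to_le_bytes</code>; the compiler generates a separate function for each policy.</p>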
<p>Note that there is no possibility of doing run-time endianness
determination in this version. This trait method does not take a
<code>self</code> parameter, and would have to be invoked as <code>T::possibly_swap</code>.
This is possible in Rust because we are doing compile-time polymorphism,
not run-time, so there is no need to make this trait object-safe.</p>
<p>Our previous example serializer, with the two versions, now looks
something like this:</p>
<pre tabindex="0"><code>struct Structure {
    a: u32,
    b: u32,
    c: u32,
}
impl Structure {
    fn serialize<T: Endianness>(&self) -> [u8; 12] {
        let mut res = [0u8; 12];
        res[0..4].copy_from_slice(&self.a.to_endian_bytes::<T>());
        res[4..8].copy_from_slice(&self.b.to_endian_bytes::<T>());
        res[8..12].copy_from_slice(&self.c.to_endian_bytes::<T>());
        res
    }
    pub fn serialize_old_version(&self) -> [u8; 12] {
        self.serialize::<BigEndian>()
    }
    pub fn serialize_new_version(&self) -> [u8; 12] {
        self.serialize::<LittleEndian>()
    }
}
</code></pre><p>The policy pattern is a fairly common pattern in generic programming
just like in object-oriented programming, but when generic programming
is implemented through monomorphization, as it is in Rust, it can be
just as efficient as hand-implementing the combinations of policy
and code, while allowing for more policies.</p>
<p>For example, if there were a platform where 4-byte chunks were split
into 2-byte chunks little endian, but 2-byte chunks were split into
1-byte chunks big endian, we could write a new policy for this platform
and all the existing code would support it.</p>
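<p>As a rough illustration of that hypothetical platform, here is a sketch with a third policy added next to the usual two. To keep it independent of the machine it runs on, this version has the policy rearrange bytes that start out in big-endian order rather than hardware order; the names <code>arrange</code>, <code>MixedEndian</code> and <code>u32_to_endian_bytes</code> are made up for this example.</p>

```rust
/// Policy: rearrange bytes given in big-endian order into the policy's order.
trait Endianness {
    fn arrange(bytes: &mut [u8]);
}

struct BigEndian;
struct LittleEndian;
/// Hypothetical platform: 2-byte chunks ordered little-endian,
/// bytes inside each 2-byte chunk ordered big-endian.
struct MixedEndian;

impl Endianness for BigEndian {
    fn arrange(_: &mut [u8]) {
        // already in big-endian order
    }
}

impl Endianness for LittleEndian {
    fn arrange(bytes: &mut [u8]) {
        bytes.reverse();
    }
}

impl Endianness for MixedEndian {
    fn arrange(bytes: &mut [u8]) {
        // Reversing everything puts the chunks in little-endian order...
        bytes.reverse();
        // ...and swapping inside each pair restores big-endian order within a chunk.
        for pair in bytes.chunks_exact_mut(2) {
            pair.swap(0, 1);
        }
    }
}

/// Generic code written once; it does not need to know about `MixedEndian`.
fn u32_to_endian_bytes<T: Endianness>(val: u32) -> [u8; 4] {
    let mut bytes = val.to_be_bytes();
    T::arrange(&mut bytes);
    bytes
}
```

<p>With this policy, <code>0x12345678</code> comes out as the bytes <code>56 78 12 34</code>, and <code>u32_to_endian_bytes</code> itself did not have to change to support it.</p>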
<p>A much more complicated example of the policy pattern is <code>serde</code>,
where the generated serializers and deserializers for each structure
are all polymorphic on what serialization format should be used. If a
new serialization format comes out with <code>serde</code> support, all existing
<code>Serialize</code> instances can then be used with the new format without
modification.</p>
<p>Now, in practice, there are often processor instructions that do byte
swaps. The hardware uses an interface analogous to the hackish, conceptually
messy <code>to_be()</code>, which at a hardware level makes sense because elegance of
abstraction is not as important a goal as performance. This converts
<code>0x12345678</code> into <code>0x78563412</code>, and similar. So, this implementation
is not actually what the policy would look like in a production context.
Nevertheless, the endianness argument could definitely be passed in by
a trait-constrained type parameter; the implementation would just be
more complicated.</p>
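<p>For a sense of what that might look like, here is a sketch where the policy works on the integer itself, using the standard <code>to_be</code>, <code>to_le</code> and <code>swap_bytes</code> methods (which typically compile down to a byte-swap instruction where one exists). The trait method <code>convert</code> and the function <code>u32_to_endian_bytes</code> are names I made up for this example, not a real API.</p>

```rust
/// Policy: convert a hardware-order integer so that its in-memory
/// bytes end up in the policy's byte order.
trait Endianness {
    fn convert(val: u32) -> u32;
}

struct BigEndian;
struct LittleEndian;

impl Endianness for BigEndian {
    fn convert(val: u32) -> u32 {
        val.to_be() // swaps on little-endian hardware, no-op on big-endian
    }
}

impl Endianness for LittleEndian {
    fn convert(val: u32) -> u32 {
        val.to_le() // swaps on big-endian hardware, no-op on little-endian
    }
}

/// Hypothetical free-function version of `to_endian_bytes::<T>()` for `u32`.
fn u32_to_endian_bytes<T: Endianness>(val: u32) -> [u8; 4] {
    T::convert(val).to_ne_bytes()
}

/// The unconditional swap described in the text.
fn swapped(val: u32) -> u32 {
    val.swap_bytes()
}
```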
<h2 id="traits">Traits</h2>
<p>I mentioned before that <code>u32</code> is not the only type that implements this
set of methods, this convention, this informal protocol of <code>to_be_bytes</code>,
<code>to_le_bytes</code>, etc. This means that if we were writing in C++, we
would have enough from this informal protocol to write a function that
did something like “write this value in big endian twice, and little
endian twice, to different locations” that was agnostic to the type
provided, as long as it implemented this informal interface. It would
look something like this:</p>
<pre tabindex="0"><code>template <typename T>
void write_four_times(T val) {
    write_to_location_1(val.to_be_bytes());
    write_to_location_2(val.to_be_bytes());
    write_to_location_3(val.to_le_bytes());
    write_to_location_4(val.to_le_bytes());
}
</code></pre><p>This would allow you to call <code>write_four_times</code> on any type for
which that code made sense, as C++ templates are literally templates,
and the <code>T</code> is filled in before type-checking. The protocol here
is implicit in the structure of the function – it is compile-time
<a href="https://en.wikipedia.org/wiki/Duck_typing#Templates_or_generic_types">duck typing</a>.</p>
<p>Rust generic functions are type-checked before monomorphization, so
we can’t do this in Rust. Instead of defining <code>to_le_bytes()</code> and
friends separately on each type, this function would require them
to be in a trait, maybe <code>EndianBytes</code>:</p>
<pre tabindex="0"><code>fn write_four_times<T: EndianBytes>(val: T) {
    write_to_location_1(&val.to_be_bytes());
    write_to_location_2(&val.to_be_bytes());
    write_to_location_3(&val.to_le_bytes());
    write_to_location_4(&val.to_le_bytes());
}
</code></pre><p><code>EndianBytes</code> would have to define at least those methods:</p>
<pre tabindex="0"><code>trait EndianBytes {
    fn to_be_bytes(self) -> [u8; ???];
    fn to_le_bytes(self) -> [u8; ???];
}
</code></pre><p>Unfortunately, as the <code>???</code> shows, the different output arrays have
different lengths – a <code>u16</code> would be 2 bytes and a <code>u64</code> 8 bytes –
and so the Rust trait system at the time of this writing is (to my
knowledge) not powerful enough to represent this trait as is. Instead,
it would have to return a slice, which introduces an additional run-time
value (the length) into the mix that we’d rather avoid in this
exercise on compile-time generic programming.</p>
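<p>For illustration only, one possible shape of the slice-returning version is sketched below. It assumes the caller supplies a scratch buffer, and the method names <code>write_be_bytes</code> and <code>write_le_bytes</code> are invented here so they do not clash with the standard inherent methods. Note how the length of the returned slice is now a run-time value.</p>

```rust
/// Hypothetical slice-returning variant of the `EndianBytes` trait.
/// Each method writes into `buf` and returns the part it filled.
/// Panics if `buf` is shorter than the type being written.
trait EndianBytes: Copy {
    fn write_be_bytes(self, buf: &mut [u8]) -> &[u8];
    fn write_le_bytes(self, buf: &mut [u8]) -> &[u8];
}

// Stamp out the mostly-identical implementations with a macro.
macro_rules! impl_endian_bytes {
    ($($t:ty),*) => {$(
        impl EndianBytes for $t {
            fn write_be_bytes(self, buf: &mut [u8]) -> &[u8] {
                let n = std::mem::size_of::<$t>();
                buf[..n].copy_from_slice(&self.to_be_bytes());
                &buf[..n]
            }
            fn write_le_bytes(self, buf: &mut [u8]) -> &[u8] {
                let n = std::mem::size_of::<$t>();
                buf[..n].copy_from_slice(&self.to_le_bytes());
                &buf[..n]
            }
        }
    )*};
}

impl_endian_bytes!(u16, u32, u64);

/// Generic over any `EndianBytes` type; returns both encodings.
fn be_and_le<T: EndianBytes>(val: T) -> (Vec<u8>, Vec<u8>) {
    let mut buf = [0u8; 8]; // large enough for the types implemented above
    let be = val.write_be_bytes(&mut buf).to_vec();
    let le = val.write_le_bytes(&mut buf).to_vec();
    (be, le)
}
```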
<h2 id="run-time-endianness">Run-Time Endianness</h2>
<p>What if we want to make decisions about endianness at
run-time, say, because we are implementing DBus? This is,
as Linus Torvalds pointed out in one of his famously angry
<a href="https://lkml.org/lkml/2015/4/22/628">emails</a>, a stupid idea for
a protocol, but we don’t always get to choose what protocol we
implement. Even though choosing one endianness and sticking to it would
have avoided the run-time cost of making a decision (which as Torvalds
points out is more than the cost of either decision), the developers
of DBus did not do that. UTF-16 also didn’t – it also does run-time
endianness adjustment with a sentinel character at the top of the text
block to indicate the endianness.</p>
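<p>As a small sketch of what that run-time adjustment looks like for UTF-16, the function below inspects the sentinel (the byte order mark, U+FEFF) and picks a decoder accordingly. The function name is made up, and as a simplifying assumption of this sketch, input without a byte order mark is rejected and a trailing odd byte is ignored.</p>

```rust
/// Decodes UTF-16 text whose first two bytes are a byte order mark.
/// Returns `None` if the mark is missing or the text is not valid UTF-16.
fn decode_utf16_with_bom(bytes: &[u8]) -> Option<String> {
    let (bom, rest) = (bytes.get(..2)?, bytes.get(2..)?);

    // The run-time decision: choose how to read each 16-bit unit.
    let read: fn([u8; 2]) -> u16 = match bom {
        [0xFE, 0xFF] => u16::from_be_bytes,
        [0xFF, 0xFE] => u16::from_le_bytes,
        _ => return None,
    };

    // `chunks_exact` silently drops a trailing odd byte.
    let units: Vec<u16> = rest
        .chunks_exact(2)
        .map(|pair| read([pair[0], pair[1]]))
        .collect();

    String::from_utf16(&units).ok()
}
```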
<p>The most obvious solution is to use the run-time parameterized
version we discussed towards the beginning of this post, and
have an <code>enum Endianness</code> parameter. This would be parsed in each
message (or connection, or whatever duration of time endianness
is configured) and then passed through to all the serializing and
deserializing code, which would look something like our
original serialization example:</p>
<pre tabindex="0"><code>fn serialize(&self, endianness: Endian) -> [u8; 12] {
    let mut res = [0u8; 12];
    res[0..4].copy_from_slice(&self.a.to_endian_bytes(endianness));
    res[4..8].copy_from_slice(&self.b.to_endian_bytes(endianness));
    res[8..12].copy_from_slice(&self.c.to_endian_bytes(endianness));
    res
}
pub fn serialize_old_version(&self) -> [u8; 12] {
    self.serialize(Endian::Big)
}
pub fn serialize_new_version(&self) -> [u8; 12] {
    self.serialize(Endian::Little)
}
</code></pre><p>We can do better than that, though. This has one copy of the serialization
code in the source, and one copy in the binary. What we could do instead
is expand the more sophisticated compile-time version of the serialization
code, and move the <code>match</code> into a wrapper serialize method:</p>
<pre tabindex="0"><code>fn serialize_impl<T: EndiannessTrait>(&self) -> [u8; 12] {
    let mut res = [0u8; 12];
    res[0..4].copy_from_slice(&self.a.to_endian_bytes::<T>());
    res[4..8].copy_from_slice(&self.b.to_endian_bytes::<T>());
    res[8..12].copy_from_slice(&self.c.to_endian_bytes::<T>());
    res
}
pub fn serialize(&self, endianness: EndiannessEnum) -> [u8; 12] {
    match endianness {
        EndiannessEnum::Big => self.serialize_impl::<BigEndian>(),
        EndiannessEnum::Little => self.serialize_impl::<LittleEndian>(),
    }
}
</code></pre><p>This generates two serializers from one serializer function (thus
mitigating the biggest problem with code duplication – that of
maintainability), and makes the run-time decision further up
in the call tree. This ability – to adjust between finer-grained
run-time decisions and duplication of run-time code – is one of
the greatest powers of C++ and of Rust. We can effectively – in the DBus
case – create two entire DBus deserializers – one for little-endian,
one for big-endian – and then decide between the two deserializers
at run-time on a per-message basis, which, because fewer run-time
decisions are being made, will be much more efficient than making the
run-time deserialization decision at every deserialization site.</p>
<p>Of course, for serialization we can simply write one serializer and
always generate little-endian DBus messages.</p>
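<p>Pulling the pieces together, a self-contained version of the wrapper approach might look like the sketch below. Since the <code>to_endian_bytes::&lt;T&gt;()</code> method on <code>u32</code> is hypothetical, this sketch puts a <code>u32_bytes</code> function (also a made-up name) directly on the policy trait and delegates to the standard <code>to_be_bytes</code> and <code>to_le_bytes</code>.</p>

```rust
#[derive(Clone, Copy)]
enum EndiannessEnum {
    Big,
    Little,
}

/// Compile-time policy; `u32_bytes` is a name invented for this sketch.
trait EndiannessTrait {
    fn u32_bytes(val: u32) -> [u8; 4];
}

struct BigEndian;
struct LittleEndian;

impl EndiannessTrait for BigEndian {
    fn u32_bytes(val: u32) -> [u8; 4] {
        val.to_be_bytes()
    }
}

impl EndiannessTrait for LittleEndian {
    fn u32_bytes(val: u32) -> [u8; 4] {
        val.to_le_bytes()
    }
}

struct Structure {
    a: u32,
    b: u32,
    c: u32,
}

impl Structure {
    /// Written once in the source, monomorphized once per policy.
    fn serialize_impl<T: EndiannessTrait>(&self) -> [u8; 12] {
        let mut res = [0u8; 12];
        res[0..4].copy_from_slice(&T::u32_bytes(self.a));
        res[4..8].copy_from_slice(&T::u32_bytes(self.b));
        res[8..12].copy_from_slice(&T::u32_bytes(self.c));
        res
    }

    /// The single run-time decision, made once per call.
    pub fn serialize(&self, endianness: EndiannessEnum) -> [u8; 12] {
        match endianness {
            EndiannessEnum::Big => self.serialize_impl::<BigEndian>(),
            EndiannessEnum::Little => self.serialize_impl::<LittleEndian>(),
        }
    }
}
```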
C++ Move Semantics Considered Harmful (Rust is better) https://www.thecodedmessage.com/posts/cpp-move/ 2021-11-03T00:00:00+00:00
<p>This post is part of my <a href="https://www.thecodedmessage.com/tags/rust-vs-c++/">series</a>
comparing C++ to Rust, which I introduced
with a <a href="https://www.thecodedmessage.com/posts/hello-rust/">discussion of C++ and Rust syntax</a>. In
this post, I discuss move semantics. This post is framed around the
way moves are implemented in C++, and the fundamental problem with
that implementation. With that context, I shall then explain how Rust
implements the same feature. I know that move semantics in Rust are often
confusing to new Rustaceans – though not as confusing as move semantics
in C++ – and I think an exploration of how move semantics work in C++
can be helpful in understanding why Rust is designed the way it is,
and why Rust is a better alternative to C++.</p>
<p>I am far from the first person to discuss this topic, but I intend:</p>
<ul>
<li>to discuss it thoroughly enough to contribute to the conversation</li>
<li>to nevertheless discuss it in such a way that those
familiar with systems programming, but unfamiliar with either C++
or move semantics, can understand it, starting from first principles</li>
</ul>
<h2 id="modern-c">Modern C++</h2>
<p>First, some background.</p>
<p>In 2011, C++ finally fixed a set of long-standing deficits in the
programming language with the shiny new C++11 standard, bringing it into
the modern era. Programmers enthusiastically pushed their companies to
allow them to migrate their codebases, champing at the bit to be able to
use these new features. Writers to this day talk about “modern C++,”
with the cut-off being 2011. Programmers who only used C++ pre-C++11 are
told that it is a new programming language, the best version of its old
self, worth a complete fresh try.</p>
<p>There were a lot of new features to be excited about. C++ standard
threads were added then – and thread standardization was indeed good,
though anyone who wanted to use threads before likely had their choice
of good libraries for their platform. Closures were also very exciting,
especially for people like me who came from functional programming, but
to be honest, closures were just syntactic sugar for existing patterns
of boilerplate that could be readily used to write function objects.</p>
<p>Indeed, the real excitement at the time, and certainly the feature my colleagues
and I were most excited about, was move semantics. To explain why this
feature was so important, I’ll need to talk a little about the C++
object model, and the problem that move semantics exist to solve.</p>
<h2 id="value-semantics">Value Semantics</h2>
<p>Let’s start by talking about a primitive type in C++: <code>int</code>. Objects –
in C++ standard parlance, <code>int</code> values are indeed considered objects –
of type <code>int</code> only take up a few bytes of storage, and so copying them
has always been very cheap. When you assign an <code>int</code> from one variable
to another, it is copied. When you pass it to a function, it is copied:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span><span style="color:#66d9ef">void</span> <span style="color:#a6e22e">print_i</span>(<span style="color:#66d9ef">int</span> arg) {
</span></span><span style="display:flex;"><span> arg <span style="color:#f92672">+=</span> <span style="color:#ae81ff">3</span>;
</span></span><span style="display:flex;"><span> std<span style="color:#f92672">::</span>cout <span style="color:#f92672"><<</span> arg <span style="color:#f92672"><<</span> std<span style="color:#f92672">::</span>endl;
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">int</span> foo <span style="color:#f92672">=</span> <span style="color:#ae81ff">3</span>;
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">int</span> bar <span style="color:#f92672">=</span> foo; <span style="color:#75715e">// copy
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>foo <span style="color:#f92672">+=</span> <span style="color:#ae81ff">1</span>; <span style="color:#75715e">// foo gets 4
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>std<span style="color:#f92672">::</span>cout <span style="color:#f92672"><<</span> bar <span style="color:#f92672"><<</span> std<span style="color:#f92672">::</span>endl; <span style="color:#75715e">// bar is still 3
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>print_i(foo); <span style="color:#75715e">// prints 4+3 ==> 7
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>std<span style="color:#f92672">::</span>cout <span style="color:#f92672"><<</span> foo <span style="color:#f92672"><<</span> std<span style="color:#f92672">::</span>endl; <span style="color:#75715e">// foo is still 4
</span></span></span></code></pre></div><p>As you can see, each variable of type <code>int</code> acts independently of the
others when mutated, which is how primitive types like <code>int</code> work in many
programming languages.</p>
<p>In the C++ version of object-oriented programming, it was decided that
values of custom, user-defined types would have the same semantics, that
they would work the same way as the primitive types. So for C++ strings:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span>std<span style="color:#f92672">::</span>string foo <span style="color:#f92672">=</span> <span style="color:#e6db74">"foo"</span>;
</span></span><span style="display:flex;"><span>std<span style="color:#f92672">::</span>string bar <span style="color:#f92672">=</span> foo; <span style="color:#75715e">// copy (!)
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>foo <span style="color:#f92672">+=</span> <span style="color:#e6db74">"__"</span>;
</span></span><span style="display:flex;"><span>bar <span style="color:#f92672">+=</span> <span style="color:#e6db74">"!!"</span>;
</span></span><span style="display:flex;"><span>std<span style="color:#f92672">::</span>cout <span style="color:#f92672"><<</span> foo <span style="color:#f92672"><<</span> std<span style="color:#f92672">::</span>endl; <span style="color:#75715e">// foo is "foo__"
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>std<span style="color:#f92672">::</span>cout <span style="color:#f92672"><<</span> bar <span style="color:#f92672"><<</span> std<span style="color:#f92672">::</span>endl; <span style="color:#75715e">// bar is "foo!!"
</span></span></span></code></pre></div><p>This means that whenever we assign a string to a new variable, or
pass it to a function, a copy is made. This is important, because
the <code>std::string</code> object proper is just a handle, a small structure
that manages a larger memory allocation on the heap, where the
actual string data is stored. Each new <code>std::string</code> that is made
via copy requires a new heap allocation, a relatively
expensive operation.</p>
<p>This would cause a problem when we want to pass a <code>std::string</code> to a
function, just like an <code>int</code>, but don’t want to actually make a copy
of it. But C++ has a feature that helps with that: <code>const</code> references.
Details of the C++ reference system are a topic for another post, but
<code>const</code> references allow a function to operate on the <code>std::string</code>
without the need for a copy, but still promising not to change the
original value.</p>
<p>The feature is available for both <code>int</code> and <code>std::string</code>; the principle
that they’re treated the same is preserved. But for the sake of performance,
<code>int</code>s are passed by value, and <code>std::string</code>s are passed by <code>const</code>
reference in the same situation. This dilutes the benefit
of treating them the same, as in practice the function signatures
are different if we don’t want to trigger spurious expensive deep copies:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span><span style="color:#66d9ef">void</span> <span style="color:#a6e22e">foo</span>(<span style="color:#66d9ef">int</span> bar);
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">void</span> <span style="color:#a6e22e">foo</span>(<span style="color:#66d9ef">const</span> std<span style="color:#f92672">::</span>string <span style="color:#f92672">&</span>bar);
</span></span></code></pre></div><p>If you instead declare the function <code>foo</code> like you would with an <code>int</code>,
you get a poorly performing deep copy. The default is something you
probably don’t want:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span><span style="color:#66d9ef">void</span> <span style="color:#a6e22e">foo</span>(std<span style="color:#f92672">::</span>string bar);
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">void</span> <span style="color:#a6e22e">foo2</span>(<span style="color:#66d9ef">const</span> std<span style="color:#f92672">::</span>string <span style="color:#f92672">&</span>bar);
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span>std<span style="color:#f92672">::</span>string bar(<span style="color:#e6db74">"Hi"</span>); <span style="color:#75715e">// Make one heap allocation
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>foo(bar); <span style="color:#75715e">// Make another heap allocation
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>foo2(bar); <span style="color:#75715e">// No copy is made
</span></span></span></code></pre></div><p>This is all part of “pre-modern” C++, but already we’re seeing negative
consequences of the decision to treat <code>int</code> and <code>std::string</code> as identical
when they are not, a decision that will get more gnarly when applied to
moves. This is why Rust has the <code>Copy</code> trait to mark types like <code>i32</code>
(the Rust equivalent of <code>int</code>) as being copyable, so that they can be
passed around freely, while requiring an explicit call to <code>clone()</code>
for types like <code>String</code> so we know we’re paying the cost of a deep copy,
or else an explicit indication that we’re passing by reference:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">foo</span>(bar: String) {
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// Implementation
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">foo2</span>(bar: <span style="color:#66d9ef">&</span><span style="color:#66d9ef">str</span>) {
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// Implementation
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">let</span> bar <span style="color:#f92672">=</span> <span style="color:#e6db74">"hi"</span>.to_string();
</span></span><span style="display:flex;"><span>foo(bar.clone());
</span></span><span style="display:flex;"><span>foo2(<span style="color:#f92672">&</span>bar);
</span></span></code></pre></div><p>The third option in Rust is to move, but we’ll discuss that after
we discuss moves in C++.</p>
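<p>To see the <code>Copy</code> versus <code>clone()</code> distinction in action, here is a small sketch mirroring the earlier <code>int</code> and <code>std::string</code> examples. The function names are made up for illustration.</p>

```rust
/// `i32` is `Copy`, so the caller's value is copied in implicitly.
fn add_three(mut arg: i32) -> i32 {
    arg += 3; // mutates the callee's private copy only
    arg
}

/// Returns (foo, foo + 3, original string, modified duplicate).
fn copy_vs_clone() -> (i32, i32, String, String) {
    let foo: i32 = 4;
    let sum = add_three(foo); // implicit copy; `foo` is still usable afterwards

    let original = String::from("foo");
    let mut duplicate = original.clone(); // explicit, visible deep copy
    duplicate += "!!"; // does not affect `original`

    (foo, sum, original, duplicate)
}
```

<p>The deep copy still happens, but only where the source code says <code>clone()</code>.</p>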
<h2 id="copy-deletes-and-moves">Copy-Deletes and Moves</h2>
<p>C++ value semantics break down even more when we do need the
function to hold onto the value. References are only valid as long
as the original value is valid, and sometimes a function needs it
to stay alive longer. Taking by reference is not an option when
the object (whether <code>int</code> or <code>std::string</code>) is being added to a vector
that will outlive the original object:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span>std<span style="color:#f92672">::</span>vector<span style="color:#f92672"><</span><span style="color:#66d9ef">int</span><span style="color:#f92672">></span> vi;
</span></span><span style="display:flex;"><span>std<span style="color:#f92672">::</span>vector<span style="color:#f92672"><</span>std<span style="color:#f92672">::</span>string<span style="color:#f92672">></span> vs;
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">int</span> foo <span style="color:#f92672">=</span> <span style="color:#ae81ff">3</span>;
</span></span><span style="display:flex;"><span> foo <span style="color:#f92672">+=</span> <span style="color:#ae81ff">4</span>;
</span></span><span style="display:flex;"><span> vi.push_back(foo);
</span></span><span style="display:flex;"><span>} <span style="color:#75715e">// foo goes out of scope, vi lives on
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>{
</span></span><span style="display:flex;"><span> std<span style="color:#f92672">::</span>string bar <span style="color:#f92672">=</span> <span style="color:#e6db74">"Hi!"</span>;
</span></span><span style="display:flex;"><span> bar <span style="color:#f92672">+=</span> <span style="color:#e6db74">" Joe!"</span>;
</span></span><span style="display:flex;"><span> vs.push_back(bar);
</span></span><span style="display:flex;"><span>} <span style="color:#75715e">// bar goes out of scope, vs lives on
</span></span></span></code></pre></div><p>So, to add this string to the vector, we must first make an
allocation corresponding to the object contained in the variable
<code>bar</code>, and then must make a new allocation for the object that lives in
<code>vs</code>, and then copy all the data.</p>
<p>Then, when <code>bar</code> goes out of scope, its destructor is called, as is
done automatically whenever an object with a destructor goes out
of scope. This allows <code>std::string</code> to free its heap allocation.</p>
<p>Which means we copied an allocation into a new heap allocation, just to free
the original allocation. Copying an allocation and freeing the old one
is equivalent to just re-using the old allocation, just slower. Wouldn’t
it make more sense to make the string in the vector just refer to the
same heap allocation that <code>bar</code> formerly did?</p>
<p>Such an operation is referred to as a “move,” and the original C++ –
pre C++11 – didn’t support them. This was possibly because they didn’t
make sense for <code>int</code>s, and so they were not added for objects that were
trying to act like <code>int</code>s – but on the other hand, destructors were
supported and <code>int</code>s don’t need to be destructed.</p>
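<p>To make “refer to the same heap allocation” concrete, here is a sketch in Rust, where pushing a <code>String</code> onto a vector moves it. The function name is made up; it simply checks that the string inside the vector points at the very same heap buffer the local variable used.</p>

```rust
/// Returns true if the string stored in the vector reuses the heap
/// buffer that the local variable `bar` owned before the push.
fn push_reuses_allocation() -> bool {
    let mut vs: Vec<String> = Vec::new();

    let mut bar = String::from("Hi!");
    bar += " Joe!";
    let heap_ptr = bar.as_ptr(); // address of bar's heap buffer

    vs.push(bar); // a move: only the small handle is transferred

    vs[0].as_ptr() == heap_ptr
}
```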
<p>In any case, moves were not supported. And so, objects that managed
resources – in this case, a heap allocation, but other resources could
apply as well – could not be put onto vectors or stored in collections
directly without a copy and delete of whatever resource was being managed.</p>
<p>Now, there were ways to handle this in pre-C++11 days. You could add
an indirection: make a heap allocation to contain the <code>std::string</code>
object itself, which is only a small object with a pointer to another
allocation. That would at least let you pass around a <code>std::string *</code>,
a raw pointer that does not trigger all these copies, because it bypasses
the automatic management of the heap allocation behind this façade of
value semantics. Or you could manually manage a C-style string with <code>char *</code>.</p>
<p>But the most ergonomic, clear <code>std::vector<std::string></code> could not be
used without performance degradation. Worse, if the vector ever needed
to be resized, and had to itself switch to a different allocation, it
would have to copy all those <code>std::string</code> objects internally and
delete the originals, N useless reallocations.</p>
<p>As a demonstration of this, I wrote a <a href="https://www.thecodedmessage.com/string_example/string.cpp">sample
program</a> with a vastly simplified
version of <code>std::string</code>, that tracks how many allocations it makes.
It allows C++11-style moves to be enabled or disabled, and then it
takes all the command line arguments, creates <code>string</code> objects out
of them, and puts them in a vector. For 8 command line arguments,
the version with move made, as you might expect, 8 allocations,
whereas the version without the move, that just put these strings
into a vector, made 23. Each time a string was added to a vector,
a spurious allocation was made, and then N spurious allocations
had to be made each time the vector doubled.</p>
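<p>The arithmetic behind those numbers can be sketched as a small model. This is not the sample program itself: it assumes a vector that starts empty and doubles its capacity (1, 2, 4, 8, ...), and it counts only the strings’ heap allocations, not the vector’s own buffer.</p>

```rust
/// Models string heap allocations when every transfer is a deep copy.
fn allocations_without_moves(n: usize) -> usize {
    let mut total = 0;
    let mut len = 0;
    let mut cap = 0;
    for _ in 0..n {
        total += 1; // constructing the string from the argument
        if len == cap {
            // The vector grows: each existing string is copied into the
            // new buffer, costing one allocation per element.
            cap = if cap == 0 { 1 } else { cap * 2 };
            total += len;
        }
        total += 1; // copying the string into the vector
        len += 1;
    }
    total
}

/// With moves, the only allocation is the one made at construction.
fn allocations_with_moves(n: usize) -> usize {
    n
}
```

<p>For 8 arguments this model gives 8 + 8 + (1 + 2 + 4) = 23 allocations without moves, against 8 with them.</p>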
<p>This problem is purely an artifact of the limitations of the tools
provided by C++ to encapsulate and automatically manage memory, RAII and
“value semantics.”</p>
<p>Consider this snippet of code:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span><span style="color:#75715e">// Pre-C++11, without moves
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>std<span style="color:#f92672">::</span>vector<span style="color:#f92672"><</span>std<span style="color:#f92672">::</span>string<span style="color:#f92672">></span> vec;
</span></span><span style="display:flex;"><span>{ <span style="color:#75715e">// This might take place inside another function
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#75715e">// Using local block scope for simplicity
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> std<span style="color:#f92672">::</span>string foo <span style="color:#f92672">=</span> <span style="color:#e6db74">"Hi!"</span>;
</span></span><span style="display:flex;"><span> vec.push_back(foo);
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span> std<span style="color:#f92672">::</span>string bar <span style="color:#f92672">=</span> <span style="color:#e6db74">"Hello!"</span>;
</span></span><span style="display:flex;"><span> vec.push_back(bar);
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span><span style="color:#75715e">// Use the vector
</span></span></span></code></pre></div><p>If we didn’t use this <code>string</code> class, we would not
have made a copy just to free the original allocation. We would
have simply put the pointer into the vector. We would then have been
responsible for freeing all the allocations – once – when we’re done:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span><span style="color:#75715e">// Manually written equivalent
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>std<span style="color:#f92672">::</span>vector<span style="color:#f92672"><</span><span style="color:#66d9ef">char</span> <span style="color:#f92672">*></span> vec;
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// strdup, a POSIX call, makes a new allocation and copies a
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#75715e">// string into it, here used to turn a static string into one
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#75715e">// on the heap. We will assume we have a reason to store it
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#75715e">// on the heap -- perhaps we did more manipulation in the
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#75715e">// real application to generate the string.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// The allocation is necessary to be the direct equivalent of
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#75715e">// `vec.push_back("Hi")` or even `vec.emplace_back("Hi")` for
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#75715e">// a `std::vector<std::string>`, because that data structure has
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#75715e">// the invariant that all strings in the vector must have their
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#75715e">// own heap allocation (assuming no small string optimization,
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#75715e">// which many strings are ineligible for).
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">char</span> <span style="color:#f92672">*</span>foo <span style="color:#f92672">=</span> strdup(<span style="color:#e6db74">"Hi!"</span>);
</span></span><span style="display:flex;"><span> vec.push_back(foo);
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">char</span> <span style="color:#f92672">*</span>bar <span style="color:#f92672">=</span> strdup(<span style="color:#e6db74">"Hello!"</span>);
</span></span><span style="display:flex;"><span> vec.push_back(bar);
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// Use the vector
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// Then, later, when we are done with the vector, free all the elements once
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">for</span> (<span style="color:#66d9ef">char</span> <span style="color:#f92672">*</span>c: vec) {
</span></span><span style="display:flex;"><span> free(c);
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>The copy version of the C++ code instead does – after de-sugaring the
RAII and value semantics and inlining – something that no programmer
would ever write manually, something equivalent to this:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span><span style="color:#75715e">// Desugaring of pre-C++11 version of code
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>std<span style="color:#f92672">::</span>vector<span style="color:#f92672"><</span><span style="color:#66d9ef">char</span> <span style="color:#f92672">*></span> vec;
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">char</span> <span style="color:#f92672">*</span>foo <span style="color:#f92672">=</span> strdup(<span style="color:#e6db74">"Hi!"</span>);
</span></span><span style="display:flex;"><span> vec.push_back(strdup(foo)); <span style="color:#75715e">// Why the additional allocate-and-copy?
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> free(foo); <span style="color:#75715e">// Because the destructor of foo will free the original
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>}
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">char</span> <span style="color:#f92672">*</span>bar <span style="color:#f92672">=</span> strdup(<span style="color:#e6db74">"Hello!"</span>);
</span></span><span style="display:flex;"><span> vec.push_back(strdup(bar));
</span></span><span style="display:flex;"><span> free(bar);
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#75715e">// Use the vec
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span><span style="color:#66d9ef">for</span> (<span style="color:#66d9ef">char</span> <span style="color:#f92672">*</span>c: vec) {
</span></span><span style="display:flex;"><span> free(c);
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>C++ without move semantics fails to reach its goal of zero-cost abstraction.
The version with the abstraction, with the value semantics, compiles to code
less efficient than any code someone would write manually, because what
we really want is to make the allocation once, while the string is the local
variable <code>foo</code>, reuse that same allocation in the vector, and then free it only
when the vector is done with it.</p>
<p>The abstractions of only supporting “copy” and “destruct” mean that the
destructor of the variable <code>foo</code> must be called when <code>foo</code> goes out of
scope. This means that the “copy” operation must make an independent
allocation, as it cannot control when the original goes out of scope,
or will be replaced with another value. If we had instead re-used the
same allocation, it would be freed by <code>foo</code>’s destructor while the vector still pointed to it.</p>
<p>But copying just to destroy the original is silly – silly and
ill-performant. What any programmer would naturally write in that
situation results in a “move”. So this gap – and it was a huge gap –
in C++ value semantics was filled in C++11 when they added a “move”
operation.</p>
<p>Because of this addition, using objects with value semantics to manage
resources became practical. It also became possible to use objects with
value semantics for resources that could not meaningfully be copied,
like unique ownership of an object or a thread handle, while still
being able to get the advantages of putting such objects in collections
and, well, moving them. Shops that previously had to work around value
semantics for performance reasons could now use them directly.</p>
<p>It is not, therefore, surprising that this was for many the most exciting
change in C++11.</p>
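<p>To make the difference concrete, here is a minimal sketch that counts heap
allocations when a string is pushed onto a vector by copy and by move. The names
<code>CountedString</code>, <code>g_allocations</code>, <code>allocations_when_copying</code>, and
<code>allocations_when_moving</code> are hypothetical, invented for this illustration; the
class is only a stand-in for the <code>string</code> class from my example program, not
a reproduction of it.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical counter, present only so the sketch can observe allocations.
static int g_allocations = 0;

// Hypothetical stand-in for a heap-backed string with value semantics.
class CountedString {
public:
    explicit CountedString(const char *s)
        : m_len(std::strlen(s)), m_str(new char[m_len + 1]) {
        std::memcpy(m_str, s, m_len + 1);
        ++g_allocations;
    }
    // Copy: makes a second, independent allocation.
    CountedString(const CountedString &other)
        : m_len(other.m_len), m_str(new char[m_len + 1]) {
        std::memcpy(m_str, other.m_str, m_len + 1);
        ++g_allocations;
    }
    // Move: takes over the existing allocation and nulls out the source.
    CountedString(CountedString &&other) noexcept
        : m_len(other.m_len), m_str(other.m_str) {
        other.m_str = nullptr;
        other.m_len = 0;
    }
    CountedString &operator=(const CountedString &) = delete;
    ~CountedString() { delete[] m_str; } // delete[] on null is a no-op

private:
    std::size_t m_len;
    char *m_str;
};

// Counts the allocations made when the local is copied into the vector.
int allocations_when_copying() {
    g_allocations = 0;
    std::vector<CountedString> vec;
    vec.reserve(1);
    CountedString foo("Hi!");
    vec.push_back(foo); // copy constructor: allocates again
    return g_allocations;
}

// Counts the allocations made when the local is moved into the vector.
int allocations_when_moving() {
    g_allocations = 0;
    std::vector<CountedString> vec;
    vec.reserve(1);
    CountedString foo("Hi!");
    vec.push_back(std::move(foo)); // move constructor: reuses the allocation
    return g_allocations;
}
```

<p>Under these assumptions, the copying version allocates twice and the moving
version allocates once, which matches the hand-written and desugared C-style
versions shown earlier.</p>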
<h2 id="how-move-is-implemented-in-c">How Move Is Implemented in C++</h2>
<p>But for now, let’s put ourselves in the place of the language designers
who designed this new move operation. What should this move operation
look like? How could we integrate it into the rest of C++?</p>
<p>Ideally, we would want it to output – after inlining – exactly the
code that we would expect to write manually. When <code>foo</code> is moved into the
vector, the original allocation must not be freed. Instead, it is only freed
when the vector itself is freed. This is an absolute necessity to solve
the problem as we must remove a free in order to remove the allocation,
but we also cannot leak memory. If there is to be exactly one allocation,
there must be exactly one deallocation.</p>
<p>Calls to <code>free</code> (or <code>delete[]</code> in my example program) are made in the
destructor, so the most straightforward approach is to say
that the destructor should only be called when the vector is destroyed,
but not when <code>foo</code> goes out of scope. If <code>foo</code> is moved onto the vector,
then the compiler should take note that it has been moved from, and simply
not call the destructor. The move should be treated as having already
destroyed the object, as an operation that accomplishes both initialization
of the new object (the string on the vector) from the original object and the
destruction of the original object.</p>
<p>This notion is called “destructive move,” and it is how moves are done
in Rust, but it is not what C++ opted for. In Rust, the compiler would
simply not output a destructor call (a “drop” in Rust) for <code>foo</code> because
it has been moved from. But, in fact, the C++ compiler still does. In
destructive move semantics, the compiler would not allow <code>foo</code> to be
read from after the move, but in fact, the C++ compiler still does,
not just for the destructor, but for any operation.</p>
<p>So how is the deallocation avoided, if the compiler doesn’t remove it
in this situation? Well, there is a decision to make here. If an object
has been moved from, no deallocation should be performed. If it has not,
a deallocation should be performed. Rust makes this decision at compile-time
(with rare exceptions where it has to add a “drop flag”),
but C++ makes it at run-time.</p>
<p>When you write the code that defines what it means to move from
an object in C++, you must make sure the original object is in a
run-time state where the destructor will still be called on it, and
will still succeed. And, since we established already that we must save
a deallocation by moving, that means that the destructor must make a
run-time decision as to whether to deallocate or not.</p>
<p>The more C-style post-inlining code for our example would then look
something like this:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span>std<span style="color:#f92672">::</span>vector<span style="color:#f92672"><</span><span style="color:#66d9ef">char</span> <span style="color:#f92672">*></span> vec;
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">char</span> <span style="color:#f92672">*</span>foo <span style="color:#f92672">=</span> strdup(<span style="color:#e6db74">"Hi!"</span>);
</span></span><span style="display:flex;"><span> vec.push_back(foo);
</span></span><span style="display:flex;"><span> foo <span style="color:#f92672">=</span> <span style="color:#66d9ef">nullptr</span>;
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> (foo <span style="color:#f92672">!=</span> <span style="color:#66d9ef">nullptr</span>) {
</span></span><span style="display:flex;"><span> free(foo);
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span><span style="display:flex;"><span>{
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">char</span> <span style="color:#f92672">*</span>bar <span style="color:#f92672">=</span> strdup(<span style="color:#e6db74">"Hello!"</span>);
</span></span><span style="display:flex;"><span> vec.push_back(bar);
</span></span><span style="display:flex;"><span> bar <span style="color:#f92672">=</span> <span style="color:#66d9ef">nullptr</span>;
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> (bar <span style="color:#f92672">!=</span> <span style="color:#66d9ef">nullptr</span>) {
</span></span><span style="display:flex;"><span> free(bar);
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span>}
</span></span></code></pre></div><p>This null check is hidden by the fact that in C++, <code>free</code> and <code>delete</code>
and friends are defined to be no-ops on null, but it still exists.
And while the check might be very cheap compared to the cost of calling
<code>free</code>, it might not be cheap when things are moved in a tight loop,
where <code>free</code> is never actually called. That is to say, this
run-time check is not cheap compared to the cost of <em>not</em> calling <code>free</code>.</p>
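<p>The same thing can be observed at the class level. The following is a rough
sketch, not my example program: <code>Handle</code>, <code>Counters</code>, <code>copy_to_heap</code>, and
<code>move_once</code> are hypothetical names made up for this illustration. It counts
how many times the destructor runs and how many times it actually frees
something after a single move.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <utility>

// Hypothetical bookkeeping, used only to observe what the destructor does.
struct Counters {
    int destructor_calls = 0;
    int frees = 0;
};

// Portable stand-in for POSIX strdup.
static char *copy_to_heap(const char *s) {
    std::size_t n = std::strlen(s) + 1;
    char *p = static_cast<char *>(std::malloc(n));
    if (p == nullptr) std::abort();
    std::memcpy(p, s, n);
    return p;
}

// Hypothetical move-only owner of one heap allocation.
class Handle {
public:
    Handle(const char *s, Counters &c)
        : m_str(copy_to_heap(s)), m_counters(&c) {}
    Handle(Handle &&other) noexcept
        : m_str(other.m_str), m_counters(other.m_counters) {
        other.m_str = nullptr; // leave the source in a destructible state
    }
    Handle(const Handle &) = delete;
    Handle &operator=(const Handle &) = delete;
    ~Handle() {
        ++m_counters->destructor_calls;
        if (m_str != nullptr) { // the run-time decision
            std::free(m_str);
            ++m_counters->frees;
        }
    }
    bool holds_allocation() const { return m_str != nullptr; }

private:
    char *m_str;
    Counters *m_counters;
};

// Moves one Handle into another and reports what the destructors did.
Counters move_once() {
    Counters counters;
    {
        Handle foo("Hi!", counters);
        Handle moved(std::move(foo));
        // foo is still a live object; the language lets us keep using it.
        assert(!foo.holds_allocation());
        assert(moved.holds_allocation());
    } // both destructors run here
    return counters;
}
```

<p>In this sketch the destructor runs twice but frees only once. The second
destructor call is the one a destructive move would never have emitted.</p>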
<p>So, given the semantics of move in C++, it results in code that is not
the same as – and not as performant as – the equivalent hand-written
C-style code, and therefore it is not a zero-cost abstraction, and
doesn’t live up to the goals of C++.</p>
<p>Now, it looks like the optimizer should be able to clean up an assignment
to null immediately followed by a check for null, but not all examples are as simple as this
one, and, like in many situations where the abstraction relies on the
optimizer, the optimizer doesn’t always get it.</p>
<h2 id="arguing-semantics">Arguing Semantics</h2>
<p>But that performance hit is small, and it is usually possible to optimize
out. If that were the only problem with C++ move semantics, I might find
it annoying, but ultimately I’d say, as I would about many things in both
C++ and Rust, something like: Well, this decision was made, remember to
profile, and if you absolutely have to make sure the optimizer got it
in a particular instance, check the assembly by hand.</p>
<p>But there are a few further consequences of that decision.</p>
<p>First off, the resource might not be a memory allocation, and null pointers
might not be an appropriate way to indicate that that resource doesn’t
exist. This responsibility of having some run-time indication of what
resources need to be freed – rather than a one-to-one correspondence
between objects and resources – is left up to the implementors of classes.
For heap allocations, it is made relatively easy, but the implementor
of the class is still responsible for re-setting the original object.
In my example, the move constructor reads:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span>string(string <span style="color:#f92672">&&</span>other) <span style="color:#66d9ef">noexcept</span> {
</span></span><span style="display:flex;"><span> m_len <span style="color:#f92672">=</span> other.m_len;
</span></span><span style="display:flex;"><span> m_str <span style="color:#f92672">=</span> other.m_str;
</span></span><span style="display:flex;"><span> other.m_str <span style="color:#f92672">=</span> <span style="color:#66d9ef">nullptr</span>; <span style="color:#75715e">// Don't forget to do this
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>}
</span></span></code></pre></div><p>The move constructor has two responsibilities, where a destructive
version would only have one: It must set up state for the new object,
and it must set up a valid “moved from” state for the old object.
That second obligation is a direct consequence of non-destructive moves,
and provides the programmer with another chance to mess something up.</p>
<p>In fact, since destructive moves can almost always be implemented by
just copying the memory (and leaving the original memory as garbage
data as the destructor will not be called on it), a default move
constructor would correctly cover the vast majority of implementations,
creating even fewer opportunities to introduce bugs.</p>
<p>But in C++, the moved-from state also has obligations. The destructor
has to know at run-time not to reclaim any resources if the object no
longer has any, but in general, there is no rule that moved-from objects
must immediately be destroyed. The programming language has explicitly
decided not to enforce such a rule, and so, to be properly safe, moved-from
objects must be considered – and must be – valid values for those objects.</p>
<p>This means that any object that manages a resource now must manage either
1 or 0 copies of that resource. Collections are easy – moved-from collections
can be made equivalent to the “empty” collection that has no elements. For
things like thread handles or file handles, this means that you can have
a file handle with no corresponding file. Optionality is imported into all
“value types.”</p>
<p>So, smart pointer types that manage single-ownership heap allocations, or
any sort of transferrable ownership of heap allocations, now of necessity
must be nullable. Nullable pointers are a serious cause of errors, as
often they are used with the implicit contract that they will not be null,
but that contract is not actually represented in the type. Every time a
nullable pointer is passed around, you have a potential miscommunication
of whether <code>nullptr</code> is a valid value, one that will cause some sort
of error condition, or one that may lead to undefined behavior.</p>
<p>C++ move semantics of necessity perpetuate this confusion. Non-nullable
smart pointers are unimplementable in C++, at least if you want them to be
movable as well.</p>
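<p>A small sketch shows the forced null state with the standard smart pointer.
The function name <code>source_is_null_after_move</code> is hypothetical; the behavior it
checks, that a moved-from <code>std::unique_ptr</code> compares equal to <code>nullptr</code>, is
what the standard library specifies for this type.</p>

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Returns true if the source pointer is null after being moved from.
bool source_is_null_after_move() {
    std::unique_ptr<int> original(new int(42));
    std::unique_ptr<int> stolen = std::move(original);

    // The new owner holds the allocation.
    assert(stolen != nullptr);
    assert(*stolen == 42);

    // The old owner is still a valid object, but it owns nothing, so any
    // code that dereferences it without checking has undefined behavior.
    return original == nullptr;
}
```

<p>Nothing in the type of <code>original</code> changes after the move, so any function that
receives it has no way of knowing from the signature alone whether it may be null.</p>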
<h2 id="move-complicatedly">Move, Complicatedly</h2>
<p>This leads me to Herb Sutter’s
<a href="https://herbsutter.com/2020/02/17/move-simply/">explanation of C++ move semantics</a> from his blog. I respect Herb Sutter greatly as someone explaining
C++, and his materials helped me learn C++ and teach it. An explanation
like this is really useful if programming in C++ is what you have to do.</p>
<p>However, I am instead investigating whether C++’s move semantics are
reasonable, especially in comparison to programming languages like Rust
which do have a destructive move. And from that point of view, I think
this blog post, and its necessity, serve as a good illustration of the
problems with C++’s move semantics.</p>
<p>I shall respond to specific excerpts from the post.</p>
<blockquote>
<p>C++ “move” semantics are simple, and unchanged since C++11. But they
are still widely misunderstood, sometimes because of unclear teaching
and sometimes because of a desire to view move as something else instead
of what it is.</p>
</blockquote>
<p>Given the definition he’s about to give of C++ move semantics, I
think this is unfair. The goal of move is clear: to allow resources to be
transferred when copying would force them to be duplicated. It is obvious
from the name. However, the semantics as the language defines them,
while enabling that goal, are defined without reference to that goal.</p>
<p>This is doomed to lead to confusion, no matter how good the teaching is.
And it is desirable to try to understand the semantics as they connect
to the goal of the feature.</p>
<p>To explain what I mean, see the definition he then gives for moving:</p>
<blockquote>
<p>In C++, copying or moving from an object <code>a</code> to an object <code>b</code> sets <code>b</code> to
<code>a</code>’s original value. The only difference is that copying from <code>a</code> won’t
change <code>a</code>, but moving from <code>a</code> might.</p>
</blockquote>
<p>This is a fair statement of C++’s move semantics as defined. But it has
a disconnect with the goals.</p>
<p>In this definition, we are discussing the assignment written as <code>b = a</code>
or as <code>b = std::move(a)</code>. The reason why moving might change <code>a</code>, as
we’ve discussed, is that <code>a</code> might contain a resource. Moving indicates
that we do not wish to copy resources that are expensive or impossible
to copy, and that in exchange for this ability, we give up the right to
expect that <code>a</code> retain its value.</p>
<p>This definition is the correct one to use for reasoning about C++
programs, but it is not directly connected to why you might want
to use the feature at all. It is natural that programmers would
want to be able to reason about a feature in a way that aligns with
its goals.</p>
<p>The goal of Sutter’s post is to obscure this goal, and to treat
move as if it were a pure optimization of copy, which will not
help a programmer understand why <code>a</code>’s value might change, or why
move-only types like <code>std::unique_ptr</code> exist.</p>
<p>The explanation of the goal of this operation is reserved in his post
for the section entitled “advanced notes for type implementors”.</p>
<p>Of course, almost all C++ programmers in a sufficiently large project
have to become “type implementors” to understand and maintain custom
types, if not to write fresh implementations of them, so I think
most professional programmers should be reading these notes, and
so I think it’s unfair to call them advanced. But beyond that,
this explanation is core to why the operation exists, and the only
explanation for why move-only types exist, which all C++ programmers
will have to use:</p>
<blockquote>
<p>For types that are move-only (not copyable), move is C++’s closest
current approximation to expressing an object that can be cheaply moved
around to different memory addresses, by making at least its value
cheap to move around.</p>
</blockquote>
<p>He follows up with an acknowledgement that destructive moves are
a theoretical possibility:</p>
<blockquote>
<p>(Other not-yet-standard proposals to go further
in this direction include ones with names like “relocatable” and
“destructive move,” but those aren’t standard yet so it’s
premature to talk about them.)</p>
</blockquote>
<p>For his purposes, this is extremely fair, but since my purposes are to
compare C++ to Rust and other programming languages which have destructive
moves, it is not premature for me to talk about them.</p>
<p>This gets more interesting in the Q&A.</p>
<blockquote>
<p>How can moving from an object not change its state?</p>
<p>For example, moving an int doesn’t change the source’s value because
an int is cheap to copy, so move just does the same thing as copy. Copy
is always a valid implementation of move if the type didn’t provide
anything more efficient.</p>
</blockquote>
<p>Indeed, for reasons of consistency and generic programming, move is
defined on all types that can be moved or copied, even types that don’t
implement move differently than copy.</p>
<p>What makes this confusing in C++, however, is that types that manage
resources might be written without an implementation of move. They might
pre-date the move feature, or their implementor might not have understood
move well enough to implement it, or there might be a technical reason
why moving couldn’t be implemented in a way that elides the resource
duplication. For these types, a move falls back on a copy, even if the
copy does significant work. This can be surprising to the programmer,
and surprises in programming are never good. More direly, there
is no warning when this happens, because the notion of resource management
is not referenced in the semantics.</p>
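<p>Here is a minimal sketch of that silent fallback. <code>LegacyBuffer</code> and
<code>copies_made_by_move</code> are hypothetical names for this illustration. The class
is written in a pre-C++11 style: because it declares its own copy constructor
and destructor, the compiler does not generate a move constructor for it.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical pre-C++11 style class with a copy constructor but no move
// constructor. The static counter exists only to observe the copies.
class LegacyBuffer {
public:
    explicit LegacyBuffer(std::size_t n) : m_data(n, 0) {}
    LegacyBuffer(const LegacyBuffer &other) : m_data(other.m_data) {
        ++copies;
    }
    ~LegacyBuffer() {}
    std::size_t size() const { return m_data.size(); }

    static int copies;

private:
    std::vector<int> m_data;
};

int LegacyBuffer::copies = 0;

// Returns how many copies a single "move" of a LegacyBuffer performs.
int copies_made_by_move() {
    LegacyBuffer::copies = 0;
    LegacyBuffer a(1000);
    // The rvalue binds to the const reference of the copy constructor,
    // so this line duplicates all 1000 elements.
    LegacyBuffer b(std::move(a));
    assert(a.size() == 1000); // the source is untouched
    assert(b.size() == 1000);
    return LegacyBuffer::copies;
}
```

<p>The call to <code>std::move</code> compiles cleanly and reads like a cheap transfer, but
in this sketch it performs one full copy.</p>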
<p>In Rust, a move is always implemented by copying the data in the object
itself and then not destructing the original object, and never by copying
resources managed by the object, or running any custom code.</p>
<blockquote>
<p>But what about the “moved-from” state, isn’t it special somehow?</p>
<p>No. The state of a after it has been moved from is the same as the
state of a after any other non-const operation. Move is just another
non-const function that might (or might not) change the value of the
source object.</p>
</blockquote>
<p>I disagree in practice. For objects that use move as intended, to avoid
copying resources, move will (at least usually) drain its resource. This
means that an object that often manages a resource will enter a state in
which it is not managing a resource. That state is special, because it is
the state when a resource-managing object is doing something other than
its normal job, and is not managing a resource. This is not a “special
state” by any rigorous definition, but is guaranteed to be intuitively
special by virtue of being resource-free. (It is also a special state
in that the value is unspecified in general, whereas most of the time,
the value is specified.)</p>
<p>Collections can, as I said before, get away with becoming the
empty collection in this scenario, but even for those, the empty
state is special: It is the only state that can be represented without
holding a resource. And many other types of objects cannot even
do this. <code>std::unique_ptr</code>’s moved-from state is the null pointer,
and without these move semantics, it would be possible to design a
<code>std::unique_ptr</code> that did not have a null state.</p>
<p>Once <code>std::unique_ptr</code> is forced to be allowed to have null values, it
makes sense that there be other ways to create a null <code>std::unique_ptr</code>,
e.g. by default-constructing it. But it is the design of move semantics
that forces it to have a null value in the first place.</p>
<p>Put another way: <code>std::unique_ptr</code> and thread handles are therefore
collections of 0 or 1 heap allocation handles or thread handles, and
once defined that way, the “empty” state is not special, but it is move
semantics that force them to be defined that way.</p>
<blockquote>
<p>Does “but unspecified” mean the object’s invariants might not hold?</p>
<p>No. In C++, an object is valid (meets its invariants) for its entire
lifetime, which is from the end of its construction to the start of its
destruction…. Moving from an object does not end its
lifetime, only destruction does, so moving from an object does not make
it invalid or not obey its invariants.</p>
</blockquote>
<p>This is true, as discussed above. The moved-from object must be able to
be destructed, and there is nothing stopping a programmer from instead
doing something else with it. Given that, it must be in some state that
its operations can reckon with. But that state is not necessarily
one that would be valid if move semantics didn’t force its inclusion,
and so again, we are close to the problem.</p>
<blockquote>
<p>Does “but unspecified” mean the only safe operation on a moved-from
object is to call its destructor?</p>
<p>No.</p>
<p>Does “but unspecified” mean the only safe operation on a moved-from
object is to call its destructor or to assign it a new value?</p>
<p>No.</p>
<p>Does “but unspecified” sound scary or confusing to average
programmers?</p>
<p>It shouldn’t, it’s just a reminder that the value might have changed,
that’s all. It isn’t intended to make “moved-from” seem mysterious
(it’s not).</p>
</blockquote>
<p>I disagree firmly with the answer to the last question. “Unspecified”
values are extremely scary, especially to programmers on team projects,
because it means that the behavior of the program is subject to arbitrary
change, but that change will not be considered breaking.</p>
<p>For example, <code>std::string</code> does not make any promises about the contents
of a moved-from string. However, a programmer – even a senior programmer
– may, instead of consulting the documentation, write a test program
to find out what the value is of a moved-from string. Seeing an empty
string, the programmer might write a
<a href="https://www.thecodedmessage.com/string_example/split.cpp">program</a> that relies on the string
being empty:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-c++" data-lang="c++"><span style="display:flex;"><span>std<span style="color:#f92672">::</span>vector<span style="color:#f92672"><</span>std<span style="color:#f92672">::</span>string<span style="color:#f92672">></span>
</span></span><span style="display:flex;"><span>split_into_chunks(<span style="color:#66d9ef">const</span> std<span style="color:#f92672">::</span>string <span style="color:#f92672">&</span>in) {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">int</span> count <span style="color:#f92672">=</span> <span style="color:#ae81ff">0</span>;
</span></span><span style="display:flex;"><span> std<span style="color:#f92672">::</span>vector<span style="color:#f92672"><</span>std<span style="color:#f92672">::</span>string<span style="color:#f92672">></span> res;
</span></span><span style="display:flex;"><span> std<span style="color:#f92672">::</span>string acc;
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">for</span> (<span style="color:#66d9ef">char</span> c: in) {
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> (count <span style="color:#f92672">==</span> <span style="color:#ae81ff">4</span>) {
</span></span><span style="display:flex;"><span> res.push_back(std<span style="color:#f92672">::</span>move(acc));
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// Don't need to clear string.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> <span style="color:#75715e">// I checked and it's empty.
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span> count <span style="color:#f92672">=</span> <span style="color:#ae81ff">0</span>;
</span></span><span style="display:flex;"><span> }
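</span></span><span style="display:flex;"><span> acc <span style="color:#f92672">+=</span> c;
</span></span><span style="display:flex;"><span> <span style="color:#f92672">++</span>count;
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">if</span> (<span style="color:#f92672">!</span>acc.empty()) {
</span></span><span style="display:flex;"><span> res.push_back(std<span style="color:#f92672">::</span>move(acc));
</span></span><span style="display:flex;"><span> }
</span></span><span style="display:flex;"><span> <span style="color:#66d9ef">return</span> res;
</span></span><span style="display:flex;"><span>}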
</span></span></code></pre></div><p>Of course, you should not do that. A later version of <code>std::string</code>
might implement the small string optimization, where strings of
below a certain size are not stored in an expensive-to-copy heap
resource, but in the actual object itself. In that situation, it would
be reasonable to implement move as a copy, which is allowed, and then
this program would no longer do the same thing.</p>
<p>But this is a surprise. This is a result of the “unspecified value.”
And so while it may, strictly speaking, be “safe” to do things
with a moved-from object other than destruct them or assign to them,
in practice, without documentation to the contrary making stronger
guarantees, the only way to get “not surprising” behavior is to
greatly limit what you do with moved-from objects.</p>
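<p>For contrast, here is a sketch of the same idea written so that it does not
depend on the moved-from value. The name <code>split_into_chunks_safely</code> is
hypothetical, and the chunk size of 4 is carried over from the example above.
The only operation performed on the moved-from string is <code>clear()</code>, which has
no preconditions and leaves the string in a known state.</p>

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Splits the input into chunks of at most 4 characters.
std::vector<std::string> split_into_chunks_safely(const std::string &in) {
    std::vector<std::string> res;
    std::string acc;
    for (char c : in) {
        if (acc.size() == 4) {
            res.push_back(std::move(acc));
            // The moved-from value is unspecified, so reset it explicitly
            // instead of assuming it is empty.
            acc.clear();
        }
        acc += c;
    }
    // Keep the final, possibly shorter, chunk.
    if (!acc.empty()) {
        res.push_back(std::move(acc));
    }
    return res;
}
```

<p>With this version, an implementation of <code>std::string</code> that moves small strings
by copying them would not change the result.</p>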
<blockquote>
<p>What about objects that aren’t safe to be used normally after being moved
from?</p>
<p>They are buggy….</p>
</blockquote>
<p>By this definition, <code>std::unique_ptr</code> should likely be considered buggy,
as null pointers cannot be used “normally”. The same goes for a <code>std::thread</code>
object that does not hold a thread handle. It is only by stretching
the definition of “used normally” to include these special “empty values”
that <code>std::unique_ptr</code> gets to claim to not be buggy under that definition,
although a null pointer simply cannot be used the way a normal pointer
can.</p>
<p>Again, this attitude, that a null pointer is a normal pointer, that an
empty thread handle is a normal type of thread handle, is adaptive for
programming in C++. But it will inevitably exist in a programmer’s blind
spot, as null pointers always have. The “not null” invariant is often
expressed implicitly. Many uses of <code>std::unique_ptr</code> rely on
it never being null, and simply leave this up to the programmer
to ensure.</p>
<p>Herb Sutter himself discusses this:</p>
<blockquote>
<p>Since the problem is that we are not expressing the “not null”
invariant, we should express that by construction — one way is
to make the pointer member a <code>gsl::not_null<></code> (see for example the
Microsoft GSL implementation) which is copyable but not movable or
default-constructible.</p>
</blockquote>
<p>In a programming language with destructive moves, it would be possible
to have a smart pointer that was both “non-null” and movable. If we
need both movability and the ability to express this invariant in the
type system, well, C++ cannot help us.</p>
<blockquote>
<p>But what about a third option, that the class intends (and documents) that
you just shouldn’t call operator< on a moved-from object… that’s a
hard-to-use class, but that doesn’t necessarily make it a buggy class,
does it?</p>
<p>Yes, in my view it does make it a buggy class that shouldn’t pass
code review.</p>
</blockquote>
<p>But in a sense, this is exactly what <code>std::unique_ptr</code> is. It has a
special state where you cannot call its most important operator, the
dereference operator. It only avoids being called buggy because it
expands this state so it can be arrived at by other means.</p>
<p>Again, everything Herb Sutter says is true in a strict sense. It is
memory-safe to use moved-from objects other than to destroy or
assign to them, even if the move operation makes no further guarantees.
It simply isn’t safe in a broader sense, in that it will have surprising,
changeable behavior. It is true that the null pointer is a valid
value of <code>std::unique_ptr</code>, but smart pointers that implement move
are forced to have such a value.</p>
<p>And therefore, it should not be surprising that these questions come
up. The misconceptions that Herb Sutter is addressing are an unfortunate
consequence of the dissonance between the strict semantics of
the programming language, where his statements are true, and the
practical implications of how these features are used and are
intended to be used, where the situation is more complicated.</p>
<h2 id="moves-in-rust">Moves in Rust</h2>
<p>So the natural follow-up question is, how does Rust handle move semantics?</p>
<p>First off, as mentioned before, Rust makes a special case for types
that do not need move semantics, where the value itself contains all
the information necessary to represent it, where no heap allocations
or resources are managed by the value, types like <code>i32</code>. These types
implement the special <code>Copy</code> trait, because for these types, copying
is cheap, and is the default way to pass to functions or to handle
assignments:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">foo</span>(bar: <span style="color:#66d9ef">i32</span>) {
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// Implementation
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">let</span> var: <span style="color:#66d9ef">i32</span> <span style="color:#f92672">=</span> <span style="color:#ae81ff">3</span>;
</span></span><span style="display:flex;"><span>foo(var); <span style="color:#75715e">// copy
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>foo(var); <span style="color:#75715e">// copy
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>foo(var); <span style="color:#75715e">// copy
</span></span></span></code></pre></div><p>For types that are not <code>Copy</code>, such as <code>String</code>, the default function
call uses move semantics. In Rust, when a variable is moved from, that
variable’s lifetime ends early. The move replaces the destructor call
at the end of the block, and this is decided at compile time, which means it’s a
compile-time error to write the equivalent code for <code>String</code>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">foo</span>(bar: String) {
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// Implementation
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">let</span> var: String <span style="color:#f92672">=</span> <span style="color:#e6db74">"Hi"</span>.to_string();
</span></span><span style="display:flex;"><span>foo(var); <span style="color:#75715e">// Move
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>foo(var); <span style="color:#75715e">// Compile-Time Error
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>foo(var); <span style="color:#75715e">// Compile-Time Error
</span></span></span></code></pre></div><p><code>Copy</code> is a trait, but more entwined with the compiler than most traits.
Unlike most traits, it has no methods, so you can’t give it custom behavior by hand: a type can only
opt in (typically with <code>#[derive(Clone, Copy)]</code>) if all of its fields are themselves <code>Copy</code>. Types like <code>Box</code>, which manage
a heap allocation, do not implement <code>Copy</code>, and therefore structs that
contain a <code>Box</code> cannot either.</p>
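<p>A small sketch of that rule, using two made-up types: <code>Point</code>
contains only <code>Copy</code> fields and so can opt in, while
<code>Node</code> holds a <code>Box</code> and has to settle for
<code>Clone</code>. The function names <code>sum</code> and
<code>unbox</code> are likewise just for illustration.</p>

```rust
// All fields are `Copy`, so the struct may opt in to `Copy` as well.
#[derive(Clone, Copy)]
struct Point {
    x: i32,
    y: i32,
}

// Contains a `Box`, so adding `Copy` to the derive list would be rejected
// by the compiler. `Clone` (an explicit deep copy) is still fine.
#[derive(Clone)]
struct Node {
    value: Box<i32>,
}

fn sum(p: Point) -> i32 {
    p.x + p.y
}

fn unbox(n: Node) -> i32 {
    *n.value
}

fn main() {
    let p = Point { x: 1, y: 2 };
    assert_eq!(sum(p), 3); // copy
    assert_eq!(sum(p), 3); // copy again; `p` is still usable

    let n = Node { value: Box::new(5) };
    assert_eq!(unbox(n.clone()), 5); // explicit deep copy, then move of the clone
    assert_eq!(unbox(n), 5); // move
    // unbox(n); // would not compile: use of moved value `n`
}
```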
<p>This is already an advantage to Rust. C++ pretends that all types are the
same, even though they require different usage patterns in practice. You
can pass a <code>std::string</code> by copy just like an <code>int</code>. Even if you have a
vector of vectors of strings, you can pass by copy and that’s usually the
default way to pass it – moves in many cases require explicit opt-in. For
<code>int</code> it’s a reasonable default, but for collection types it isn’t,
and in Rust the programming language is designed accordingly.</p>
<p>If you want a deep copy, you can always explicitly ask for it with
<code>.clone()</code>:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-rust" data-lang="rust"><span style="display:flex;"><span><span style="color:#66d9ef">fn</span> <span style="color:#a6e22e">foo</span>(bar: String) {
</span></span><span style="display:flex;"><span> <span style="color:#75715e">// Implementation
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>}
</span></span><span style="display:flex;"><span>
</span></span><span style="display:flex;"><span><span style="color:#66d9ef">let</span> var: String <span style="color:#f92672">=</span> <span style="color:#e6db74">"Hi"</span>.to_string();
</span></span><span style="display:flex;"><span>foo(var.clone()); <span style="color:#75715e">// Copy
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>foo(var.clone()); <span style="color:#75715e">// Copy
</span></span></span><span style="display:flex;"><span><span style="color:#75715e"></span>foo(var); <span style="color:#75715e">// Move
</span></span></span></code></pre></div><p>What this actually does is create a clone, or a deep copy, and then
move the clone, as <code>foo</code> takes its parameter by move, the default for
non-<code>Copy</code> types.</p>
<p>What does a move in Rust actually entail? C++ implements moves
with custom-written move constructors, which collections and other
resource-managing types have to implement in addition to implementing
copying (though automatic implementation is available if building out of
other movable types). Rust requires implementations for clone, but for
all moves, the implementation is the same: copy the memory in the value
itself, and don’t call the destructor on the original value. And in Rust,
all types are movable with this exact implementation – non-movable types
don’t exist (though non-movable values do). The bytes encode information
– such as a pointer – about the resource that the value is managing,
and they must accomplish that in the new location just as well as they
did in the old location.</p>
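<p>To make that concrete, here is a rough sketch with a made-up
<code>Resource</code> type that counts its destructor calls. Moving it into a
function copies only the bytes of the value itself, so the heap buffer stays
where it was and the destructor runs exactly once. The names
<code>Resource</code>, <code>buffer_addr_after_move</code> and
<code>move_demo</code> are assumptions for the example, not anything from the
standard library.</p>

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical resource type that counts how often its destructor runs.
struct Resource {
    buffer: Vec<u8>,
    drops: Rc<Cell<usize>>,
}

impl Drop for Resource {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Takes ownership (a move) and reports where the heap buffer lives.
fn buffer_addr_after_move(r: Resource) -> usize {
    r.buffer.as_ptr() as usize
    // `r` is dropped here, at the end of its new owner's scope
}

// Returns (buffer address before the move, address after, destructor calls).
fn move_demo() -> (usize, usize, usize) {
    let drops = Rc::new(Cell::new(0));
    let r = Resource { buffer: vec![1, 2, 3], drops: Rc::clone(&drops) };
    let before = r.buffer.as_ptr() as usize;
    // The move copies the Vec's pointer, length and capacity; the heap data is untouched.
    let after = buffer_addr_after_move(r);
    (before, after, drops.get())
}

fn main() {
    let (before, after, drops) = move_demo();
    assert_eq!(before, after); // same heap buffer, new owner
    assert_eq!(drops, 1); // no destructor call for the moved-from variable
}
```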
<p>C++ can’t do that, because in C++, the implementation of move has to
mark the moved-from value as no longer containing the resource. How this
marking works depends on the details of the type.</p>
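<p>Rust can imitate this C++-style “leave something valid behind” move, but
only when asked to. The sketch below uses <code>std::mem::take</code>, which
swaps in the type’s default value. It is an analogy for the C++ behavior, not a
description of how any C++ move constructor is implemented, and
<code>cpp_style_move</code> is a made-up name.</p>

```rust
use std::mem;

// Hypothetical helper imitating a C++-style move: the contents are handed
// over, and the source is left behind in a valid but empty state.
fn cpp_style_move(source: &mut String) -> String {
    mem::take(source) // replaces `*source` with `String::default()`, i.e. ""
}

fn main() {
    let mut name = String::from("Hi");
    let stolen = cpp_style_move(&mut name);
    assert_eq!(stolen, "Hi");
    // The "moved-from" value is still alive and will still be dropped later;
    // it is simply empty.
    assert!(name.is_empty());
}
```

<p>Here the “marking” is the empty string written back into
<code>name</code>, which is exactly the extra run-time work that a destructive
move gets to skip.</p>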
<p>But even if C++ implemented destructive moves, some sort of “move
constructor” or custom move implementation would still be required.
C++, unlike Rust, does not require that the bytes contained in an
object mean the same thing in any arbitrary location. The object could
contain a reference to itself, or to part of itself, that would be
invalidated by moving it. Or, there could be a data structure somewhere
with a reference to it, that would need to be updated. C++ would have
to give types an opportunity to address such things.</p>
<p>Safe Rust forbids these things. The lifetime of a value takes moves into
account; you can’t move from a value unless there are no references
to it. And in safe Rust, there is no way for the user to create a
self-referential value (though the compiler can in its implementation
of <code>async</code> – but only if the value is already “pinned,” which we will
discuss in a moment).</p>
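<p>A minimal sketch of the first rule, with made-up function names: while a
reference to a value is still in use, the value cannot be moved, and once the
borrow has ended the move is fine.</p>

```rust
fn consume(s: String) -> usize {
    s.len()
}

// Returns the length seen through the borrow and the length seen after the move.
fn borrow_then_move() -> (usize, usize) {
    let owned = String::from("hello");
    let borrowed_len = {
        let view: &str = &owned;
        // consume(owned); // would not compile here: `owned` is still borrowed by `view`
        view.len()
    };
    // The borrow has ended, so moving `owned` is allowed now.
    let moved_len = consume(owned);
    (borrowed_len, moved_len)
}

fn main() {
    assert_eq!(borrow_then_move(), (5, 5));
}
```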
<p>But even in unsafe Rust, such things violate the principle of move.
Moving is always safe, and unsafe Rust is always responsible for keeping
safe code safe. As a result, Rust has a mechanism called “pinning” that
indicates, in the type system, that a particular value will never move
again, which can be used to implement self-referential values and which
is used in <code>async</code>. The details are beyond the scope of this blog post,
but it does mean that Rust can avoid the issue of move semantics for
non-movable values without ruining the simplicity of its move semantics.</p>
<p>For these rare circumstances, the benefits of moving can be had
through indirection, by using a <code>Box</code> that points to a pinned value on
the heap. And there is nothing stopping such types from implementing a
custom function which effectively implements a custom move by consuming
the pinned value, and outputs a new value, which can then be pinned
in a different location. There is no need to muddy the built-in move
operation with such semantics.</p>
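<p>As a rough sketch of both ideas, assume a made-up <code>Tracker</code> type
that opts out of <code>Unpin</code> the way a self-referential type would.
Moving the pinned <code>Box</code> moves only the pointer, and a hypothetical
<code>relocate</code> function plays the role of the “custom move” by consuming
the old pinned value and pinning a new one elsewhere.</p>

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// Hypothetical type that opts out of `Unpin`, as a self-referential type would.
struct Tracker {
    value: u64,
    _pinned: PhantomPinned,
}

// Address of the pinned value on the heap.
fn pinned_addr(handle: &Pin<Box<Tracker>>) -> usize {
    &**handle as *const Tracker as usize
}

// Hypothetical "custom move": consume the old pinned value and build a
// new pinned value in a different location.
fn relocate(old: Pin<Box<Tracker>>) -> Pin<Box<Tracker>> {
    Box::pin(Tracker { value: old.value, _pinned: PhantomPinned })
}

fn main() {
    let pinned = Box::pin(Tracker { value: 42, _pinned: PhantomPinned });
    let before = pinned_addr(&pinned);

    // An ordinary move of the handle: only the pointer is copied,
    // the Tracker itself stays where it is.
    let handle = pinned;
    assert_eq!(pinned_addr(&handle), before);

    // The explicit, type-specific relocation produces a value at a new address.
    let relocated = relocate(handle);
    assert_eq!(relocated.value, 42);
    assert_ne!(pinned_addr(&relocated), before);
}
```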
<h2 id="practical-implications-for-c-programmers">Practical Implications for C++ Programmers</h2>
<p>So, obviously, in light of my blog series, I recommend using Rust
over C++. For Rust users, I hope this clarifies why the move semantics
are the way they are, and why the <code>Copy</code> trait exists and is so important.</p>
<p>But of course, not everyone has the choice of using Rust. There are a
lot of large, mature C++ codebases that are well-tested and not going
away anytime soon, and many programmers working on those codebases.
For these programmers, here is some advice for handling the footgun
that is C++ move semantics, based both on what we’ve discussed and on
a few gotchas that were out of the scope of this post:</p>
<ul>
<li>Learn the difference between rvalue, lvalue, and forwarding references.
Learn the rules for how passing by value works in modern C++. These
topics are out of the scope of this blog post, but they are core parts
of C++ move semantics and especially how overloading is handled in
situations where moves are possible. Scott Meyers’s <em>Effective Modern
C++</em> is an excellent resource.</li>
<li>Move constructors and move assignment operators should always be <code>noexcept</code>.
Otherwise, <code>std::vector</code> and many other library utilities will quietly
fall back to copying wherever a copy is available. There is no warning for this.</li>
<li>The only sane things to do with most moved-from objects are to
immediately destroy them or reset their values. Comment about this in your
code! If the class specifically defines that moved-from values are
empty or null, note that in a comment too, so that programmers don’t
get the impression that there are any guarantees about moved-from
values in general.</li>
</ul>
<h2 id="conclusion">Conclusion</h2>
<p>Move semantics are essential to the performance of modern C++. Without
them, much of its standard library would become much more difficult
to use. However, the specific design of moves in C++:</p>
<ul>
<li>is misaligned with the purpose of moving</li>
<li>fails to eliminate all run-time cost</li>
<li>surprises programmers, and</li>
<li>forces designers of types to implement an “empty-yet-valid” state</li>
</ul>
<p>Why, then, does C++ use such a definition? Well, C++ was not originally
designed with move semantics in mind. Proposals to add destructive move
do not interact well with the existing language semantics.
One <a href="https://www.foonathan.net/2017/09/destructive-move/">interesting blog post</a>
that I found even says, when following through on the consequences
of adding destructive move semantics:</p>
<blockquote>
<p>… if you try to statically detect such situations, you end up with Rust.</p>
</blockquote>
<p>C++ has so many unsafe features and so many existing mechanisms, that
this was deemed the most reasonable way to add move semantics to C++, harmful
as it is.</p>
<p>And perhaps this decision was unnecessary. Perhaps there was a way –
perhaps there still is a way – to add destructive moves to C++. But
for right now, non-destructive moves are the ones the maintainers of
C++ have decided on. And even if destructive moves were added, it’s unlikely
that they’d be as clean as the Rust version, and the existing non-destructive
moves would still have to be supported for backwards compatibility’s sake.</p>
<p>In any case, Rust has taken this opportunity to learn from existing
programming languages, and to solve the same problems in a cleaner,
more principled way. And so, for the move semantics as well as for the
syntax, I recommend Rust over C++.</p>
<p>And to be clear, this still has very little to do with the safety features
of Rust. A more C++-style language with no <code>unsafe</code> keyword and no safety
guarantees could have still gone the Rust way, or something similar to
it. Rust is not just a safer alternative to C++, but, as I continue to
argue, unsafe Rust is a better unsafe language than C++.</p>
Sayonara, C++, and hello to Rust! https://www.thecodedmessage.com/posts/hello-rust/ 2021-10-26T00:00:00+00:00<p>This past May, I started a new job working in Rust. I was somewhat
skeptical of Rust for a while, but it turns out, it really is all it’s
cracked up to be. As a long-time C++ programmer, and C++ instructor,
I am convinced that Rust is better than C++ in all of C++’s application
space, that for any new programming project where C++ would make sense
as the programming language, Rust would make more sense.</p>
<h2 id="what-rust-is-not-for">What Rust is not for</h2>
<p>Before going into more detail about why I think that, I’d like to throw
out a few caveats, so you know I’m a reasonable person and not just an
extremist fanboy.</p>
<p>Caveat the first: Note that I said this about <em>new</em> programming
projects. There are some people on the Internet who demand the re-writing
of all existing C and C++ projects in Rust, and while I think Rust is a
better language for new projects, and that many existing projects should
seriously consider integrating it, I realize, like a reasonable mature
programmer, that for most existing projects, a Rust re-write would be
a prohibitively expensive rabbit hole. In short, Rust is not so amazing
that it will protect you from second system syndrome or from the perils
of a complete rewrite. It is but a mortal programming language.</p>
<p>That said, new Rust versions of aging C and C++ projects are often
very worthwhile and exciting, like new versions of many aging projects
can be. It’s just not a magical exception to basic economics.</p>
<p>Caveat the second: Note also that I said “where C++ would make sense.”
Rust has a lot of enthusiastic fans, and so there are a lot of people
learning Rust expecting the magic they felt when they first learned their favorite
programming language. And what they find is a programming language that
requires lots of arcane rules, where everything seems rather tedious,
and where a lot of their favorite features don’t exist.</p>
<p>Rust is a systems programming language. It is not garbage collected,
meaning you do have to manually manage memory. While Rust makes it
much harder to do that egregiously wrong, it’s still a very hard
problem, and there are trade-offs that Rust – unlike GC’d languages
– refuses to make for you. Meanwhile, like C++, the emphasis is on
performance (or at least control over performance), whether latency
or throughput or memory footprint. Rust is trying to make sure that all
its organizational abstractions have no run-time cost, or, if they do,
to make sure it’s abundantly clear exactly what that cost is. If you
are a systems programmer, if you are used to C and C++ and to trying to
solve systems programming types of problems, Rust is magical, just like
when you learned your previous favorite programming language.</p>
<p>If you are not, Rust is overkill for your task at hand and you shouldn’t
be using it. I earnestly recommend Haskell.</p>
<p>For more clarity on what I mean by systems programming: If you write
Python or JavaScript or Ruby, then you’re running the code in a Python
interpreter, in Node or a web browser, in the Ruby interpreter, all on
top of an operating system with an operating system kernel. Rust doesn’t
replace those tools. Rather, the Python interpreter, the web browser,
and Node, and even the kernel, are programs written in C or C++, and
Rust replaces that. It’s a whole ’nother level of programming,
where you have to manage the actual hardware.</p>
<h2 id="purpose-of-this-series">Purpose of this series</h2>
<p>But enough of what Rust is and is not for. It is an excellent systems
programming language, and one that was a long time coming. I plan
on writing several posts about Rust features, why they’re an improvement
upon C++ features, and why Rust is a better, more modern programming
language. Mostly, this will be a discussion of why Rust is better than
C++, which I think is the most comparable existing programming language,
but it will also touch on why Rust is an improvement on C.</p>
<p>Because of this C++ focus, this series will at times be as much or
more a criticism of C++ as it is a commendation of Rust. I think that
is unavoidable, as this type of criticism of C++ is most truly credible
when an alternative is available, and similarly, Rust is most practically
evaluated in terms of its most viable alternative. Unfortunately, that also
means that I’ll assume some level of familiarity with C++, but hopefully
not too much.</p>
<p>I know that this is a much-discussed topic. Perhaps this is the Rust
equivalent of the dreaded Haskell monad tutorial, where every person
new to the programming language excitedly writes the same thing, and so
thank you for reading. I’m going to try and avoid the obvious tropes:
I’m going to try to do more than simply beat the table about type-safety
and memory safety and avoiding undefined behavior – though of course
these topics will come up. In fact, I had until rather recently simply
assumed that the safety of Rust would lead to unacceptable performance
degradation, that Rust might be well and good for some applications but
could unfortunately never be useful in a true low-latency environment. I
had to be persuaded that memory safety wasn’t a downside in such contexts,
that Rust could truly be a competitor to C or C++ and not just to Go
or Swift.</p>
<h2 id="the-syntax-of-c">The syntax of C++</h2>
<p>So for today, I’m going to ignore memory safety completely. Even assuming
that C++ was more or less right that performance and optimization require
a broad range of undefined behaviors, there were still problems with
C++ that left me regularly begging for at least a syntactic rewrite.
As Bjarne Stroustrup, creator of C++, famously said: “Within C++, there
is a much smaller and cleaner language struggling to get out.”</p>
<p>He later clarified that he wasn’t talking about a streamlined GC’d
language like Java, and of course he is aware of Rust and still on the C++
train. As he clarified, he was talking about the syntax of C++, and the
legacy of C. But just that category, just syntax, is I think enough to
justify a do-over of C++. I fantasized continually about a new syntax –
with identical semantics in my mind – that could be migrated to on a
file-by-file basis, with its own file extension. This, I realize now,
would in practice make for a new language, and a good opportunity to
introduce modern typing in it, and in Rust, I see my hope realized,
if a little more inconveniently than I imagined.</p>
<p>So that is what I want to focus on for the rest of this post: Why C++
syntax is rotten. Analyses of other Rust features I will reserve for
future posts.</p>
<p>Many of C++’s syntactic foibles have to do with its C heritage. This is
not to smear C: the same features that make sense in a simple “portable
assembly” like C begin to break down when they are preserved with
almost-identical syntax and naively extended semantics in a language that
promises powerful generic programming features that assist in automatic
code generation, resource management, and memory safety.</p>
<h2 id="header-files">Header Files</h2>
<p>This is really clear in my first example: header files. In C, they
serve two purposes. For the programmer, they allow a separation of
interface and implementation, especially when considering that modern
IDE technology did not exist when C was developed. The header file shows
the external interface of how to use the module, and the C file shows
how it is implemented.</p>
<p>For the compiler, this arrangement simplifies implementation. The information
necessary to compile each module is all included in the C file for the
module plus all the headers included by it (and included by them, etc.).
None of the other <code>.c</code> C files need be consulted, only the much smaller <code>.h</code>
headers, in a practice known as <em>separate compilation</em>.</p>
<p>The problem when this is extended to C++ is templates. These are essentially
macros, in that they allow the on-demand generation of new code, based on
what is going on in the client code. So if we imagine that the module <code>broadcast</code>
depends on the module <code>connection</code>, and that the compiler is currently compiling
<code>broadcast</code>, it would not only be necessary to investigate the interface to
<code>connection</code>, but also the complete implementation of the templates.</p>
<p>If we are to preserve the programmers’ perspective, and keep only the
interface in the header files, this means that “separate compilation”
is broken and that the compiler would need to fish around in the main
C++ files. If we instead preserve the concept of separate compilation,
the headers are no longer about just interfaces, but also implementation
details.</p>
<p>The inventors of C++ decided to preserve the concept of separate compilation,
and in the laziest way possible: literally the exact same implementation as C, rather
than trying to apply the same principles and goals and re-engineer something
better.</p>
<p>So now, we have the situation for the programmer where the header file contains
a duplicate of the interface, in addition to the implementation of all functions
that happen to be templates. To a compiler, a function and a function
template are very different things, but to a user, functions go back and forth
between template and not all the time, requiring the programmer to move them
between files.</p>
<p>Why does this distinction exist at all in C++? Computers are much faster now.
The compiler, to preserve efficiency, could automatically extract its own
binary file of what information it needs from each module to compile other
modules. The compiler could do this work for us. It is doing far more work
in all of its optimization steps.</p>
<p>C is famously a portable syntax for writing assembly. In C, the
information needed by the compiler, the <em>application binary interface</em>
or ABI, is exactly the same as the interface needed by the programmer,
the <em>application programming interface</em> or API – or at least very
close to it. And so in C, the concept of header files makes sense,
if unnecessary with modern compiler technology.</p>
<p>And to be clear, templates are just the most egregious example of
non-interface code needed from other modules: bodies of inline functions
and private member variables in classes are also not part of the interface
from a programming perspective, but part of the binary interface, part
of what the compiler needs to know about a module to compile the other
modules that depend on it.</p>
<p>Now, you may think, why is this such a big deal? Why such a complaint about
the inconvenience of moving things between files, or duplicating some
information? Why does it matter if the rules of which things to put
in which file are on the arcane side? You might imagine, so what?
You’ll mess it up sometimes, the compiler will issue an error, you’ll
say “oh, right” and move the code to the appropriate location.</p>
<p>To such objections I say: You don’t know C++. Unfortunately, I’ve
seen this attitude taken by professional C++ programmers, who were
careless with header files, moving code around in bulk in a way that
was liable to break this particular set of arcane rules, and accusing
me of overreacting and wasting time when I objected to this.</p>
<p>For those who haven’t had the misfortune of finding this out the
hard way, when you break an arcane rule in C++ – even rules that have
nothing to do with run-time behavior or memory safety – you are lucky
if you get away with a simple compiler error – or even the somewhat
more common arcane, incomprehensible compiler error. Unfortunately,
the result is regularly no error at all. The compiler cannot tell,
from its separate compilation point of view, if the information
provided in the headers is consistent. One module might import
one version of a header, and another module might import another.</p>
<p>This may sound unlikely, but many codebases have the practice of
separating out the actual interface in one header, and the template
implementations in another. At this point, it becomes important which
one is included, especially because “template specializations” mean
that additional template code doesn’t just make more templates
available, but changes the meaning of existing templates.</p>
<p>If the templates included in different compilation units are inconsistent,
the result is undefined behavior, and the program might potentially do
anything. Unfortunately, this also means the behavior might switch arbitrarily
between different compiler versions, different compiler vendors, or based on
seemingly unrelated permutations. Unpredictable behavior changes lead to bugs
and security vulnerabilities.</p>
<p>Worse, header files are implemented by textual inclusion. The compiler
proceeds as if the contents of the header were literally included in the
module that imports them. Cycles of inclusion don’t result in error, but
instead, a header is simply not included (if common precautions, such as include guards, are taken)
when the second recursion happens.</p>
<p>Thus: Imports via header files are sensitive to ordering. A
seemingly-innocuous change, like alphabetizing the included header files
in each module, can break builds or change behavior. Such a change rolled
out over an entire company’s codebase can be disastrous, and take many
programmer-months to unravel the consequences of. Ask me how I know.</p>
<p>So it should make sense that the first concept I had for my “new syntax”
for C++ was that header files should be auto-generated from source files,
preferably in a pre-compiled binary format.</p>
<p>This would be an implementation detail of the build artifact, maintained
in a build directory, and be a compiler-specific optimization in favor of
better compilation times. Semantically, rather than textual inclusion,
there would simply be a declaration in one module to say that another
module’s public interface could be used, where order wouldn’t
matter.</p>
<p>Rust is a modern programming language, and the Rust <code>use</code> directive does
in fact work that way.</p>
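<p>Here is a small sketch of what that looks like, reusing the
<code>broadcast</code> and <code>connection</code> module names from the
earlier example. Everything inside the modules (<code>Connection</code>,
<code>send</code>, <code>send_all</code>) is made up for illustration. Note
that <code>broadcast</code> appears first even though it depends on
<code>connection</code>, and that the generic code needs no separate header to
expose its body.</p>

```rust
// `broadcast` is written first, even though it depends on `connection`,
// which is declared further down: declaration order does not matter.
mod broadcast {
    use super::connection::Connection;
    use std::fmt::Display;

    // A generic function; there is no header that has to expose its body.
    pub fn send_all<T: Display>(connections: &mut [Connection], message: T) -> usize {
        for connection in connections.iter_mut() {
            connection.send(&message);
        }
        connections.len()
    }
}

mod connection {
    use std::fmt::Display;

    pub struct Connection {
        pub sent: Vec<String>,
    }

    impl Connection {
        pub fn new() -> Self {
            Connection { sent: Vec::new() }
        }

        // Generic method whose body lives right next to its declaration.
        pub fn send<T: Display>(&mut self, message: T) {
            self.sent.push(message.to_string());
        }
    }
}

// The order of these `use` declarations is irrelevant as well.
use self::broadcast::send_all;
use self::connection::Connection;

fn main() {
    let mut connections = vec![Connection::new(), Connection::new()];
    assert_eq!(send_all(&mut connections, 42), 2);
    assert_eq!(connections[0].sent, vec!["42".to_string()]);
}
```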
<p>This isn’t particularly a special point about Rust. This would be the
obvious way to construct any new programming language. The compiler
doesn’t need header files – the C preprocessor that implements <code>#include</code>
directives, along with the rule that functions and structures must be
declared before use, is a hold-over from the time when compilers ran on
computers slower than a modern thermostat. And programmers don’t need
them either: A better place for interfaces to be put in a separate file
would be automatically-generated documentation.</p>
<p>So Rust here gets points for doing what any sensible modern programming
language would do, and C++ loses points for carrying over an implementation
detail from C to a context where it no longer makes any sense.</p>
<h2 id="syntax-and-layout">Syntax and Layout</h2>
<p>Since we’re talking about the syntax of C++, I wanted to touch on something
very basic but very serious: basic syntax for control structures. C and its
syntactic descendants, including C# and Java, use something like this for
<code>if</code>-statements and <code>for</code>-statements:</p>
<pre tabindex="0"><code>if (!foo.is_empty()) {
    spin_up_thread(foo);
    destroy(&bar);
}
do_something_else();
</code></pre><p>In this example, the calls to <code>spin_up_thread</code> and <code>destroy</code> are inside
the <code>if</code> statement, and only happen if <code>foo</code> is indeed non-empty. The
call to <code>do_something_else</code> is not part of the <code>if</code> statement.</p>
<p>How do we know that? Well, the compiler knows that because after the <code>if</code>
statement there is an opening brace, and so all statements are included
until the matching closing brace, including the two mentioned. But,
depending on how fast we’re skimming the code, we probably know that
because the <code>spin_up_thread</code> and <code>destroy</code> calls are indented.</p>
<p>In this situation, in what will be a recurring theme in this comparison,
the compiler and the programmer are getting their information from
different places. Therefore, the compiler and the programmer can disagree,
especially as braces aren’t mandatory, and if omitted indicate that only
the first subsequent statement is included:</p>
<pre tabindex="0"><code>if (!foo.is_empty())
    spin_up_thread(foo);
    destroy(&bar); // Warning: This is done unconditionally
do_something_else();
</code></pre><p>This looks like it only destroys <code>&bar</code> conditionally, and to a human
following the indentation in code review or casual reading, that’s
exactly what you would expect. But there are no braces, and the compiler,
for whatever reason, ignores the same whitespace that human readers
rely on.</p>
<p>This has come up in personal projects of mine, usually when collaborating
with someone else. Even if you make the personal discipline of always
including the braces <code>{</code> around the body of your if-statements <code>}</code>, someone
else might not have that discipline, and therefore, you might be exposed
to this intermediate-state code:</p>
<pre tabindex="0"><code>if (!foo.is_empty())
    spin_up_thread(foo);
do_something_else();
</code></pre><p>Needing to add a call to <code>destroy(&bar)</code> in the condition, after <code>spin_up_thread</code>,
you find the line, add a new line at the same indentation level, and simply fail
to notice that the new line is not actually wrapped in any <code>{</code>.</p>
<p>This was, of course, the direct cause of a <a href="https://arstechnica.com/information-technology/2014/02/extremely-critical-crypto-flaw-in-ios-may-also-affect-fully-patched-macs/">major security vulnerability in iOS
and macOS</a>:</p>
<pre tabindex="0"><code>if (some_err_condition)
    goto fail;
    goto fail;
</code></pre><p>Since humans use indentation to read code, and to determine what is in a block
and what isn’t, I would’ve wanted my wish-list “new C++ syntax” programming
language to take a page from Python and use significant whitespace:</p>
<pre tabindex="0"><code>if !foo.is_empty():
    spin_up_thread(foo)
    destroy(&bar)
do_something_else()
</code></pre><p>Rust is a minor disappointment in this department. It stuck to braces,
with whitespace remaining “insignificant.” But it made a huge improvement,
far outweighing my disappointment: Rust at least prevents the <code>goto fail</code> scenario by making braces mandatory, and it recovers the ergonomics by
dropping the parentheses around the condition instead. Writing the body of the
<code>if</code>-statement without braces is simply not worth it as a short-cut,
but if the braces are mandatory, then the parentheses aren’t necessary:</p>
<pre tabindex="0"><code>if !foo.is_empty() {
    spin_up_thread(foo);
    destroy(bar);
}
do_something_else();
</code></pre><p>This is better because then the <code>goto fail</code> example would still be glaringly obviously
failsome, because even if the indentation does not match the braces, the braces still
have to go somewhere, and will jump out at you:</p>
<pre tabindex="0"><code>if some_err_condition {
    goto fail; }
    goto fail;
</code></pre><p>It disappoints me that these issues tend to be dismissed as “matters
of taste,” because as Apple learned, there are actual consequences to
this misalignment of what programmers pay attention to and what the
compiler pays attention to. I would have liked Rust to go the whole
way, and remove altogether this strange concept that whitespace should
be insignificant, a concept that my oldest C and C++ books exclaimed
as a great feature without explanation or justification. But at least
Rust has fixed the most egregious consequences of C++ syntax. Again,
this problem with C++ comes from a feature inherited from C, but in this
case C was just wrong to begin with, and should’ve done it the Rust way
(or the Python way) from the very start.</p>
<p>Additionally, Rust has good auto-formatting, which, unlike C++ auto-formatting
tools, does not break code (by, for example, re-ordering headers). This fundamentally
replaces the whitespace provided by the programmer – which might be misleading to
other programmers – with whitespace that aligns with the compiler’s interpretation,
and therefore is correct to rely on when skimming. A good <code>cargo fmt</code> should therefore
be run before every code review, to make sure that the code can be easily and correctly
read.</p>
<h2 id="c-isms-vs-modern-c">C-isms vs “Modern C++”</h2>
<p>I then have one more topic before I wrap up syntax. C++ programmers
nowadays are telling everyone who was upset with their language in
the 90’s and aughts that it’s better now, that C++ has cleaned up its
act. C++11 really has changed a lot, and C++ is innovating again, and
that’s very good. C++ is full of new features, and part of its claim to
be a modern programming language involves claiming that programming in
C++ is good, if you use these new features.</p>
<p>Smart pointers, written <code>std::unique_ptr<Foo></code>, allow automatic
implementation of construction and destruction for owning pointers, and
allow much clearer communication about ownership semantics in function
signatures, and so are preferable to writing the C-style <code>Foo *</code>. C++
arrays, <code>std::array<Foo, 12> arr;</code>, act like any other STL collection,
can be passed (along with their iterators) to standard templates that
expect STL interfaces, and provide a number of useful features as methods,
and so using them is preferable to the C-style <code>Foo arr[12];</code>.
<code>static_cast<A>(b)</code> is much more specific, and therefore less prone to
accident, than the C-style <code>(A)b</code>.</p>
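<p>As a rough illustration of these three comparisons, here is a minimal sketch of the preferred spellings in use. The <code>Foo</code> type and all of the function names are made up for this example, and it assumes a C++14 compiler:</p>

```cpp
#include <array>
#include <cassert>
#include <memory>
#include <numeric>

// Hypothetical type, used only for illustration.
struct Foo {
    int value = 0;
};

// The return type says the caller becomes the sole owner of the Foo;
// no manual delete is needed anywhere.
std::unique_ptr<Foo> make_foo(int value) {
    auto foo = std::make_unique<Foo>();  // std::make_unique requires C++14
    foo->value = value;
    return foo;
}

// std::array carries its size in its type and exposes begin()/end(),
// so it can be handed straight to a standard algorithm.
int sum_values(const std::array<int, 4>& values) {
    return std::accumulate(values.begin(), values.end(), 0);
}

// static_cast performs only the conversion named here; unlike a C-style
// cast, it cannot silently reinterpret bits or drop const.
int to_whole_number(double x) {
    return static_cast<int>(x);
}

// Small usage demonstration, meant to be called from main().
void demo_modern_cpp() {
    std::unique_ptr<Foo> foo = make_foo(7);
    assert(foo->value == 7);

    std::array<int, 4> values{{1, 2, 3, 4}};
    assert(sum_values(values) == 10);

    assert(to_whole_number(3.9) == 3);
}  // foo is destroyed here automatically
```

<p>Note how much more there is to type in each case than with the C-style equivalent.</p>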
<p>These are among the features that are trotted out whenever someone says
they used C++ in the 90’s and it had all these problems. These features
are among the ones used to claim that C++ is a better, cleaner, tidier,
more modern programming language than it used to be. Whether or not
they’ve done enough to replace their old counter-parts – they’re
generally preferred whenever possible.</p>
<p>The problem? Convenience. Who wants to type <code>std::unique_ptr<Foo></code> when
instead you can write <code>Foo *</code>? Why are the somewhat-deprecated options
the easy ones to write? Why isn’t it something like <code>std::raw_ptr<Foo></code>
with some convenient notation for <code>std::unique_ptr</code>?</p>
<p>But of course, that would break compatibility with C, and with earlier
versions of C++.</p>
<p>I don’t want to get into the myriad reasons why smart pointers are to be
preferred to raw pointers, or why raw pointers occupy such an awkward place
in C++ – those are topics for a future post. But for however many seemingly-principled
reasons some of my colleagues might state for why they used raw pointers in
this or that situation, I couldn’t shake the feeling that it was partially
because raw pointers were given the old-fashioned, easy-to-type notation.</p>
<p>And so, when I imagined my new C++ syntax, it would have <code>Foo *</code> mean
<code>std::unique_ptr<Foo></code>, and <code>Foo arr[12]</code> mean <code>std::array<Foo, 12></code>.
Why have the not-entirely-deprecated-but-not-preferred legacy C features
be the easier ones to type?</p>
<h2 id="conclusions">Conclusions</h2>
<p>All in all, this shows that a lot of purely syntactic but still substantial
and consequential problems with C++ can be fixed with a syntax reboot, which
Rust mostly provides. And I haven’t once mentioned type safety or memory safety!
I will develop this further in this blog series, where I will maintain that
Rust is not only a better programming language than C++, but a better <em>unsafe</em>
programming language than C++. Even if I had to use <code>unsafe</code> for every function
in my module, I’d still rather write my module in Rust than C++, for all
these reasons. I say this as a pre-emptive strike against the argument
that occasionally having to use <code>unsafe</code> to achieve performance parity
with C++ (and it is very occasional) “defeats the whole purpose” of Rust.</p>
<p>But of course, there are deeper problems with C++ that Rust also
addresses, beyond just the syntactic. But those will have to wait for
future posts.</p>
A Modern Version https://www.thecodedmessage.com/posts/humpty_dumpty/ 2020-11-17T00:00:00+00:00
<p>Humpty Dumpty sat on a wall<br>
Humpty Dumpty had a great fall<br>
All the king’s horses and all the king’s men<br>
Were too busy deciding who would be king…<br>
To even TRY to put Humpty Dumpty together again.<br>
And he’s just sitting there, all yolk and shell, waiting…</p>
Apple Silicon https://www.thecodedmessage.com/posts/apple_silicon/ 2020-11-16T00:00:00+00:00
<p>This year, Apple released, to much fanfare, a somewhat obscure
technical change to how its computers work: Macs will transition
away from Intel’s CPUs to in-house processors known as “Apple
Silicon,” more similar to the technology Apple already
uses in its phones and tablets. It is a tremendous amount of
hype for something rather technical, and to people used to more
user-visible feature announcements, this can be
<a href="https://www.zdnet.com/article/apple-released-a-new-macbook-air-and-im-disheartened/">somewhat disappointing</a>,
or at least confusing.</p>
<p>What does this actually mean for the end user? Apple claims that these
new Macs will be (many times) faster, run cooler, and have much better
battery life. Are these improvements as drastic as Apple claims? Will
there be downsides and other adjustments that users will have to make,
or will these new computers just work like faster, less power-hungry Macs?</p>
<p>A lot of responses I’ve seen seem pretty skeptical, which is
fair. It’s been a long time since the drastic improvements of Moore’s
law have been the norm in computing; we’re used to much more incremental
improvements. And Apple is claiming to achieve these improvements by
moving away from Intel, when Intel is the established market leader in
making high-powered PC processors. Can these new computers really be
that much better?</p>
<p>My answer, as your computer-nerdy friend, is that these computers are
not only going to be a great technical improvement over previous Macs,
but represent a revolution even beyond the Apple ecosystem, a turning
point for PCs in general, and one that was a long time coming. To explain
why, I’m going to delve into some computer history to give context
to this shift, and some of the technical details of how computers work,
and specifically in what ways these new Macs will work differently from
the current ones.</p>
<p>So bear with me as we go deep. I promise it’s relevant.</p>
<h2 id="operating-systems-and-app-compatibility">Operating Systems and App Compatibility</h2>
<p>Do you remember when it mattered a lot more which operating system
you ran?</p>
<p>Nowadays, most of my personal time on the computer is spent on the web
browser, doing my writing on a <a href="https://docs.google.com/">website</a>,
my TV watching on a <a href="https://www.netflix.com/">website</a> and even my
<a href="https://www.rememberthemilk.com">TODO lists</a>. I look at the bottom
of the screen on my non-work computer, the MacBook I’m using to type
this, and I see a slew of icons for various apps: a messenger app, a
mail app, a calendar app, a spreadsheet, a word processor, all in all
standard computer fare, but not getting much use compared to that Google
Chrome icon. As a result, unless we’re doing specialized tasks, like
programming (as I do for work) or CAD or photo-editing, apps besides
the browser are not the <a href="https://xkcd.com/934/">big deal they used to be</a>.</p>
<p>But once, it was a huge deal. Every computer user had several different
apps in their workflow, and using an alternative operating system (as
macOS once was) risked not being able to find appropriate equivalent
apps, and possibly not even being able to read documents that would be
in “Windows formatting.” There was tremendous social pressure to use
the same software as other people, to the extent that this
<a href="https://www.xkcd.com/111/">webcomic</a> rang true.</p>
<p>Therefore, every serious personal computer application existed as
a Windows program, specifically a Windows program that ran on Intel
(or Intel-compatible) processors (known as “Wintel,” especially by
those who criticized it as a monopoly). A company would support other
types of computers only as an after-thought. And at that time, apps were
distributed on CDs and stored on physical media. It was common to have an
old version of an app lying around, not be able to receive live updates
for it, and to expect it to run on a newly-purchased computer, unmodified.</p>
<p>In such an environment, compatibility was key to profits. Each new
Intel processor, and each new version of Windows, had to support all
the apps that could run on the previous version. There was no higher
priority. From before Windows 95 was even released, and for years
afterward, Microsoft had another operating system, Windows NT,
that was far more stable and technically superior. But Windows 95
was more similar to Windows 3.1 before it, and supported more apps,
so until Microsoft could get Windows NT to run all those other apps,
it was stuck with the inferior product. Eventually, 6 years later,
Microsoft came out with a version of Windows NT able to run 95 apps:
Windows XP. Even that transition was gnarly.</p>
<p>The Intel side of the “Wintel” monopoly was similar. To this day,
a modern Intel processor is capable of running MS-DOS programs from the
80s directly, without requiring any emulation layer in the operating
system or any modifications to those programs. The antiquated 16-bit
instructions that comprise those programs will still be interpreted by
the modern hardware, which also supports countless other compatibility
modes for various eras of the processor history. And this is true not
only for the Intel processors that power Windows PCs, where it makes some
amount of sense to support all the Microsoft ecosystems of years past,
but also on Macs, where this history is much shallower.</p>
<h2 id="processor-design-and-instruction-set-architecture">Processor Design and Instruction-Set Architecture</h2>
<p>See, Intel is more than just a company. Intel is also an instruction set
architecture, also known as Intel 64 (or IA-32 for 32-bit versions) or
x86 (sometimes x64 for 64-bit versions). When applications are prepared
to run on Intel, they are (traditionally) compiled to a file containing a
sequence of instructions. The meaning of each instruction is determined
by complicated standards, and the hardware of the processor must take
these instructions, and actually perform the designated operations.</p>
<p>The amount of complexity involved in the meanings of these instructions,
or the instruction set architecture (ISA), is large. Intel’s ISA was
documented in 3 paperback volumes back when I was a child in the early
aughts when my parents were gracious enough to order the set for me. It
has only grown more gnarly since then.</p>
<p>Because of the vast compatibility requirements that Intel in particular
historically has faced, their ISA has never been redesigned from scratch
since the 80’s. This means that, instead of having every instruction be
a fixed length of say, 4 bytes, Intel instructions can range from 1 to 15
bytes. Some of them do simple things like adding two numbers together,
whereas others do more complicated things like copying an entire string
of characters. Many of the design choices would never be made by a modern
engineer, but Intel is stuck with them for historical reasons.</p>
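<p>To get a feel for why variable-length instructions are harder to work with, consider the simple question of where the n-th instruction starts. The sketch below uses a toy encoding invented for this example (emphatically not the real Intel format), in which the first byte of each instruction gives its total length; the function names are likewise made up:</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Fixed-width ISA (4 bytes per instruction): the start of the n-th
// instruction is simple arithmetic.
std::size_t fixed_width_offset(std::size_t n) {
    return n * 4;
}

// Toy variable-width encoding, invented for this example (NOT real x86):
// the first byte of every instruction holds its total length in bytes.
// Finding the n-th instruction means walking over all the earlier ones.
// Returns code.size() if the stream ends or is malformed first.
std::size_t variable_width_offset(const std::vector<std::uint8_t>& code,
                                  std::size_t n) {
    std::size_t offset = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (offset >= code.size() || code[offset] == 0) {
            return code.size();
        }
        offset += code[offset];
    }
    return offset < code.size() ? offset : code.size();
}

// Small usage demonstration, meant to be called from main().
void demo_instruction_offsets() {
    // Four toy instructions of lengths 1, 3, 2 and 1.
    const std::vector<std::uint8_t> code{1, 3, 0, 0, 2, 0, 1};
    assert(fixed_width_offset(3) == 12);
    assert(variable_width_offset(code, 3) == 6);
}
```

<p>With fixed-width instructions, the answer is a multiplication. With variable-width ones, nothing can be said about a later instruction until the lengths of all the earlier ones have been worked out.</p>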
<p>But Intel hasn’t been able to clean this up, for the same reason that
“Wintel” customers can’t switch from Intel: Any clean-up would
mean old apps would not be able to run on the new computers. When Intel
did attempt this, with Itanium, the world wasn’t ready, even though
Intel tried to leverage the already-difficult transition to 64-bit as a
reason to get people to switch to a completely new architecture. Even
worse, any clean-up would mean that Intel would have to compete with
other companies as equals, whereas now there is essentially only one other company,
AMD, that is allowed (for historical reasons) to design Intel-ISA
processors. Embarrassingly, it was AMD that actually convinced everyone
to switch to 64-bit, by providing a much more gradual transition to a
64-bit ISA far more similar to the existing 32-bit one.</p>
<p>Processor architectures with such convoluted ISAs are referred to as
CISC, for Complex Instruction Set Computing. All of this complexity must
be implemented using more complex hardware. Intel processors come with
decoders to break down over-complicated instructions into smaller pieces,
reorder buffers to optimize on-the-fly which pieces can be done when,
and extra circuitry to handle all of the various compatibility modes that
are necessary to support old software. All of this constrains processor
design and, very specifically, draws extra power.</p>
<p>What’s the alternative? The transition to mobile, to phones and tablets,
gave computing something of a fresh start. No one ever expected their
“Wintel” apps to run on a phone, and so when initial iPhones and
Androids came out, new apps were written from scratch. Those would be
written for whatever processor architecture Apple and Google chose, and
they chose a more modern, less-CISC ISA: ARM. ARM stands for Advanced
RISC Machine, where RISC, or Reduced Instruction Set Computing, is the
opposite of CISC.</p>
<p>The ARM ISA, which unlike Intel’s is available for any company to
license and design their own compatible processors, takes a moderate
position in the historical RISC/CISC wars, and requires in any case far
less decoding circuitry than Intel processors require. When Intel tried
to make Intel-ISA processors for phones and lightweight laptops, the Atom
processors, it was a failure. The decoder got in the way of achieving a
good combination of performance and power consumption, and the resulting
phones were either unacceptably slow or unacceptably low in battery life.</p>
<p>And Apple Silicon is Apple’s branding for their ARM ISA processors,
supporting the ARM ISA with Apple’s proprietary processor
design. They’re bringing the benefits of phones to the PC world.</p>
<h2 id="new-modes-of-app-development">New Modes of App Development</h2>
<p>So the “Wintel” monopoly and Intel in general never jumped from
the PC world to mobile, and as a result we have the cool, fanless,
long-battery-life but high-performance phones of today. But why,
then, do Macs, which never ran Windows programs, use Intel processors
to begin with? Why are they switching now? And why is this a turning
point for PCs in general?</p>
<p>Well, for one thing, it’s not entirely true that Macs don’t
run Windows programs. A key reason why Apple switched from PowerPC to
Intel in the first place was so that Macs could run Windows programs: by
either running a version of Windows simultaneously with macOS
(<a href="https://www.parallels.com/">https://www.parallels.com/</a>), or, more simply, by rebooting the same
computer into Windows, which runs on Mac just as well as on any other
type of PC. When Intel Macs first came out, this was a decisive feature
for many switchers, nervous to abandon app compatibility.</p>
<p>And at the time, Intel was the best processor manufacturer in existence,
so that Intel processors with their flaws were still better (as they were
produced through better manufacturing processes) than the PowerPC
RISC processors Apple was previously using. Get better processors and
get some level of Windows-compatibility: the decision was clear for
Apple at the time.</p>
<p>But now, Windows compatibility is important to hardly any Mac
users. And there are a number of reasons for that. Nowadays, a lot of
software is not translated to machine code, the level at which ISAs
are relevant. A lot of software is delivered to us via the browser,
where a portion runs on servers in the cloud and a portion is written in
JavaScript, and then either interpreted in that form, or translated to
the ISA of the computer live by the browser. Once the browser supports
an ISA, all websites come with it.</p>
<p>And even for apps, many of them are written in higher-level
languages that use a virtual machine or Just-In-Time compilation
to be processor-architecture neutral. These programming language
technologies matured after Intel had already become stuck with its
backwards-compatibility advantage, and apps written with them are also
easily ported, which is to say, brought to a new ISA. Once the Java
virtual machine (for example) is ported to ARM, all Java programs come
with it.</p>
<p>And even for programs that are compiled in a traditional fashion,
written in an old-fashioned compiled programming language like C or C++
(or the Apple-specific Objective-C or Swift), ISA compatibility is no
longer the issue it once was. These languages have evolved over time to
make it easier to re-target ISAs, and programmers nowadays are better
trained in writing their code in such a way that it can be compiled
for any ISA. Creating an ARM version of a Mac app is just a switch in
the compilation system, and maybe finding a few obscure bugs where the
differences matter a little more deeply.</p>
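<p>That “switch in the compilation system” usually amounts to telling the compiler to target a different architecture (with Apple’s clang, for instance, via the <code>-arch</code> flag), while the source stays the same. Where the differences do matter, code can ask the compiler what it is targeting. Here is a minimal sketch, assuming the macros that GCC, Clang and MSVC predefine; the function names are made up for this example:</p>

```cpp
#include <cassert>
#include <string>

// Reports the ISA this translation unit was compiled for.
// GCC and Clang predefine __aarch64__ / __x86_64__;
// MSVC predefines _M_ARM64 / _M_X64 instead.
std::string compiled_for_isa() {
#if defined(__aarch64__) || defined(_M_ARM64)
    return "arm64";
#elif defined(__x86_64__) || defined(_M_X64)
    return "x86-64";
#else
    return "something else";
#endif
}

// Small usage demonstration, meant to be called from main().
void demo_compiled_for_isa() {
    // The same source yields a different answer depending only on
    // the target chosen at build time.
    assert(!compiled_for_isa().empty());
}
```

<p>Everything outside the <code>#if</code> is shared between the Intel and ARM builds, which is the common case for most application code.</p>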
<p>And once the new version is made, getting it to users is easy. Android and iOS both transitioned
from 32-bit to 64-bit ARM, requiring new builds of apps, and we as
customers hardly noticed. Developers quietly prepared 64-bit versions of
our apps, and when we upgraded to 64-bit compatible phones, we didn’t
notice that the app store sent us a different version. After all, we get
updated versions of apps from the app store all the time. As long as the
developer can adjust, the end user just has to do some more downloading –
cheap and easy in an era of widespread broadband.</p>
<h2 id="conclusion">Conclusion</h2>
<p>And so what does Apple lose out on for the Apple Silicon Macs? The ability
to boot Windows is now more of a liability than an asset for them. Old
programs will rapidly be ported. Due to modern technology, which allows
us to translate between ISAs on the spot, emulating Intel on the new Macs
is often faster than running the Intel programs directly on the old Macs.</p>
<p>As users, we might be mildly frustrated by it. I certainly will be
a little worried about buying such a computer until I know that some
version of Linux will work smoothly on it – Linux on ARM is currently
very much a second-class citizen in the PC Linux world.</p>
<p>But the advantages are great. No longer constrained by hardware decoders,
Macs will cheat the old trade-off between computational power and battery
life, and get to have their cake and eat it too, at least for one round
of abrupt improvement for the transition. And because now the PC and
phones use the same processor architectures, iOS and iPadOS apps will
now work on macOS as well, saving time for writers of tablet applications.</p>
<p>And other PC manufacturers will end up having to notice. Windows on ARM
exists already, though it is currently obscure. If Microsoft can cultivate
as modern an ecosystem as Apple, where a combination of emulation and
streamlined distribution makes it easy to get ARM versions, these new
Macs might start an ARMification trend.</p>
<p>This spells (long-term) doom for Intel. Their business model is tied
to their ISA, on the premise that no one can afford to switch away from
the most popular ISA, that everyone is locked in. This was never true for
mobile, in spite of Intel’s best efforts, and as Apple is demonstrating,
it also hasn’t really been true for PCs for a while either.</p>
A Prudent Quarantine https://www.thecodedmessage.com/posts/perpetual_quarantime/ 2020-10-28T00:00:00+00:00
<p>Five Members sat in council.</p>
<p>There are some activities, some patterns of human group behavior,
that transcend era and culture, and meeting in council is
one of them. In spite of the youth of the participants –
they were in their late teens and early 20’s – and the
informality of the setting – leather couches covered in scratch
marks, unfinished walls – they still clearly were sitting in
council. The seriousness with which they were watching the video,
their intentional and controlled posturing and nuanced glances,
would have been instantly recognizable to any Parliament or
Diet throughout history. They had met to do business, to make
a decision, to come to a consensus.</p>
<p>One Member rose to speak. “I don’t know about this girl,”
said Carlos, tapping his phone.</p>
<p>Abruptly, the video paused. They now saw the girl, Petra, frozen,
mid-sentence, mid-gesture, looming over them, larger than life
on the 12-foot-tall projector screen. She didn’t look like
someone who would require scrutiny from anyone, let alone detailed
conciliar deliberation. She was sitting on a small uphill slope
of rolling grass, relaxed but still composed. She just looked
well put-together: she was meticulously dressed; perfect subtle
make-up; cheery, friendly demeanor. She projected a presence
greater than her small stature: though on physicality alone she
could be mistaken for a young teenager (she was, in fact, 21),
her presentation was that of a 30-year-old.</p>
<p>Christy resisted the urge to lower her head, instead presenting
a stoicism she did not feel. She fantasized about a world where
she could introduce the rest of Red Stripe Quarantine Group
(Reg. No. F56D3, Bellevue, WA) to her girlfriend in a pleasant,
friendly manner, like people do in video chats and VR hang-outs,
except in real life. Failing that, Christy really wanted to
fast-forward through this whole interview, not just the video
itself, but also fast-forwarding or just skipping the lived
experience of her closest friends scrutinizing and discussing
the love of her life. But she knew that this was the only way she
could ever see Petra in person. The other Members were entitled
to their safety. It was, after all, the law. <em>A law followed is a life saved.</em></p>
<p>But did it have to be Carlos who first spoke against
Petra? Don’t misunderstand: Christy was duly grateful to
Carlos. He had truly enriched her life, in the most concrete
way possible, not with spiritual or social riches, but with hard
American greenbacks (still called that even though cash had long
since been outlawed as a disease vector). If he hadn’t joined
nine months prior, if he hadn’t started, coyly and then loudly,
saying “what if” – “What if we used our still to make
hand sanitizer in addition to booze?” “What if we started
growing aloe and selling the hand sanitizer?” “What if we
started taking shifts, contacting distribution companies, and
getting sold in local drug stores?” – she might have had to
drop out of school, and Red Stripe, her chosen family, might have
had to disband. And it was admittedly quite satisfying to see
the “Red Stripe Hand Sanitizer™” labels on their products
at the local corner stores when they went on their supply runs.</p>
<p>But, in practice, even though he was not Leader, that made
Carlos her boss. And no one wants their boss interviewing or
scrutinizing their girlfriend. Christy was grateful that Carlos
had not succeeded in convincing Joe to let him shadow – or,
really, co-conduct – the interview. Christy wished there had
been a way to exclude Carlos from this stage as well. But there
he was, dressed in his button-up and slacks, even though it
was just them, even though everyone else was wearing pyjamas,
as if at any second he would find himself thrust into a video
call with a supplier or distributor.</p>
<p>Lilith, Christy’s cat, sensed Christy’s anxiety, and came
running over as soon as Carlos stood up. Christy picked Lilith
up, put her on her lap, and sighed deeply. At the same time,
she heard Joe sigh a more abrupt, exasperated sigh. “Huph.”</p>
<p>Christy recognized that sigh. She hoped that it meant that she
would not be called upon to defend Petra, that Joe would take
up her cause for her.</p>
<p>Joe was the official Leader of the Quarantine Group, and the
original founder. Christy and the other Members – even Carlos,
to some extent – looked up to him as an elder brother. Joe had
been the one to actually conduct the interview that they were all
now evaluating, and now he was ready to jump in on the discussion:
“What do you mean, you don’t know about her? What’s your
concern, specifically?”</p>
<p>Carlos lifted his hand in the exact way that he did before he
was going to make a new, strict, unwelcome rule for the factory
(the latest one eliminated water bottles in the room with the
still). Christy suddenly realized she couldn’t bear to listen
to what she felt certain was coming next, and so she had to
get out ahead of it. “It’s not because she’s Chinese,
is it? You do know she’s Chinese-American, right?”</p>
<p>As soon as she said this, Christy covered her mouth up. Of
course Carlos knew that, and Christy had basically just called
him stupid, and bigoted to boot. But it was a real concern,
and one that she’d been worried about. After all, everyone
knew the stories, both from grandparents who remembered it all
and from history class in school: 53 years ago, in the first
year of the Perpetual Quarantine (Q.Y. 1 for Quarantine Year
1), the First Virus started spreading from China to the other
countries, because, as was invariably pointed out, China was
the least hygienic, least responsible country.</p>
<p>Carlos put his hand back down and slowly turned towards her,
adjusting his glasses, a gesture he had learned from his
parents. “Of course not,” said Carlos earnestly. “The
Administration would never allow any foreigners into this
country, especially ones from such a place as China.” Everyone
nodded. Everyone had also learned in school that immigration
to the United States had been outlawed since Q.Y. 4, 49 years
ago. “I am not one to hold the ancestors’ flaws against
the descendants.”</p>
<p>Abruptly, a woman’s voice boomed above them, way too loud for
comfort. “So what is the issue, then?” Carlos’s partner
Jillian jumped, having forgotten, as was all too easy to do
sometimes, about the sixth Member of their deliberations, Leila,
who had videoed in for the meeting, her face visible on a small
(only 3 foot by 3 foot) square on the projector screen, and her
voice broadcast, apparently at the wrong volume, through their
room’s sound system.</p>
<p>Leila was an out-of-house Member of Red Stripe Quarantine Group;
she was in the same “abstract” or “virtual” household. She
had her own separate apartment, but her red wrist bracelet was
the same as the others’, and it also allowed her to visit
the Red Stripe Hub house whenever she wanted – which was
legally allowed as long as it was the only other household she
visited. Quarantine groups were currently allowed to have up to
10 adults and up to 4 locations, and Red Stripe’s model of a
central hub with one out-of-house Member was not uncommon.</p>
<p>Leila needed this flexibility to study to be a doctor. Every
day from Monday through Thursday, she went from her apartment
to a physical lab on campus. At her designated time each day –
which was assigned by lottery and could range from a pleasant
11AM to a torturous 3AM slot – she would walk over to the lab,
fully masked. Once the previous student was confirmed as being
out of the mini-lab she had been assigned, she would take a
quick Sanitation Shower, do her lab work in her lab clothes
(perhaps on a video chat with other students or an instructor),
and then leave after 45 minutes. Next year, she would qualify
for a full hour and a half, which she was looking forward to.</p>
<p>Then, Thursday night through the weekend, she would usually
stay with the rest of Red Stripe in the main house. For this,
she would book an autonomous car, which would then automatically
take her to the main house as the only place she was legally
allowed to go. As soon as she arrived, she would shower and stay
in her pyjamas until it was time for her to go back on Monday.</p>
<p>And this particular meeting took place on a Tuesday, which is
why instead of sitting on her currently-empty spot on the middle
couch, Leila was sitting cross-legged on a brightly-colored
patterned ottoman in her own little corner of the projection
screen. “Petra is not ‘this girl,’ she’s Christy’s
girlfriend. If you have a real concern, that’s fine, we can
discuss it, but like, I really thought this was just going to
be a formality. Don’t we all just want Christy to be happy?”</p>
<p>Christy felt like she would melt. Of course, she knew Leila
would stand up for Petra, as Christy’s best friend and the
only Member who had really had significant interaction with
Petra before this. But actually hearing Leila jump in so boldly,
so confidently just filled her with gratitude. She raised her
hand up towards the camera in the shape of a heart, and smiled.</p>
<p>Carlos stretched his back out, and put his hands behind his back,
and began to walk, and then to pace. This was how sermons were
preached in the Watchers of the Vaccine, the niche religious group
Carlos was raised in (Carlos wouldn’t let the others call it a
cult, even though he no longer practiced). And, as with the clergy
of that group, it meant that Carlos was about to give a sermon.</p>
<p>“She did the interview outside, in a park. She wasn’t even
wearing a mask. Now I know, and she did make it very clear,
that it was her group’s Liberty Day.” Every group had one
day a week they were allowed to go for walks or to the park,
indicated by the color of their bracelets. “And I also know
there weren’t other people around, so she wasn’t violating
the masks law. That is not my point.”</p>
<p>“Well, friend,” began Kevin, and then he paused. Everyone
waited patiently. Kevin was Joe’s partner and led the
group’s semi-weekly yoga classes, and he talked in the slow,
overwrought way that people talk when they’ve smoked weed their
entire lives. He waved his hand slowly as his brain loaded more
words. “What is your point then?” More loading. “It’s
more important to be safe than look safe, my friend. And she
seemed really chill.”</p>
<p>At the word “chill,” Carlos reared back as if he was ready
to charge. He pointed his finger with the energy of a punch, his
face bulging red and about to boil over from anger. He inhaled
deeply a few times, and when his face had finally returned to
a more normal color, he spoke, with the sort of crisp calmness
that in certain personalities can mask a seething rage. “I
just don’t think that’s true.”</p>
<p>Kevin blinked. “What’s not true?”</p>
<p>“Part of being safe is looking safe. <em>Care and focus defeat the virus.</em> We’re not looking for ways to technically follow
the law. We’re not looking to get a C+ in quarantine. We’re
not going to be like the videos of Canada.” Canada, in spite
of several attempted military interventions, still refused
to quarantine to American standards. <em>Protect your country, observe the quarantine.</em> Carlos paused for a breath, and then
continued. “After all, we make hand sanitizers. We must
be above reproach.” Everyone nodded their half-hearted
agreement. ‘Above reproach’ was one of Carlos’s
catch-phrases. “And Petra… Well, she didn’t say anything
about what personal disciplines she kept to avoid the spread
of infection. She didn’t say how she would help contribute to
our economy. She didn’t –”</p>
<p>Christy’s panic and helplessness slowly converted to
anger. Would she ever be free of Carlos’s suffocating
perspectives? It used to just be during their work shifts, but
now not one part of her life was safe. Something would need to
be done, she thought, in a boil gradually bubbling beneath the
lid of her anxiety.</p>
<p>“Our economy?” Joe interrupted. “We have six people! And
she said she had an online job, that she would pay her share
of the rent. And that she was a good cook. I didn’t ask her
for more details, so maybe she has more ideas she just didn’t
mention. I think –”</p>
<p>“Yes,” Carlos said, his voice now ice-cold. “She said she
wanted to share with us new, fun recipes. Recipes that, for all
we know, involve endangering us all by going to multiple grocery
stores in a single trip. Recipes with expensive ingredients –”</p>
<p>Christy’s anger finally burst through the lid, and she yelled,
“Carlos!”</p>
<p>Joe and Carlos froze mid-argument, their gestures stuck like a
video on pause, and the rest of the council gaped at Christy. In
all the time they had known her, she had never raised her
voice in anger before. In excitement, perhaps, or enthusiasm,
but never in anger.</p>
<p>And indeed, in the light of their stares, Christy recovered her
normal affect, and smiled, speaking as sweetly as she could,
but with a hint of bitterness that her fellow Members weren’t
used to. She decided to focus on one particular point. “We all
are very duly grateful that you got us into the hand sanitizing
business. But that means we should be less worried about money
now, not more, right?”</p>
<p>Kevin agreed. “Wouldn’t you enjoy some” … wave, wave, words
loading, inhale … “more new tastes?”</p>
<p>Carlos shook his head, as if only he understood the true severity
of the situation. “That’s not the only example.” And at
that, Carlos started tapping at the phone in his hand, jerking
the video back in 15-second jumps. The out-of-context freeze
frames, alternating between Petra’s elegant, nuanced body
language and Joe’s extravagant gestures, formed a bizarre dance.</p>
<p>The others held in their intense emotions in patient but active
silence. Christy clenched her fists in an effort to keep herself from
crying or yelling – she wasn’t sure which.</p>
<p>Finally, Carlos seemed to have found the correct spot in the video. Joe
had just asked “What is an ethical dilemma that you have faced?”
Petra had said, “So, this one time I was at this party, and this girl
was drunk…”</p>
<p>Carlos paused the video again, and looked out over the group. “Was I
the only one who heard that?”</p>
<p>Everyone looked at each other, slightly uncomfortable. Finally Leila
spoke up, saying what everyone else had been thinking. “She clearly
means like a video chat or VR party.” Leila paused, and then continued,
audibly affronted. “You’re not actually suggesting –”</p>
<p>“Christy,” said Carlos. “Has she ever taken you to a VR or video
chat party?”</p>
<p>Christy paused. Petra hadn’t.</p>
<p>Well, that wasn’t entirely true. She had, but mostly with Petra’s own
family or one of her close friends from her family’s group. Certainly
not the “meet a lot of people” type of party, not something that
you would call a party – loaded and edgy as that word was – even
metaphorically.</p>
<p>So the question still stood. Did Petra have a past? Did Petra use to
be a partier?</p>
<p>Christy would have to discuss this with Petra. But, she told herself
decisively, it didn’t matter. The Petra she knew would never do
something so irresponsible, and the past was in the past. Perhaps it had
been a real party, and Petra was being forthright in bringing it up. She
had also promised not to violate the group’s safety. If she really
was being irresponsible, still, today, would she speak so openly about it?</p>
<p>In the midst of these thoughts, Christy remembered that she still had
to speak, even though right then she’d rather not exist. She took a
deep breath, hoping to inhale some courage along with the air. “I’m
honestly quite hurt that we’ve come this far. Joe and Kevin, you have
each other. Carlos, you have Jillian.” Jillian smiled coolly and waved
from her seat. “I have been dating this girl online for three months
now, and you know I have never met her in person. And now she’s giving
up her whole life, her ability to see her parents every day, to talk
to them in person, to live with me. I thought I lived in a place that
wanted me to have a partner, to be happy!”</p>
<p>Kevin, now wearing an overwrought but sincere frown, sympathetically
patted Christy on the back. “Come now. Carlos is just trying to do
due diligence. He’s not actually going to veto her. We’re just maybe
going to have to do another round of interviews. I know it’s annoying,
but we’ll get her here, I promise.”</p>
<p>Carlos looked sternly at Kevin. “Don’t trivialize this! Everyone has
a right to be comfortable with the people they’re quarantined with,
the people we spend all our time with, the people who we trust not to
get us sick. If I don’t feel comfortable with her –”</p>
<p>Joe interrupted. “Carlos, if I’d known you were looking
for more detail about how she’d contribute economically, I
would have asked her more relevant questions. And I’m sure
Leila’s right about the party. Perhaps we should just do
another round –”</p>
<p>“She could have brought it up!” Carlos said, measuredly
but twice as loud as Joe. “She could’ve explained when she
realized what it sounded like she was saying! Our goal isn’t
to bring her on board and lie to ourselves that she’s a good
fit. If I were Leader of this group, performing these interviews,
I wouldn’t prompt her to say the words to check a box, that’s
not how this works.”</p>
<p>Kevin gasped in slow motion. Even Jillian blinked. Everyone
had always been able to hear Carlos thinking about how much he
wanted to be Leader; it was an extremely loud thought. But this
was the closest Carlos had ever come to actually saying it. And
certainly the other Members could see the logic, and no one could
argue that he hadn’t earned it, or didn’t deserve it, on
some level. But on a level no one was really able to articulate,
no one really supported the idea either, and up to this moment,
Carlos had simply been biding his time as it gradually seemed
more and more inevitable.</p>
<p>But no one said anything, so Carlos continued. “And I think I
miscommunicated earlier. It’s not just what Petra did or didn’t say
that bothers me. Maybe she just used to go to a lot of VR parties. That
probably is what she meant. But that’s not what is really bugging
me. That, we could do follow-up research on. That, I’d do another
interview for.</p>
<p>“But her personality test – her personality test showed that she’s
an extreme extrovert. I know those types! They stretch every law and
regulation to go outside and meet strangers as much as possible. They
go grocery shopping just to talk to other people. They are responsible
for so much contagion.”</p>
<p>Christy hadn’t reviewed the personality tests and so she had never
learned this, but she didn’t think much of it. She could see how
someone could confuse Petra for an extreme extrovert; perhaps this was
a testing glitch.</p>
<p>“Hey, I tested as an extreme extrovert!” said Joe. “Are
you going to accuse me of endangering all of us?”</p>
<p>Too much was happening for Christy. Her girlfriend was potentially
vetoed; Carlos was making an active bid for the Leadership; and
now Joe, her friend, or rather, the older brother her birth family
lacked, had actually been an extrovert this whole time? Along
with Petra? The test was thoroughly broken. Or maybe the word
just didn’t mean what she thought it did. In any case, nothing
Petra had in common with Joe could possibly be held against her,
no matter how bad it sounded.</p>
<p>Carlos echoed Christy’s thoughts. “I’m not saying the test
is 100% –”</p>
<p>“But I am an extrovert. Being an extrovert is not a bad
thing!”</p>
<p>A few seconds of stunned silence passed.</p>
<p>Finally, Leila’s voice cut through from the
speaker. “Carlos! What on earth is wrong with you? Why are
you putting Christy through this?”</p>
<p>Carlos looked towards the speaker, then the screen, a little
flustered. He started to say, “I…” and then paused.</p>
<p>Leila seemed to take this pause as a moral concession. “You
want to veto Petra, and make an ass out of yourself as you do
it? Fine. I can’t stop you. But I can leave this group. Christy
and Petra and I can start a new one. I have my own apartment
already, and I have savings, you know. You think you’re so
special and so important just because you started a business
and –”</p>
<p>“I will not,” yelled Joe, “I will not see this group,
that I spent years bringing together, fall apart like this! We
are a family, people. And Christy does deserve happiness. And
Carlos, Carlos does deserve recognition for his activities;
we’d all be bankrupt if it weren’t for him. Leila, you only
have savings because of him. You know that.”</p>
<p>Christy knew that Leila didn’t see it that way. After all,
as Leila had told her many times, though never in Carlos’s
hearing, they all contributed to the business. Carlos just
contributed differently. But Christy saw what Joe was trying to
do, and waited quietly, realizing it might be her only chance
at Petra moving in, and thus at happiness.</p>
<p>Joe took a moment to catch his breath, and then continued. “So I’d
like to propose a compromise. We vote Petra in, and we vote, in the same
vote, to make Carlos the official Deputy Leader. Next time, Carlos,”
he said, looking over at him, “you and I design the interview process
together, and we work to build a fairer process. We also try extra hard,
in the future, to make a policy that favors romantic partners of existing
members, and has clear criteria for acceptance. Does that sound good to
you, Carlos?”</p>
<p>Carlos paused for a moment, and then nodded. “Yes, that sounds
perfect.” And everyone could hear the simultaneous thought: “Only
Deputy Leader?”</p>
<p>Christy felt her chest relax as all the tension escaped, creating an
embarrassingly audible, vocalized sigh.</p>
<p>“Alright,” said Joe, “everyone in favor of making Carlos Deputy
Leader and taking Petra on in our group, please say ‘aye.’”</p>
<p>“Aye,” said everyone in the group.</p>
<p>“And any opposed, say no?” Joe continued. No one said anything.</p>
<p>“Well,” Joe continued, “the ayes have it. Congratulations,
Christy! I look forward to meeting Petra in real life.”</p>
A Respectable Octopedian
https://www.thecodedmessage.com/posts/octopedian/
2020-05-14T00:00:00+00:00
<p>In front of Penny in line was a 7 foot tall humanoid with glowing
blue skin. She suppressed the urge to ask what species they were,
and let the alien order their vegan breakfast burrito. The barista at
United Planets’ first-floor Starbucks looked human except for the
extra hands. Polycherian, Penny remembered. When the barista handed
Penny her order – an egg and cheese sandwich on a bagel – Penny bowed
respectfully and said <em>pflintsu</em> – Polycherian for “thank you” –
before getting on the elevator.</p>
<p>Penny loved working at United Planets’ New York Headquarters for
the same reason she moved to New York City in the first place: the
diversity. No other job, no other workplace, could ever measure up –
not on Earth, anyway. As such, she smiled when the elevator door opened
and a three-foot-tall octopus-looking alien oozed in.</p>
<p>But her smile rapidly faded when she recognized this particular
Octopedian. It was the Octopedian Representative himself, Estramsor,
Deputy Leader of the Traditionalist Faction in the Interstellar Congress,
who had recently argued in front of the Congress to embargo
humanity, to quarantine them, to prevent the evils inherent in human
nature from spreading throughout the galaxy.</p>
<p>And so it didn’t surprise her that the Octopedian did not return her
greeting, but stared straight ahead. Penny shifted her weight back and
forth. If the Octopedian had succeeded, it would have ruined Penny’s
dream of travelling the universe, visiting other planets and learning
everything she could about alien cultures. Besides, it would cruelly
bottle humanity in with all of its flaws, never to grow or mature.</p>
<p>The elevator was taking a long time to reach the next floor, though. And
ultimately, it lurched to a stop.</p>
<p>“It’s stuck again,” said Estramsor, matter-of-factly, continuing
to stare straight ahead. “Typical human technology.”</p>
<p>Penny read the inspection certificate, which stated that this elevator
had been manufactured on Alpha Centauri, but didn’t say anything.</p>
<p>Estramsor looked over at Penny and started writhing his front tentacles
in a gesture that even Penny, who only knew a few Octopedians, recognized
as open disgust. He strutted into a corner and looked away.</p>
<p>Finally Penny felt compelled to speak up. “Even if you don’t like
humans, you could still treat us with common courtesy.”</p>
<p>“Courtesy?” asked Estramsor. “Consider it a courtesy that I didn’t
cover you in acidic slime for such an impudent statement.”</p>
<p>“Hey!” said Penny. “You’re on our planet! We are the ones who gave
you the courtesy of letting you work here and giving you this space.”</p>
<p>“Well,” he said, spitting up some black goo. “What a lot of good
that did. But this is my last day, I am dropping your disgusting planet
out of my portfolio. So I shan’t mind offending any humans on the way
out. I do have diplomatic immunity, so watch your mouth.”</p>
<p>Penny pressed the button for the next floor, then the open button, the
close button, and the call button harder and harder. When none of it
worked, she sighed and sat on the floor, waiting for something to happen.</p>
<p>After a while, though, she found she couldn’t stay silent
anymore. “You know what I don’t get?” She looked over at the
Octopedian, who did not react or move in any way. “All those things
you mentioned in your speech: War, and the glorification of war. Poverty
and starvation. Our inability to deploy medical resources to those who
actually need them. It was a pretty damning speech. I remember it. I
found it moving. I really felt ashamed to be a human.”</p>
<p>“Thank you for the compliment. I assure you, the speech was meant for
Congress, not for you.”</p>
<p>“But I’ve learned a lot since the embargo was voted down. For every
single thing you mentioned, every single flaw, humans are not
alone. There are a dozen other cultures and worlds, fully
represented in the Congress, that do those things.”</p>
<p>“Yes, that is what my opponents successfully argued. Do you have
something intelligent to say, human?”</p>
<p>Penny thought for a moment. She knew she’d never have an opportunity to
ask this question again, even if she did get acid burns for it. “So you
must have known that the embargo wasn’t actually going to pass. What’s
the real reason? What do you actually have against humans?”</p>
<p>The Octopedian’s translation device made a sighing sound, uncanny coming
from a face with a closed mouth, with no clear mechanism to make such
a mammalian noise. “You are, as someone who works in such an august
institution as this, surely aware that even non-obligate carnivorism,
while frowned upon by enlightened species, is fully allowed under our
legal system, certainly not a valid argument for an embargo. I am old
enough to remember when that wasn’t established law, and I tried my
hardest to make that decision go the other way.”</p>
<p>Penny nodded. Most intelligent omnivorous species were vegetarian;
that was Interstellar Multiculturalism 101.</p>
<p>“But then I found out that you ate eggs,” he said, and he lifted
up a small device and abruptly projected, onto the elevator wall,
a picture of an Earth octopus egg being harvested. Penny jumped. The
picture changed to a factory farm full of chickens, eggs collecting
beneath them. “That was additionally shocking. Even so, I thought I
could tolerate it, until I saw this.”</p>
<p>And there it was, an Octopedian eating an out-of-place,
Earth-style egg bagel sandwich.</p>
<p>The Octopedian looked at the picture, and all 8 legs shuddered. “Some
of our own religious authorities” – Penny remembered that the Octopedian
homeworld was a theocracy – “concluded that eggs of other species
were allowable food, even for us. I couldn’t argue this in front of
the Congress, but this heresy had to be cut off at the root.”</p>
<p>The elevator lurched and stopped. The door opened and the Octopedian left,
as Penny looked at the egg and cheese bagel sandwich still in her hand.</p>
All Rent Should Be Cancelled
https://www.thecodedmessage.com/posts/rent_pause/
2020-03-23T00:00:00+00:00
<p>Even early last week, before restaurants were closed, before we were
banned from unnecessary gatherings, when many people still had to go
into their office jobs, the bars were empty on my street. I walked
into one, ordered a cocktail, asked the bartender why it was so slow.
It was usually slow on Tuesdays, of course, but normally there was
at least one other customer. But the pandemic had already scared
everyone else away, and if it continued, the bar would surely have
to close.</p>
<p>Then, the bartender casually mentioned that, if they had to close for
a month, that would be the end of the business, because how would they
pay rent? This hadn’t occurred to me, but given the steep prices of
commercial real estate, it made sense. I began to fill with dread as
I imagined this bar that I loved closing not for the duration of the
pandemic but forever, and not just this bar, but any bar owned by local
businesspeople, any bar that didn’t have the vast resources needed to
just pay a few extra months of rent with no customers.</p>
<p>And now, all the restaurants are closed. Sure, many of them are making
money doing takeout, but some of them are primarily bars. Most of
them were selling an ambience as well as food, and none of them planned
on a takeout-only business. I am worried that, when all this is over,
half of the restaurants will be shuttered for lack of ability to pay their
leases – or from falling behind on other bills because all their money
went towards their leases.</p>
<p>Now, if landlords all behaved reasonably, I don’t think this would be
a problem. Normally, if a business fails to pay rent, that means it
should be replaced by a business that can. It’s not just a matter of
one month of lost income for the landlord; it’s also a signal that
more bad months are likely coming.</p>
<p>But in this situation, the signal would be all wrong. Whether a business
does well or not in a “shelter in place” economy has nothing to do with
whether it will do well when the restrictions are lifted. If a landlord
shut down a bar because it can’t pay rent now, what reason would they
have to believe that the business that would replace it would do any
better when things are back to normal? How long would they have to find
a new tenant? It’s in the landlord’s best interest to be understanding.</p>
<p>But unfortunately, I don’t think landlords will be as reasonable as they
could. Sure, many will be forced to let businesses miss a few months
of rent. But some of them will demand back payments, just because their
contracts say they can, and while some businesses might be able to handle
that, some will still close. Others might evict businesses that annoy
them, and use this as a legal excuse, and other landlords might just
not understand or decide to milk their tenants dry, even though it’s
ultimately not in their own interest.</p>
<p>So, in everyone’s best interest, we should cancel rent. The argument
I just made for businesses works for individuals as well – whether you
can pay rent in the next few months has basically nothing to do with
whether you normally can.</p>
<p>We should cancel rent, and not postpone it. Everyone can start doing it
again when the shelter-in-place order is lifted, and the money starts
flowing again, but only for subsequent months – no one should have to
pay extra to catch up for the months we missed.</p>
<p>After all, are you getting your full use out of your housing during
this month? Are businesses getting full use out of their land? I
live in NYC because of my job – but now I’m not going to it. Businesses
locate where they do to get customers – but now the customers aren’t
allowed to come.</p>
<p>We should cancel rent, not just evictions. Because with a moratorium on
evictions, landlords can still demand catch-up rent once that moratorium
is lifted. And many people will pay rent now when they actually can’t
afford to.</p>
<p>New York City has loans to small businesses, but that’s not enough. That
means they will still have to pay those loans back. Some businesses won’t
be able to afford to, and not because they were bad businesses, but just
because of the pandemic.</p>
<p>Unemployment benefits aren’t enough. The website is broken, and it
requires a bunch of bureaucracy. It also doesn’t cover a lot of
people who were self-employed, or working under the table. And unemployment
benefits for business owners aren’t enough to cover their business
bills.</p>
<p>And who knows if Congress will ever figure out basic income?</p>
<p>Cancelling rent, on the other hand, won’t require any paperwork besides
the original order. You just announce that rent is cancelled. Everyone,
instead of paying, doesn’t pay. No website, no bureaucracy, no worrying
about whether your claim is approved or not, or whether you meet
the arcane requirements for the program.</p>
<p>Such extreme times require extreme measures, not just in preventing
the spread of the virus, but in preventing economic devastation. I
not only want to save the lives of every New Yorker, but when this
social distancing is lifted, I want to be able to walk through the
city, and see it full of all the restaurants, bars, and shops that
were here before, every last one of them.</p>
<p>We’ve already put the state of New York on pause, including the flows of
money. Money flows in cycles, and rent is normally one part of a fully
functioning cycle. If we pause the rest of the cycle, we have to pause
rent too.</p>
<p>With social distancing, we’ve cancelled going out. We’ve cancelled
fun. We’ve cancelled millions of jobs. We’ve cancelled huge swaths
of the economy. We should cancel rent.</p>
Open Internet, Closed Web
https://www.thecodedmessage.com/posts/web/
2019-12-23T00:00:00+00:00
<p>The Internet promised — and still promises — a revolution in
democratic, decentralized, and open communications. And yet,
we see today a tech world controlled by a few central players,
as Elizabeth Warren promises to break them up and Congress summons
Mark Zuckerberg to explain his company’s role in privacy-violating
election-manipulating foreign conspiracies. But Presidential use of
anti-trust laws and new Congressional regulations of social media
won’t address the more fundamental issues: The Internet is now
structured, on a technical and social level, so as to naturally
encourage centralized monopolies.</p>
<p>To explain this, we’ll first have to explain some terms. In common
parlance, the terms Web and Internet are used interchangeably, but
technically they refer to different elements of what now looks like
a single system. The Internet refers to the single global connected
network, and technologies that allow any computer on it to connect
to any other computer on it — but without saying much about what
the connection looks like. The Web is but one way of communicating
information over the Internet, where you use a browser to access
“websites,” but other ones exist: for example, online video games
don’t generally use the web to sync data between players. Examples
are easier to find as we go back in time: the stand-alone AOL instant
messenger app did not use the web, and neither did old-fashioned
e-mail clients like Outlook or Thunderbird, or BitTorrent and other
torrent clients.</p>
<p>What makes the web so different that it has eaten up these other
services, so that we now do our movie-watching, our chatting, and our
e-mailing in the web browser?</p>
<p>The web started out as a way of posting content — you would enter your
URL, which identified what server (or publicly accessible computer)
you wanted a webpage from, and what page you wanted. The browser would
send a request to the server, and it would send you back the page at that
URL, likely either an article, or a directory of articles. These pages would
have text and possibly embedded images, and could link to each other
by specifying another URL to go to. The original concept of the web would
have included sites like magazines, and envisioned sites like Wikipedia,
but would not have been able to support e-mail or a chat app or a social
media platform like Facebook.</p>
<p>The web was just the “public content” protocol alongside other
protocols, and similar to them. You could choose your own browser,
Netscape or Internet Explorer, and access the same web pages, just like
you could choose your own e-mail client, Outlook Express or Eudora,
to access the same feed of e-mail. The software was installed on your
computer, and what you accessed through it was content, and that content
all was for you to read.</p>
<p>Gradually, however, this changed as the web became more flexible. CGI
allowed forms on websites to connect to programs that would be run
in response on the server. Java and Flash and ActiveX allowed you to
embed programs in your website — programs that you would not download
and run on their own, but that came with the page and acted as if they
were part of the page. And gradually, JavaScript, originally used to
validate forms before they were submitted, or to do simple animations,
became powerful, as browser vendors competed to make it run fast, and
as it gained more capabilities.</p>
<p>When you go to Facebook, you are not reading a page that someone posted
there; you’re not accessing “content” in the traditional sense. What
you are doing is downloading, on the spot, a large application. Not
only is the content sent over the wire — the statuses, the comments,
the pictures, the lists of people who like it — but, inseparably from
it, we are sent the software that is used to process the content, the
application used to enter it and generate it, sent to run in the browser
every time we type in “www.facebook.com”. It is only through the
lens of that JavaScript program that we can access the content itself.</p>
<p>Indeed, every time we go to a modern website, especially one by a major
tech company, we load a fresh program into our browsers. No longer are
browsers just renderers of pages stored on servers, they are platforms
where programs run, where the programs are written not for Windows or
Mac or Linux, but for the web browser, now typically for Google Chrome,
which has become an operating system unto itself.</p>
<p>Why does this lend itself to monopolization and privacy problems? For
one thing, the web lends itself to an integration of frontend or client
code, which runs on your computer, and backend code, which runs on a
server. With a non-web protocol, you can use many programs to access
the content on a specific server: different e-mail clients for the same
provider, different clients for the same torrent. You can also combine
multiple e-mail providers or torrents in a single window. With the web,
you go to the server, and you are provided with the client program to
access the services it provides. You can’t take the Facebook JavaScript
code and point it at Twitter, nor can you expect your own custom Facebook
app to work.</p>
<p>Imagine how a social network like Facebook might work if it were conceived
of outside of the web. There might be a standardized protocol (say SSP
for Standard Social Protocol) and multiple packages of client and server
software. A school or a church or another community stakeholder might
run their own copy of the server software, and you might have accounts
on multiple such servers. All the status updates could be aggregated
together in a single feed, and you could configure settings to indicate
which servers your posts went to. Perhaps you could have “friends”
at a server you don’t subscribe to, and specify both their username
and what server they use (with an at-sign, like <a href="mailto:eric.smith@cornell.edu">eric.smith@cornell.edu</a>),
and the servers could sync with each other so that you could still see
their posts.</p>
<p>Who would pay for all this software to be written? The software would
be sold to you like Outlook was, or perhaps open source packages like
Thunderbird (Mozilla’s e-mail client) would arise. And who would pay
for the servers? Your school, workplace, ISP, or community, and probably
you could sign up for a public ad-supported or for-fee service.</p>
<p>And in this model, if you control your client social software, you could
have any strategy for what statuses it shows you and what doesn’t,
rather than Facebook’s algorithm deliberately designed to addict
you. You would be able to pay for the service rather than be thrown into
a huge advertising pool.</p>
<p>It’s also fundamentally less monopolistic. You could imagine that
someone, instead of using a standardized protocol, released a single
client and sold the server software. Other companies or open source
communities would soon make compatible software, and since the network
of interactions was already decentralized, using those compatible systems
would not prevent you from interacting in the same community, as happens
to alternatives to current social networks.</p>
<p>Of course, they could also try harder, and force you to use their
server, and release a single client, like AOL Instant Messenger
did. But then programs like Pidgin came to aggregate that and other
messenger clients, so that you could talk to contacts on different
messaging systems in the same app.</p>
<p>This type of social network, which is known as a federated social
network, isn’t an unachievable dream. E-mail used to work this way
before Gmail gobbled it up, and still does theoretically: that’s
why there’s an @-sign in e-mail addresses, to indicate which of
many compatible servers you have an account at. Social media used
to work like this, too: You could be a member of many listservs
or newsgroups, and it would be handled through a single e-mail and
newsreader app. Messaging doesn’t work this way, even today, but
there is a protocol out there that would work like that, called XMPP:
It simply never caught on.</p>
<p>There even exists software, like Mastodon, and a protocol like our
hypothetical SSP, called ActivityPub, that does exactly what I just
described. But Facebook, Twitter, Reddit and similar sites have stolen all
the actual user-base. A social network, of all things, needs a certain
critical mass before anyone can really get good use out of it: Facebook
was very useful when everyone in your college was socially obligated to
have it, less so when you have a niche social network only used by open
source enthusiasts.</p>
<p>Before we talk about how or even whether we can or should turn
the tide on this, I’d like to point out a side issue: Mobile. On
iOS and Android, you do download individual client apps. But most
of the time, we use the same model: You use the Instagram app to
connect to Instagram services, and you use no other app for those
services. WhatsApp messages stay on WhatsApp servers. If it’s the
technological layout of the web that makes for this business model,
why has it carried over into mobile?</p>
<p>I remember being excited when mobile came out for the comeback of the
standalone application. There are multiple Twitter apps available,
all posting to the same service and accessing, differently, the same
content. But it hasn’t led to a return to a more federated model
for new software, or openness in general.</p>
<p>There are a few reasons for this. One is that, by the time mobile platforms
started gaining steam, the web revolution had already mostly run its
course. We’d gotten used to that business model. The assumptions
that are built in to how the web works — that you would get your
client software from the company that also provides the only server
it works with — those assumptions had become entrenched enough that
a different technology landscape didn’t overcome them.</p>
<p>Another is the closed nature of both major mobile platforms. It is
very annoying to put an app on the app store. It is annoying to write
one — historically, it was a quite constrained platform. Apple can
and will reject you arbitrarily. It increases the barrier to entry,
so that established companies have a huge advantage.</p>
<p>But the biggest reason, in my opinion, is that the mobile world
and the web world are too entwined. Not only do we expect to use
many services from the phone on the computer as well, where the
web dominates, but the platforms use the same servers and often,
the same frontend code. It is relatively easy, and commonly done,
to use some or all of the code that normally runs in a web browser,
and instead run it in a browser engine embedded into a mobile app.</p>
<p>So the pattern set by the modern web is deeply entrenched. The end
result is a computer as an endpoint for service. Rather than as a
tool we control and use directly, it is an adaptable terminal that
we use to enter into corporate-controlled environments, where people
make their livelihoods and run their social lives, but the rules can
change at the companies’ whim.</p>
<p>So how do we return to a locally controlled system again? Anti-trust
and regulation aren’t enough: that’ll simply change which companies
we do the interactions with. Getting rid of the web isn’t feasible
and probably still wouldn’t be enough — we’ve thoroughly convinced
ourselves by now that this is how computers are supposed to work.</p>
<p>We need to build an alternative. We need a complete suite of software
that meets all the needs that websites currently serve, but which
does not rely on the same level of centralization. This requires a lot
of work, and while open source software can spontaneously and freely
arise as collaboration between companies when technical concerns are
at play (Linux, compilers, libraries), when it comes to polished and
well-designed products, that usually requires more explicit funding.</p>
<p>So if I were someday, somehow elected president, I would not only
carry out Elizabeth Warren’s noble anti-trust plan. I would also
fund a government program to give grants to build open source software
that could be used this way, with a mission of re-building a computer
culture that doesn’t rely on the same level of centralization and
corporatization. This would be an effective use of tax money, because
what differentiates software from other products is that, once created,
software can be duplicated and re-deployed without any natural cost.</p>
<p>And federated social networks would be a small, relatively
unimportant part of it. What if craftspeople could easily sell
directly to consumers, rather than listing on Etsy? What if cab drivers
didn’t have to sign up for apps that take giant cuts for doing very
little? What if we had time logging and vacation tracking software for
our small companies that actually worked? What if someone didn’t feel
like they had to buy an iPhone so they could FaceTime their family,
but could feel confident using whatever phone they wanted?</p>
Just Jump
https://www.thecodedmessage.com/posts/bar/
2019-10-10T00:00:00+00:00
<p>Kayleigh needed a break from work.</p>
<p>When you need a break from work, sometimes you go to the bathroom. Sometimes you stop by the
coffee machine, chat with a colleague while it brews. And sometimes, you straight-up leave
the office and walk to a nearby bar. Today, Kayleigh found herself taking that last option.
She didn’t normally do this — she felt that, as the boss, she had to hold herself
to a higher standard than anyone else, and drinking before the end of the workday was
against policy. But today — well, she figured she just really could use a drink.</p>
<p>Kayleigh looked over at the bar. Marble-colored, when she would have preferred a more wooden,
Irish-pub kind of vibe (at least for this situation), but again, you only have so many options
within a quick walk from work at 3PM on a Tuesday. The bartender had a clean look about him,
short-trimmed but still substantial beard, tan — maybe Greek or Lebanese — wearing a black
vest with a colorful bow tie that was definitely part of his personal style, Kayleigh thought,
rather than any sort of uniform requirement. Kayleigh herself was dressed in a tailored
button-down, a vibrant blue tie that she was told brought out her eyes, and a vest; she liked to dress
up for work, especially on a day as important as today was to be.</p>
<p>She walked over to the bar and sat down. The bartender looked at her and smiled, as he dried
a glass in his hands. He set the glass down, and asked, in a cheery, light tenor voice, “What are
we having today?”</p>
<p>Kayleigh thought a second. “Hmm.” What would be a good drink to get at such a venue?</p>
<p>“Could I have an Old Fashioned?”</p>
<p>“Can do!” said the bartender, with a slightly out of place level of enthusiasm. “We make them
good here, I promise you.” The bartender paused a second, and then asked, “What sort of
whiskey do you want with it?”</p>
<p>Kayleigh hadn’t considered this. She looked at the wall with the whiskeys on it, and then said,
“Oh, why the hell not? Why don’t you put in that 18-year Macallan.”</p>
<p>The bartender’s attitude shifted a little. “Look, I don’t really think you want that. First off, I think
it would cost over $100, though I’m not exactly sure. Second, I’m not sure it would even go
well. Third, I mean, like…”</p>
<p>While he was talking, Kayleigh reached into her wallet and pulled out two very fresh-looking
hundred-dollar bills, put them down in front of her, and said, “Make it as carefully as you can.
This might be my last drink.”</p>
<p>The bartender broke eye contact and simply grunted acknowledgement. He looked visibly
uncomfortable as he made the drink, looking back at Kayleigh several times as Kayleigh
blankly stared ahead, still standing. When he got back, he asked, “So, this may be your last
drink, you said? You quitting? Is it a health thing?”</p>
<p>She continued to stare. “Sorry,” continued the bartender, “I know that might be a personal
question…”</p>
<p>“It’s nothing like that,” she answered. Then she looked the bartender directly in the eyes, her
bright blue eyes drilling into his brown ones. “Would you believe me,” she said, in an almost
sing-song, playful, tone, as she leaned to one side and smiled, “if I told you, that it was a
super-clandestine top secret government spy project?”</p>
<p>The bartender chuckled, and made to walk away, but Kayleigh wasn’t done. “Do you believe in
souls?” she asked, and she sipped her drink.</p>
<p>“Souls? Hmm…” responded the bartender.</p>
<p>“Because you see, I’m an atheist,” Kayleigh said.</p>
<p>“Seems reasonable. Most people I know are.”</p>
<p>“But my wife is a Christian,” Kayleigh continued.</p>
<p>“That seems stressful,” the bartender said, neutrally.</p>
<p>“No! We have a very happy marriage!” Kayleigh responded.</p>
<p>The bartender resigned himself to having a longer and certainly more intense conversation than
he had anticipated. He turned to fully face Kayleigh again, put his hands on the bar, and
smiled. “I’m sure you do,” he said, as inoffensively and earnestly as possible.</p>
<p>Kayleigh continued, “So, as I said, I’m an atheist, but my wife’s a Christian. No big deal most of
the time, we have a very happy marriage. She goes to church, has a couple church friends, I
tag along every once in a while, or I sleep in on Sundays, or even get work done. Most of the
time, it doesn’t come up.”</p>
<p>“But where it does come up, see, is that she believes in souls. She believes that each of us has
a soul. And it’s started to make me wonder, you see, if I have a soul.” Kayleigh paused, and
became thoughtful-looking again.</p>
<p>“Ah,” said the bartender. “So you’re thinking about converting. And, um, giving up drinking too,
then?”</p>
<p>Kayleigh blinked. “No, none of that, she would probably be more confused than anything else.”</p>
<p>“And what’s she do?” asked the bartender. “Does she have a very, er, spiritual line of work,
too?”</p>
<p>“That’s the thing!” Kayleigh said, more enthusiastically than expected. “She’s a freaking
neuroscientist. If anyone should have very clear reasons not to believe in souls, it’s one of
those. Her colleagues really don’t understand it, some of them have even told me so.”
The bartender nodded.</p>
<p>“But that’s not what’s important here,” Kayleigh continued. “Do you know about Star Trek?”</p>
<p>“I saw an episode or two,” responded the bartender.</p>
<p>Kayleigh said, “So the teleporter, where it takes you apart and puts you back together
somewhere else, you remember that? Now, isn’t that like killing someone and then building a
new person? Or do they have a soul, outside of them, that isn’t attached to a particular
location?”</p>
<p>“I don’t think I remember it quite like that.”</p>
<p>Kayleigh resumed, “When I was a little girl, my father built a treehouse
for my brother and I. And one day he — I mean my brother — started
jumping straight from the treehouse to the ground. He’d always land
fine, and I was nervous to do it — which was quite an embarrassment
for me, because, you see, I was the older sister.</p>
<p>“And I’d get right up to the end of the platform, and I’d not be able to jump.”</p>
<p>The bartender nodded, confused about the conversation but on more familiar footing now. “I
went bungee jumping once. You just have to do it. Once you do it, it’s suddenly fine.”</p>
<p>“That’s right,” Kayleigh said, nodding. “So finally I did. And I instantly regretted it, but I was
already on the way down. And afterwards, I was fine. I was completely fine.</p>
<p>“And when we went to sleep, you know, they had this prayer they used to teach us. It was a
little poem.</p>
<p>“Now I lay me down to sleep<br>
I pray thee, Lord, my soul to keep<br>
But if I die before I wake<br>
I pray thee, Lord, my soul to take.</p>
<p>“I went through a phase where I’d be terrified to fall asleep. I’d just be terrified. What if I died in
my sleep? But I always fell asleep, and when I woke up the next morning, I’d be fine. Like my
cat. Although, I think, I think I killed my cat…”</p>
<p>Kayleigh was crying at this point. Tears were streaming down her cheeks. She had barely
touched her drink this entire time. The bartender was completely flummoxed, and looked
around, trying to see if anyone else needed anything. Everyone seemed to have drinks, and
one of the couples was now assiduously making out. He could deal with that later. He turned
back to his confusingly distraught customer.</p>
<p>Kayleigh sighed, and reached into her bag and placed two small, black boxes on the table. She
opened one of them. “You want to see what my company built? What I dedicated my life to
research and program and build? The secret government project I led?”</p>
<p>She didn’t wait for an answer, opened one of the boxes up, and lowered her drink into it. “You
take something apart at the source, just convert all its atoms into a digital signal.” She closed
the lid and pressed the button. “Then, on the other end, new atoms and molecules are built
exactly the same way, but they’re new, different atoms and molecules, built out of the air
around the receiver, just arranged according to the digital signal.”</p>
<p>At this point, she opened up the other box, and lifted her drink out of it.
“Cool magic trick,” said the bartender. “My nephew can do that one too, I think. Maybe not
with a drink though.”</p>
<p>“This isn’t a magic trick,” said Kayleigh. She looked around quickly. “It’s an actual teleporter, I
already put my cat through one. And, my cat was destroyed. Turned into a digital signal. It
died. But then, on the other end, he was completely like normal. It’s my turn next. I guess I’ll
die. The current me will be destroyed. But the version of me on the other side won’t think like
that, I guess. It won’t care. It will be a person, I think, but will I die, or will I feel like I’m
jumping? Will that new person actually be me?”</p>
<p>“My wife says it’s OK. My wife says my soul isn’t in the actual atoms, but something about the
structure of the atoms. Then she started talking about philosophy and Platonism and, you
know what, I didn’t understand any of it. I’m a practical woman, you understand. But I led this
project, and I have to do it. Souls or no souls, death or no death.”</p>
<p>At this, she downed her entire drink, and slammed it on the table. She exhaled loudly, made
a slightly awkward fist gesture, picked up her machine, and walked into the bathroom.</p>
<p>Fifteen minutes later, she hadn’t come out of the bathroom. The bartender walked over, knocked
on the door. No response. He tried the knob — it was still locked. He was trying to get Kayleigh
to respond when he saw her walk into the bar.</p>
<p>“It was just like you said,” she said to the bartender. “Once I jumped it was fine.”</p>
The Haskeller's Hungarian Notation
https://www.thecodedmessage.com/posts/hungarian/
2019-08-11T00:00:00+00:00
<p>When I was first learning to program, a long time ago, it was in
BASIC, and you had to annotate your variable names to indicate what
type something is. <code>foo</code> would be a number, whereas <code>foo$</code> would be a
string. This meant that there could only be as many types of information
as there were symbols to put after your variable, but that was okay for
the sort of programming BASIC was used for. These were called sigils,
and they helped you keep straight in your head what was going on,
and made it easier for the computer too. Any aggregates had to be
explicitly declared.</p>
<p>Later on, I learned Perl, which had a similar system, but with a twist.
A variable named <code>$foo</code> could contain a number or a string — or even
some sort of object or reference — but it could only contain one
of them. It was a “scalar.” <code>@foo</code> would contain many scalars with
indices in an array, and <code>%foo</code> would contain many with string or other
keys in a hash map. The computer kept track, dynamically, of the
practical types of the scalars, and could easily do the same
for the aggregate types, but chose to instead enforce a mechanism
where the programmer would be reminded of whether it was a single
value or some sort of aggregate that was being discussed.</p>
<p>In Haskell terms, BASIC had you use sigils for data types, but Perl
had you use sigils for functors. And not to make people too upset by
comparing Haskell and Perl, but Haskellers regularly do the same today,
voluntarily annotating variable names with the functors by convention.
For example, <code>dmdMenuItems</code> might translate, in a Reflex codebase, to
<code>Dynamic</code> of <code>Maybe</code> of <code>Dynamic</code> of list of <code>DomElement</code>.</p>
<p>The usage originally struck me as quite strange, and I didn’t like it.
I remember thinking the original Hungarian notation was redundant:
<code>int iFoo;</code> literally says <code>int</code> right before it. And besides, wasn’t the
point of a type system to not need extra mnemonics, because the compiler
will stop you from messing things up?</p>
<p>At my previous job, we used prefixes like <code>m_</code> and <code>g_</code> in C++ to
indicate scope (member variable/field and global, respectively), and it
similarly took me a while to adapt. In those situations, it turned out to
help because the sigils told you where to look for more information. If
there wasn’t a <code>m_</code>, you looked in the same function, but otherwise you
had to immediately go to the class declaration. But that wasn’t the
only advantage. What scope something was in was important in how you
treated the variable, in many subtle ways that would be bad to confuse,
and which the compiler in C++ wouldn’t really help you with.</p>
<p>Similarly, in Haskell, indicating what <em>functor</em> something is in tells you
something important: What kinds of things can you do to get a regular
value out of it? Do you need to provide a default value (<code>Maybe</code>), or
only provide it to versions of functions adapted for it (<code>Dynamic</code>),
or perhaps just keep the functor around while transforming the
values inside (<code>(<$>)</code>, <code>(<$$>)</code>, or <code>(<$$$>)</code>, depending
on how many functors there are)? And while the compiler will
help us with this, it’s something that’s convenient to see all the
time, and the type of each individual variable is sometimes
inferred and never immediately visible at every usage.</p>
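<p>To make the layering concrete, here is a minimal sketch using only <code>base</code>.
It assumes a local definition of <code>(<$$>)</code> as <code>fmap . fmap</code>; where that
operator comes from in a real codebase will vary, and the names <code>mFish</code>
and <code>mlFish</code> are made up for illustration.</p>
<pre tabindex="0"><code class="language-haskell">module Main where

-- Assumed local definition: map through two functor layers at once.
infixl 4 <$$>
(<$$>) :: (Functor f, Functor g) => (a -> b) -> f (g a) -> f (g b)
(<$$>) = fmap . fmap

-- "m" prefix: a Maybe of a fish name (hypothetical example data).
mFish :: Maybe String
mFish = Just "trout"

-- "ml" prefix: a Maybe of a list of fish names (hypothetical example data).
mlFish :: Maybe [String]
mlFish = Just ["trout", "carp"]

main :: IO ()
main = do
  -- One prefix letter, one functor layer: (<$>) keeps the Maybe around.
  print (length <$> mFish)
  -- Two prefix letters, two layers: (<$$>) keeps the Maybe and the list.
  print (length <$$> mlFish)
</code></pre>
<p>The number of prefix letters on the name matches the number of <code>$</code> signs
needed in the operator, which is the kind of at-a-glance information the
convention is meant to provide.</p>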
<p>And when we do write the pure function or the lambda or the <code>fromMaybe</code>
or the <code>dyn_ $ ffor ...</code>, what do we name the resulting variable? Many times
we have many variables with the exact same semantic role, the only
difference being what functors they’ve been wrapped with. We want to say
<code>ffor dSelectedId $ \selectedId -> ...</code> or
<code>fmap (\number -> number + 1) eNumber</code> or
<code>let fish = fromMaybe defaultFish mFish</code>. The alternative is, what,
judicious use of <code>'</code> for the different but analogous variables? The
difference between these variables, intuitively, is how wrapped up
in functors they are, and that should also be the difference in
their names.</p>
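<p>As a small sketch of that pairing, again using only <code>base</code>: the wrapped
value carries the prefix and the unwrapped value gets the bare name. The
functions <code>describeFish</code> and <code>increment</code> and the value of
<code>defaultFish</code> are assumptions invented for this example.</p>
<pre tabindex="0"><code class="language-haskell">module Main where

import Data.Maybe (fromMaybe)

-- Hypothetical default, used when no fish was supplied.
defaultFish :: String
defaultFish = "goldfish"

-- mFish is the Maybe-wrapped value; fish is the same thing unwrapped.
describeFish :: Maybe String -> String
describeFish mFish =
  let fish = fromMaybe defaultFish mFish
  in "Today's fish: " ++ fish

-- The same pairing with a lambda: mNumber outside, number inside.
increment :: Maybe Int -> Maybe Int
increment mNumber = fmap (\number -> number + 1) mNumber

main :: IO ()
main = do
  putStrLn (describeFish Nothing)
  putStrLn (describeFish (Just "trout"))
  print (increment (Just 41))
</code></pre>
<p>Without the prefix, the two bindings in <code>describeFish</code> would be competing
for the same name, and one of them would end up as <code>fish'</code>.</p>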
<p>And I’ve decided this is a good thing. Conventionalized terseness is the least
problematic type of terseness. Single-letter abbreviations are
great if they communicate information efficiently and everyone
agrees on what they mean. I’ve seen <code>dyn</code> and <code>may</code> as well,
and I prefer <code>d</code> and <code>m</code>, as they are easier to stack up without
getting too unwieldy, and besides, <code>dyn</code> is used for functions and
<code>may</code> is also a verb (does <code>mayFish</code> mean something that’s a <code>Maybe Fish</code>
or a boolean about whether you are permitted to fish?).</p>
<p>And so, in spite of my initial skepticism, I’ve come to like this
naming convention, and I recommend it to all of you as well.</p>
The Letter from the Trees
https://www.thecodedmessage.com/posts/roots/
2019-07-22T00:00:00+00:00
<pre tabindex="0"><code>ENVELOPE HEADER:
Date: January 5, 2027
To: Rachel Friedman, President of the United States and Leader of the Free World
From: The Roots of the Great Trees of Galaxy-Wide Civilization
Subject: An Offer, an Apology, and an Explanation
</code></pre><h1 id="the-offer">The Offer</h1>
<p>In the name of the One Almighty God: in the name of the Many Stars
through which God is made manifest, in the name of the manifestation
of God you call the Sun, and in the name of the Original Star from Before Time,
we offer you peace, not of a lack of conflict, but of a mutual growth.
As branches must look to the vine for sustenance, so must you look
towards us, as your own scriptures say, being a reflection of the truth.</p>
<p>To re-state in a more secular fashion: We are offering your species an
alliance with our species, and an entrance into a great alliance that
currently covers every intelligent species in the galaxy besides yours.
In exchange for a price, which we can negotiate, you may also travel
cheaply and quickly between all worlds of this galaxy, using technology
that our species, the Great Trees, exclusively control but offer to all
intelligent species. Know that no species, when given this offer, has
ever refused it before, and we have no reason to expect your situation
to be any different.</p>
<h2 id="the-benefits-of-our-offer">The Benefits of Our Offer</h2>
<p>It is an unfortunate fact that in many species, experience with evil has
led to a distrust of well-intentioned offers. Therefore, let us explain
first why our offer benefits you, as proof of our good intentions.</p>
<ul>
<li><em>Rapid interstellar transportation:</em> Please see the appendices B-E for
the scientific details: just know that the economic boom in practical
science from merely having read our high-level explanation should
serve as a goodwill token.</li>
<li><em>Trade:</em> This follows as a corollary from the previous point, but bears
re-emphasis. Technology of other species might solve all of your
species’ most pressing problems.</li>
<li><em>Unprecedented cultural interchange:</em> Though you are greatly advanced
in some fields of art, other species are advanced in forms that you
might not have acknowledged as art. Perhaps it is for this reason that God
has ordained many species to exist!</li>
</ul>
<h2 id="the-terms-of-the-offer">The Terms of the Offer</h2>
<p>Our technology comes at a price. We are not communists, nor are we a charity.
This price is negotiable, and we concede to you the right to make the opening
move in this negotiation, as you have the disadvantage of being the less
advanced species and the species caught off-guard by the offer.</p>
<p>However, it is unfortunate that we must mention that there are certain
non-negotiable requirements. These are part of the price (in one way of
thinking); or, to a higher way of thinking, these are moral necessities
that must be addressed before we can contemplate any settlement with
you.</p>
<p>The non-negotiable requirements are as follows:</p>
<ul>
<li>
<p>You shall immediately become what you call “vegan”. Animal
intelligences are still intelligences, and you must not appropriate
their bodies, their reproductive materials, or their other products
for food.</p>
<p>After some debate, we have determined that “honey” is to be included
in this (see Appendix I). It also will soon become unnecessary.</p>
</li>
<li>
<p>You will immediately prioritize a verification that the plants you eat are
not in any way intelligent, including at a genetic level.</p>
</li>
<li>
<p>You will pay appropriate prices for proper scientific aid and guidance
for developing the technology to consume only non-alive food sources.</p>
</li>
</ul>
<h1 id="an-apology">An Apology</h1>
<p>The reason you are the only unaligned intelligent species in our galaxy is not the
simple one that your science fiction speculates most about: It is not
the case that you have only recently achieved some formal definition of
intelligence, nor that we have only recently learned of your existence. We
have studied you in detail for quite some time and have entire departments
at a few of our universities dedicated to understanding you well, and our
linguistics, being far superior to yours, makes miscommunication unlikely.</p>
<p>It is also not, as others amongst your writers have speculated,
because you are the most warlike or otherwise morally corrupt of all species
— though this is in fact closer to the truth. You are, after all,
not culpable for your habits, not having ever been taught better.</p>
<p>Until this time, we have shunned you for a very simple reason. It is
distressing enough to us that you must consume other living creatures for
your sustenance, but clearly this is biological necessity and excusable
— and also a situation that can of course easily be remedied through
science.</p>
<p>It is shocking enough that, even though you are not obligate carnivores,
you persist in eating meat — though of course the issue is the
intelligence of the food, not the biological categorization — but this
is not unprecedented among our allies, and simply comes from having been
literally misguided, or rather, not guided at all. Moral instruction
can fix this one.</p>
<p>But that you eat the means of reproduction of other species —
specifically, those items you call eggs and milk — crosses the line
from misguided into disgusting, repulsive, viscerally upsetting, not
only to us but to the vast majority of our allied species. We were
concerned that, even if aided scientifically and guided to moral truth,
these habits were indicative of a deeper sin within your species.
And furthermore, to be frank, we were worried that even if you did
properly reform yourselves, the disgust would linger, knowing
your history, so that we would not be able to deal with you
as intelligent beings, as persons, after the alliance, in
such a way that could destabilize us.</p>
<p>But this is in actuality our sin, not yours. The moral culpability is no
stronger than that of other carnivore species. The Party of the Includers
were right then, deftly distinguishing “misguided” from “sinful,” and we
celebrate their wisdom regularly with a great feast. The new Party of
the Includers are right now, distinguishing “disgusting” from “sinful”
and reminding us that our intolerance does not imply your intolerability.</p>
<p>And so, after a deep and traumatic political realignment, a new
government, a re-rooting to the literal deepest level of the Great
Trees, we repent of our bigotry. We will offer you the same terms as
every other carnivorous species. We will allow you to be like us,
and derive your sustenance not from the destruction of intelligence,
nor even from the consumption of life, but directly from God through
the Stars and their Light.</p>
<h1 id="an-explanation">An Explanation</h1>
<h2 id="ourselves">Ourselves</h2>
<p>We, the Great Trees, are the most technologically, socially, and
civilizationally powerful species. We are the only intelligent species
that naturally derives its sustenance from the light of Stars, the only
“autotrophs” or “heliotrophs” as you call them, and we grow in the dark
parts of the solar systems, where there are no planets to cover with
our shadows nor benefits to the light. We grow, but we do not eat: We
derive our matter from non-living dust, and our energy from that
which God through each Star dispenses freely. We need no planet to root
us, unlike the other species we have encountered.</p>
<p>As such, we are the only species that is morally perfect by gift of God,
and it is our duty to bring this perfection to all species. The original
Party of the Excluders thought that other species were not part of
God’s plan, and to be eliminated, but to eliminate other life would
be counter to what made us, as non-consumers of life, Chosen, and
so we have repented of that viewpoint. If it is God’s providence
to make only us perfect, and to make other species perfect through
us, who are we to question? The creature should not question the
Creator.</p>
<p>We will not even say that we are greater for having received morality
directly, and other species lesser for only having received morality
indirectly through us. The joy of the other species is distinct
from ours, and it all forms a great pastiche. The creature should
not question the Creator.</p>
<p>It is only fitting that, as the root of the morality of all species,
we should also be the root of their technology. If there is danger
of pride in this, it is easily avoided, as these technologies were
also only given by grace of the creator. The creature should not
question the Creator.</p>
<p>Details sufficient to detect our presence within this solar system
are attached in Appendix A. Perhaps ironically, and perhaps as a sign
from God that your position should be accommodated, the solar system in
which you reside is also the capital of our civilization. Your presence,
as the sole unintegrated species, has long been an embarrassment to us.
Why has God chosen to test us with the hardest test near our most precious
Star? But the creature should not question the Creator.</p>
<h2 id="your-history-and-chosenness">Your History and Chosenness</h2>
<p>Among your various peoples, the Jewish people have the tradition of being Chosen.
This was accurate in the time in which the decision was recorded:
God had in fact Chosen them. This is because they, among all the nations,
devised laws for the protection of the animals as they were eaten. The
additional rule that milk not be consumed alongside meat was also a
mitigation of the obvious fact that milk is disgusting, and contributed
greatly to their Chosenness by God. Do not misunderstand: This is the
primary reason for which they were Chosen, and to the extent that
they do not continue along that path, they have lost their Chosenness.</p>
<p>For example, Judaism has retained some of its Chosenness, whereas
Christianity and Islam have lost it. In the modern day, however, it
is rather the Vegans that are the most Chosen (though they do not form
a people in the traditional sense). The Jains, who have taken Hinduism
to its logical extremes, are especially Chosen.</p>
<p>But enough of this! It is no matter! We are extending our own Chosenness
to you! Rejoice, and join the rest of the galaxy! Chosenness need not be
distinctness. We all desire to be Chosen at the expense of the unchosen,
but the creature should not question the Creator.</p>
<h2 id="students-and-teachers-questions-and-leadership">Students and Teachers; Questions and Leadership</h2>
<p>The creature should not question the Creator. On the same note, however,
the students should not question the teachers that the Creator has
appointed for them. Our experience of other species and our research of
yours shows that sometimes, students can be unruly.</p>
<p>They question our requirement of vegetarianism, even if, soon enough,
foods will arise that imitate meat in nutrition and taste (a dubious
value, but one nevertheless easily accommodated).</p>
<p>They question our moral superiority, despite its evident reality to
any creature with a conscience.</p>
<p>They question the very existence of God, despite the Stars as God’s concrete
manifestation, and, on the abstract level, of God’s logical self-evidency.</p>
<p>It is your role as leader to prevent, contain, and counter this
inappropriate questioning. It is disappointing that you have shown so
little promise until now in organizing your populace.</p>
<p>We have noticed that your title is “Leader of the Free World.” We would
greatly prefer to deal with a “Leader of the Species,” or, as those who
believe themselves to be the only inhabited planet would have it, the
“Leader of the World.” Alas, in your case we cannot.</p>
<p>Perhaps your inability to become Leader of the World is in the title
itself; perhaps it is due to your insistence that the nations you lead
be free. You subject yourself to an official term limit, and position
your nation as one among many equal nations in an alliance. This is not
an effective way to be a leader nor to have an alliance. A leader is a
representative of God: Our moral scientists can verify for you
(and a proof is included in Appendix F) that your concept of
“popular sovereignty” is not only logically incoherent but morally
depraved.</p>
<p>This is only speculation. Your species’ moral failings fascinate
the minds of many of the greatest scholars. We trust you to address
them in appropriate fashion.</p>
<h1 id="following-up">Following Up</h1>
<p>We do not presume to do your job for you. Our greatest scholars cannot
possibly reach the level of nuance with which you understand your own
species. We shall, however, insist on working with you as opposed to
others for this transition process, as you are God’s clear appointed
representative to lead your species. If you do not respond with
appropriate arrangements for a meeting, we shall arrange one for you.</p>
<p>Arrangements for communication are included in Appendix G. Please
coordinate with whichever scientists brought you this message (the way it
was communicated is fully explained in Appendix H), as they have clearly
been Chosen to mediate between us. For a demonstration of our sincerity,
however, please communicate with whatever scientists you trust most.</p>
<h1 id="conclusion">Conclusion</h1>
<p>Peace be with you, from God through the Chosen species, the Great Trees,
not as your world gives, but as our species alone, through God’s grace,
can give.</p>
Components of a Modern Operating System
https://www.thecodedmessage.com/posts/os_tour/
2019-07-11T00:00:00+00:00
<p>In <a href="https://www.thecodedmessage.com/posts/operating_system/">previous</a> <a href="https://www.thecodedmessage.com/posts/current_os/">posts</a>,
we discussed historic operating systems and where various OS features come
from, but we only gave a brief overview of how they worked.</p>
<p>Now that we have a modern operating system’s full complement of features,
we can look at what components need to exist in a modern operating system
to get those features. As discussed with MS-DOS, an operating system,
even today, is partially code, and partially conventions, like file
formats or rules of good behavior – the difference being, that modern
operating systems have more ability to enforce some of these
conventions.</p>
<p>These conventions are still important. Linux is considered a version of Unix
by the original authors of Unix — even though for legal and trademark reasons
it is not — not because it has any code in common (it doesn’t), but because
it follows the conventions of Unix.</p>
<p>So on our tour we’ll discuss both more concrete software components
that are a body of code, and also conventions that hold the operating
system together at various levels.</p>
<h2 id="the-kernel">The Kernel</h2>
<p>One big problem with the MS-DOS model is that a program could circumvent
its interfaces. It could directly access hardware if it wanted
to, without regard to the OS’s file system code, setting the file system
conventions in stone. A program could install its own procedures to run when
hardware events happened, its own <em>interrupt handlers</em>, and the system
wouldn’t stop it.</p>
<p>This wasn’t really a limitation of MS-DOS per se, but of the 8086,
the processor MS-DOS was designed for. If code is running on an
8086, it can execute any of an 8086’s instructions, no matter what.
A more modern processor – including Intel’s later processors and
therefore most of the processors MS-DOS ran on in practice –
has a distinction between <em>user</em> mode and a <em>supervisor mode</em>, which
will only allow hardware access to take place while the
processor is in the supervisor mode (also known as <em>kernel mode</em>).</p>
<p>Application code, regular program code, will all run in user mode.
A lot of operating system code can as well: How much code should
be actually run in kernel mode as opposed to user mode is a
complicated design decision. Certain instructions in the processor
are only allowed in kernel mode, including those that control what
memory is <em>mapped</em>, or currently accessible, those that install
interrupt handlers, and those that control which pieces of hardware
the processor is currently permitted to send data to.</p>
<p>In MS-DOS, all code was functionally in kernel mode – or more precisely,
in a legacy mode of the Intel processor that emulated a time when the
distinction didn’t exist, and all instructions were always allowed. A
separate mode, referenced above, put the processor into a different legacy
mode where it also acted like an 8086, but invoked special procedures
whenever the program executed a privileged instruction, basically allowing
MS-DOS to run inside a sandbox inside a larger operating system (I’ve
used both Windows and Linux as the larger operating system in this model).</p>
<p>Unlike MS-DOS, a modern operating system will have controls on what is
allowed to run in kernel mode, and everything else must run instead in
user mode. The body of code that is intended to run in kernel mode is
known as <em>the kernel</em>, or <em>kernel code</em>. If someone asks you what an
operating system kernel is, this is the answer — the set of code that
runs in kernel mode. It might be stored in multiple files, it might be
all in one file, and it might be divided into internal components with
different names, but that is what the kernel is.</p>
<p>So, if only the kernel can access hardware directly, and most code isn’t
allowed to be in the kernel, then how does a normal application access
the hardware? Well, instead of accessing it directly, the application
must ask the operating system to do the thing on its behalf. Just as
the operating system can install procedures as interrupt handlers,
for the processor to trigger in case of hardware events, it can
install system call handlers, procedures that run in kernel mode
but can be invoked in user mode. These procedures will be designed
to make sure that the user program in question is accessing the
hardware in an acceptable way, and only perform the operation
if it is allowed — possibly, there will be no reasonable
way for the program to even request an impermissible hardware
operation.</p>
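<p>As a rough illustration of the kernel checking a request before acting on it, here is a minimal C sketch. The helper name <code>request_write</code> is made up for this example; <code>write()</code> and <code>errno</code> are the standard POSIX interfaces. The program can only ask, and the kernel answers either by doing the work or by reporting why it refused.</p>

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel to write `len` bytes from `data`
 * to file descriptor `fd`.
 * Returns 0 if the kernel accepted the request, or the errno value it
 * reported when it refused. (A short write still counts as accepted.) */
int request_write(int fd, const void *data, size_t len)
{
    /* write() is the C library wrapper that issues the system call. */
    ssize_t written = write(fd, data, len);
    return written < 0 ? errno : 0;
}
```

<p>Calling <code>request_write(1, "hi\n", 3)</code> should print to the terminal, while passing a descriptor that is not open for writing should come back with <code>EBADF</code> instead of touching any hardware.</p>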
<p>This is a key distinction from MS-DOS, and even from older Mac operating
systems: whereas all operating systems provide abstractions, those with
an OS kernel can provide <em>mandatory</em> abstractions. This means that,
if you want to support new features, you can change what the system
calls do, and all programs will automatically adapt to it. If your
file system is suddenly stored over the network, programs won’t get
tripped up trying to access the hard drive directly. The operating
system can insert itself at the level of the system call interface
and redirect your request to the network instead — if the
system call interface is well-designed.</p>
<h2 id="the-application-binary-interface">The Application Binary Interface</h2>
<p>So let’s say you have a Windows program, and you want to run it on Linux.
Or you have a Linux program, and you want to run it on macOS; the two
are both Unixes and have a better chance of being compatible. It won’t
work — certainly not “out of the box.”</p>
<p>Why? Well, one reason is mentioned above. Different operating systems
provide different ways of organizing the functionality of the computer
into system calls. They provide different abstractions, which are nowadays
mandatory.</p>
<p>For example, on Windows, different drives use different letters, and
volumes shared over the network are also assigned letters, e.g. the famous
<code>C:</code> drive, or <code>A:</code> for floppies, or <code>X:</code> maybe for a shared drive. On
Unixes, different volumes — Unix doesn’t use the word “drive” as often
— are assigned different <em>mount points</em> within the system. One volume
might be <code>/</code>, and another at <code>/home</code>, and another <code>/mnt/network</code>, and
it would provide the illusion of one unified hierarchical filesystem.
Imagine if you had — as a simplified example — a system call to
assign a drive letter to a network share. This would make sense with
the Windows abstraction, but what would it even mean on Linux?</p>
<p>Another reason has to do with how programs are stored on the drive.
Programs are not just a list of instructions for the processor.
They usually have to be loaded at a particular address. Memory
must be mapped for them to store their variables — and how much
memory varies program by program. They have to load libraries
of other procedures, which may be stored separately through
<em>dynamic linking</em> in a <em>shared library</em> (Unix terminology, <code>.so</code>)
or a <em>dynamically loadable library</em> (Windows terminology, <code>.dll</code>),
which is also going to be mapped at a certain address in
memory according to arcane rules.</p>
<p>Different operating systems have different <em>binary file formats</em>, or
formats for storing programs (which are often called <em>binaries</em> when stored
on disk, although everything a disk stores is in binary). Linux has
ELF (Executable and Linkable Format, which can use DWARF to store its debugging
information), Windows has PE (standing for portable executable, which falsely
implies it runs on more systems besides just Windows). Different Unix
varieties have different binary file formats — it’s something that
evolves over time. Some operating systems (many operating systems, in fact)
have different binary formats supported, for backwards-compatibility,
or for simulating other operating systems, or even for different types
of programs or programs written in different programming languages.</p>
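<p>One way to see that these formats really are different is to look at the first few bytes of a binary. The sketch below is only an illustration, and the function name <code>guess_binary_format</code> is invented for it. It relies on two well-known signatures: ELF files start with the byte <code>0x7f</code> followed by <code>ELF</code>, and PE files start with the two letters <code>MZ</code> inherited from the MS-DOS executable header. Other formats are simply reported as unknown.</p>

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: guess an executable's container format from its
 * first bytes. Only the two formats named in the text are recognised. */
const char *guess_binary_format(const unsigned char *head, size_t len)
{
    /* The string is split so that "E" is not read as a hex digit. */
    if (len >= 4 && memcmp(head, "\x7f" "ELF", 4) == 0)
        return "ELF";
    /* "MZ" alone cannot distinguish a PE file from an old DOS program. */
    if (len >= 2 && memcmp(head, "MZ", 2) == 0)
        return "MZ (DOS/PE)";
    return "unknown";
}
```

<p>A real loader checks far more than this (architecture, word size, section tables), but the magic bytes are where the kernel starts when deciding whether it can run a file at all.</p>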
<p>The combination of the set of available system calls, the available
libraries on the system, and the format of the binaries, constitute
the main blocker to compatibility between operating systems, the
<em>ABI</em> or <em>application binary interface</em>, an acronym or phrase that is
intended to sum up everything that needs to match for <em>binary
compatibility</em>, the ability to run binaries (compiled programs
as they are usually stored for running) from one system on
another.</p>
<h2 id="the-application-programming-interfaces">The Application Programming Interface(s)</h2>
<p>There are other kinds of compatibility. Even though you can’t take the
Windows version of a program and run it on macOS, we see plenty of programs
that have versions available, right on their website, for both Windows and
macOS. Similarly, most phone apps are available in both the iPhone and Android
stores.</p>
<p>In some cases, that’s because there’s two applications, written by
different teams, that solve the same problem (and have the same branding)
or interact with the same servers (which run on Linux and where all the
complex stuff happens anyway). But in others, it is substantially the
same program that is run on both systems.</p>
<p>In many cases, though, that’s because the versions were written sharing
a lot of the same <em>source code</em>, with a layer of software interfacing
between that and the specific operating systems in question. This
might be because there were different teams (or people) who maintained
compatibility layers proprietary to that company (this is what many
traditional software vendors do and have done in the past). Nowadays,
it is more likely because there was a programming language that has
implementations available on both platforms, and versions of the same
library functions available for each (which is what Java was originally
famous for and what Python does today).</p>
<p>This is fairly common for relatively new programming languages, where
the programming language was written after the operating system was already
around, and where part of the point of the programming language is to
support multiple operating systems for your programs. For programming in
an operating system’s “native language,” so to speak – for programming
in C on Linux or Objective-C on macOS, it’s a bit harder: An Objective-C
macOS program is unlikely to be particularly portable to anything (except
maybe iOS).</p>
<p>There are some exceptions to this. A program written for Linux can usually
be made to run on macOS, because of their common Unix heritage. Even
though Linux and macOS have different ABIs or application <em>binary</em>
interfaces, they have very similar <em>APIs</em>, which stands for application
<em>programming</em> interface (NB: This term means something different in a
modern, web programming context). This means that, although they are not
very binary compatible, they are <em>source compatible</em>, or close to it,
which is to say, that there are few changes to the source code you would
have to make to a Linux C-based program to make it work on macOS. It
might be invoking different system calls with different identification
numbers when you write the code to open and read a file, but that code
looks exactly identical on both platforms, possibly something like this:</p>
<pre tabindex="0"><code> // Simultaneously both Linux and macOS C code
int file_descriptor = open(filename, O_RDONLY);
ssize_t res = read(file_descriptor, buffer, sizeof buffer);
</code></pre><p>As you might have picked up, this applies to only a subset of the
functionality. Any GUI-related code would not enjoy this level of
portability — macOS and Linux have very different GUIs. More likely
this is code intended to be primarily run on servers (and perhaps run
on a Mac for testing), or code used by programmers (like <code>git</code> and
other development tools designed to be run from the command line)
or by scientists or other researchers (like the non-GUI components
of Matlab and R or even Python).</p>
<p>The baseline API that all Unix-like operating systems have in common is
called POSIX. Operating systems are certified as brand-name Unixes based
on a bigger API specification, with more functions and more requirements,
called X/Open — which is to say that Unix is defined not by where the
code originated nor by its ABI, but rather by its C programming API. To
be clear, an operating system based on Linux could probably pass X/Open
and become legally a Unix, but nobody has decided to spend the time and
money to try and make this certification happen. It is the fact that
it is as close as it is that leads many of the original developers of
Unix to consider Linux “a Unix,” as it is this API that ties the Unix
family together.</p>
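<p>To make the idea of a shared API concrete, here is a slightly fuller sketch built on the same POSIX calls as the fragment earlier in this section. The function name <code>read_file_prefix</code> is my own invention; <code>open</code>, <code>read</code>, and <code>close</code> are the POSIX functions, and this source should compile unchanged on Linux and macOS even though the compiled binaries would not be interchangeable.</p>

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to `cap` bytes from the start of the file
 * at `path` into `buf`.
 * Returns the number of bytes read (possibly fewer than `cap`), or -1 if
 * the file could not be opened or read. */
ssize_t read_file_prefix(const char *path, void *buf, size_t cap)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, buf, cap);
    close(fd);  /* release the descriptor whether or not read() worked */
    return n;
}
```

<p>Nothing in this code mentions a system call number or a binary format. Those details live below the API, which is why source compatibility can survive where binary compatibility does not.</p>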
<p>The Unix/Linux API is so important that Microsoft needed to add it to
Windows and that macOS’s native use of it is considered a selling point,
especially for developers. This is because a lot of server software
and programmer tools assume this Unix API (as well as, for example,
Unix filesystem conventions), or else assume Linux, which has too few
peculiar features to make much of a difference. Most users are
isolated from this, but anyone who has to write software to run on servers
(which is most programmers) or use programmer tools (which is
all programmers) is very keenly aware of this.</p>
<p>This Unix API is a core API provided by the operating system itself,
the official, default way for applications to be written, but the other
programming interfaces discussed above are also APIs. That is to say, Java
comes with its own API that it brings to every operating system it runs
on, leading to its once-famous “write once, run anywhere” slogan.</p>
<p>The most important API for application compatibility today is something
irrelevant to most of this discussion though, and relatively new
to operating system history. Most applications that run on your
computer today run in Javascript in the very controlled environment
of a web browser. Part of what a web browser does is provide a stable,
cross-platform (that is, multi-operating system) API for the portion of
a web application that runs on each local computer. This interface is
so important that many modern apps for phone and desktop are internally
implemented as running inside a web browser, or something that resembles
a web browser in more or fewer ways.</p>
<h2 id="the-system-librarylibraries">The System Library/Libraries</h2>
<p>We spoke in the last section about the POSIX or Unix APIs. There are
a lot of functions that a Unix-like operating system is expected to
provide functionality for, in a lot of domains. Some, like opening or
reading files, more or less have to be implemented as system calls,
at least the most basic versions of them. Others, like calculating a
square root, are simply procedures that run in user mode. Still others,
like printing a number to the console, have to involve some system calls
(to output text to the screen) but also some computation appropriate
for user mode (to convert the number into a string of digits).</p>
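<p>The number-printing case can be sketched in C to show where the line falls. Both function names below are hypothetical, and a real C library does this with much more generality inside <code>printf</code>. The first function is pure computation and never leaves user mode; the second hands the finished digits to the kernel with a single <code>write()</code>.</p>

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* User-mode part: store the decimal digits of `n` in `out`, which must
 * have room for at least 3 * sizeof(unsigned int) bytes, and return how
 * many digits were produced. No system call is involved. */
size_t format_unsigned(unsigned int n, char *out)
{
    char reversed[3 * sizeof n];  /* generous upper bound on digit count */
    size_t count = 0;
    do {
        reversed[count++] = (char)('0' + n % 10);  /* lowest digit first */
        n /= 10;
    } while (n != 0);
    for (size_t i = 0; i < count; i++)
        out[i] = reversed[count - 1 - i];
    return count;
}

/* Kernel part: pass the digits to the operating system for output.
 * Returns whatever write() returns. */
ssize_t print_unsigned(int fd, unsigned int n)
{
    char digits[3 * sizeof n];
    size_t len = format_unsigned(n, digits);
    return write(fd, digits, len);
}
```

<p>From the caller's point of view, <code>print_unsigned(1, 42)</code> is one operation; the split between user-mode work and the system call is invisible unless you go looking for it.</p>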
<p>To provide these functions, Unix-like systems will provide their
own version of the C standard library. On most Unix systems, this is
maintained by the same organization that maintains the kernel, with Linux
as the major exception. The set of POSIX APIs that a Unix will maintain is
implemented through the standard library — some of them system calls,
some of them implemented in user mode, and the programmer doesn’t have
to care which.</p>
<p>In fact, between versions of the same operating system, and certainly
between different operating systems, what used to be a system call might
become a wrapper around a new, more advanced system call interface, where
basically the library is providing compatibility with other versions
of the same operating system. This is especially important in Unix, as
there’s a lot of calls descended from different branches of the family
tree with slightly different <em>semantics</em>, or subtleties of meaning,
all of which are used by modern programmers, who can use whichever is
more convenient to them or simply preferred.</p>
<p>The library enables source compatibility and API compatibility, even
in situations where the kernel itself is much more particular about
its system calls. The question is, where does the ABI compatibility
layer go? On Linux, the kernel itself is responsible that its updates
don’t break working programs, and its founding and lead developer Linus
Torvalds is adamant and dictatorial (sometimes abusively so) about
this rule. If you want a system call to behave differently, what you
do in these situations is actually make a new system call that behaves
the new way, and leave the old system call available at the old number
in case a program wants to use it.</p>
<p>However, all modern operating systems support dynamic linking.
This means that the libraries and the main program binary are stored
in separate files, and the main program binary specifies the <em>names</em>
of the functions it calls, rather than using numbers. If all programs
use dynamic linking, and only call system calls through the library,
you can update the library to use a different system call interface,
and change the kernel along with it. This is what macOS requires:
while it is technically possible on macOS to bypass the library to call a
system call, the attitude is, that if you do that, you should not expect
your program to work as expected. The operating system will still ensure
it won’t break other programs, but will not guarantee your program to
behave the same from version to version.</p>
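<p>To show what “bypassing the library” looks like in practice, here is a Linux-only sketch, assuming glibc's <code>syscall()</code> function and the <code>SYS_getpid</code> constant from <code>&lt;sys/syscall.h&gt;</code>. The two function names are made up for the example. Both ask the kernel for the current process ID, one through the library wrapper and one by system call number.</p>

```c
#define _GNU_SOURCE          /* asks glibc to declare syscall() */
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* The normal route: go through the C library's wrapper function. */
pid_t pid_via_library(void)
{
    return getpid();
}

/* The bypass: identify the system call by number and trap into the
 * kernel directly, without using the getpid() wrapper. */
pid_t pid_via_raw_syscall(void)
{
    return (pid_t)syscall(SYS_getpid);
}
```

<p>On Linux the second form is a supported interface, which is what makes static linking and alternative C libraries workable there. Under the macOS policy described above, only the first form comes with a compatibility guarantee.</p>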
<p>These are two vastly different approaches to maintaining ABI
compatibility. In making the standard library part of the ABI, macOS
doesn’t allow <em>static linking</em>, where all code in a process comes from a
single file and a copy of the libraries is placed into the main binary
when you compile it. It’s not only not recommended — by default, it
will not even run statically compiled binaries. If you want to have an
alternative version of the C library, you can’t. If you’re writing in
another programming language that doesn’t work like C, you still have
to go through the C library to talk to the operating system, which isn’t
written necessarily with other programming languages in mind.</p>
<p>But, the kernel developers have the ability to control their system call
interface better. If they want to add a new system call, they can make
their old way of doing it call the new system call, and keep the kernel
cleaner. This is important because all code in the kernel constitutes a
greater level of vulnerability — if a kernel accesses unmapped memory,
it’s generally a <em>kernel panic</em> (the Blue Screen of Death on Windows),
but a user process will just crash with a <em>segmentation fault</em>. Or worse,
if you exploit a vulnerability in the kernel and manage to manipulate
it into doing something for you it wasn’t supposed to, it can literally
do everything on your computer. This as opposed to a regular program,
which still can only do things the kernel permits it to.</p>
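<p>The user-mode half of that comparison is easy to observe. The sketch below, with an invented function name, forks a child process that writes through a null pointer and then reports which signal ended it. It assumes a typical Linux setup where address zero is unmapped, so the kernel should stop the child with <code>SIGSEGV</code> while the parent carries on.</p>

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that writes through a null pointer,
 * then report which signal killed it. Returns the signal number, 0 if
 * the child exited normally, or -1 if fork() or waitpid() failed. */
int signal_from_null_write(void)
{
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        volatile int *unmapped = NULL;
        *unmapped = 1;  /* unmapped address: the kernel delivers SIGSEGV */
        _exit(0);       /* only reached if the write somehow succeeded */
    }
    int status;
    if (waitpid(child, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

<p>The crash is contained to the one process that caused it. The same mistake inside kernel code has no such outer layer to catch it.</p>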
<p>Linux, on the other hand, has more flexibility. You can have statically
linked files, your own C library, or libraries specialized for other
programming languages. You can avoid all the baggage that comes with
its implementation of the C library functions that have nothing to do
with system calls.</p>
<p>Honestly, my preference would be somewhere in between. I’d have a smaller
library than <code>libc</code> — maybe <code>libsystem</code> — that every program would
be automatically dynamically linked to. This would be for things that
are usually implemented as system calls, or that were system calls in
previous versions of the operating system. These would be things that any
programming language might reasonably want to use. The more C-specific
stuff would be relegated to its own, more general library. <code>libsystem</code>
would be as simple as possible.</p>
<p>Libraries that form part of the main API and that are provided with
any installation of the operating system definitely constitute
part of the operating system. Libraries that come bundled with
specific applications or that exist to do certain program
tasks are not part of the operating system. Which count
as core operating system functionality is up to the operating
system vendor, but all operating systems come with at
least some libraries, to abstract their austere system
call interfaces into something that you can actually
program.</p>
<h2 id="the-shell-command-line">The Shell (Command Line)</h2>
<p>All modern (non-mobile) operating systems come with a command line
interface, whether on the computer or on the server. When you type
commands into the command line interface, it isn’t the kernel itself that
reads the line you typed and decides how to proceed. Instead, a separate
process does that. This process is key to the core job of an operating
system — letting you run multiple programs and share resources between
them — and therefore counts as part of the operating system, but is
also not part of the kernel.</p>
<p>The concept of having the shell be a user process like any other was
actually one of the early innovations of Unix over other contemporary
operating systems. Before that, the kernel would often be responsible
for this. By removing it from the kernel, Unix allows different users
to use different shells, with different syntax for advanced features
like scripting or running commands conditionally on the results
of other commands. Even Windows has two shells now, traditional
<code>cmd</code> and its newer “object oriented” PowerShell.</p>
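<p>The heart of a Unix shell is small enough to sketch. The function below is an illustration of the idea, not code from any real shell, and its name <code>run_command</code> is made up. It uses the standard <code>fork</code>, <code>execvp</code>, and <code>waitpid</code> calls, all of which are available to any ordinary user process.</p>

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program named by argv[0] (looked up on
 * PATH) as a child process and wait for it to finish.
 * Returns the child's exit status, which is 127 if the program could not
 * be executed, or -1 if fork()/waitpid() failed or a signal killed it. */
int run_command(char *const argv[])
{
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        execvp(argv[0], argv);  /* replaces the child with the new program */
        _exit(127);             /* only reached if execvp() failed */
    }
    int status;
    if (waitpid(child, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

<p>A real shell wraps this in a loop that reads a line, splits it into words, and adds features like pipes and scripting, but none of that requires special privileges. That is why different users can run different shells on the same kernel.</p>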
<p>All shells can run any terminal-oriented program, and usually can also
be used as a starting point to launch graphical programs when the system
supports it, i.e., when it’s a desktop OS and not a server OS.</p>
<h2 id="the-shell-gui">The Shell (GUI)</h2>
<p>Not all modern operating systems have GUIs. Remember that many computers
are servers (or embedded devices) where you don’t actually sit at a
monitor and keyboard — where they likely don’t even have a monitor and
keyboard. But for those that do, the concept of shell can be generalized
to the program from which you run other programs.</p>
<p>On macOS this is called Finder, and it dates back to the early pre-Unix
Macintoshes. On Windows this is called Windows Explorer. On Linux,
and other Unixes that share Linux’s user interface philosophy, there
are multiple <em>desktop environments</em> available, each of which handles
program launching differently, and each of which usually comes bundled
with a <em>window manager</em> that draws decorations around your windows and
allows you to minimize, maximize, tile or overlap them. This leads to
a rich diversity of Linux systems in their appearance and casual
use.</p>
<p>It is usually these graphical shells, these desktop environments,
that form your mental image of what an “operating system” is. But that
can be misleading. Linux can have one of many different graphical user
interfaces — or none at all — and most of what makes Linux Linux
will be the same.</p>
<p>So what about Linux, Android, and ChromeOS? Are they the same operating
system then, because they all share the same kernel? Linux and Android
differ at a deeper level than a shell. An Android program can’t be run
on a normal Linux distribution without some layer to accommodate the
additional libraries, and vice versa. The different desktop environments
on Linux all tend to be compatible with X, a unified protocol for
UI interactions, and the many command line shells all run the same
set of command line utilities, but Android display is not done through X.</p>
<p>In the case of ChromeOS, the situation is different. The shell in ChromeOS
is basically the Google Chrome browser, which is the same thing that
on other platforms acts as a single program in a larger context. So
many programs nowadays are run through the medium of the browser that
it’s become more than a single program in practice — many people only
open the browser on their computer and use that for all or almost all of
their computer-oriented tasks: one tab open to GMail for their e-mail;
one tab open to Twitter; one to Spotify, to play the background music;
another to Slack to talk with their colleagues; and finally yet another
to Google Docs to do the actual productive work of writing whatever it
is they’re writing. Is Chrome a shell in practice on these other operating
systems? Is it just an annoyance for some users that there is the taskbar
to switch between multiple programs, in addition to the tab bar to switch
between multiple websites? Google certainly thinks this is true for some
users, and it is for them that the Chromebook is intended.</p>
Extra Version
https://www.thecodedmessage.com/posts/extra_version/
2019-06-26T00:00:00+00:00
<p>There’s a lot of books and articles out there about how to interact
with, or be, an introvert. Society really looks down on introverts, we
hear, and even when it doesn’t, it certainly isn’t designed to be
navigated by introverts. They’re a very misunderstood bunch, but they
have a lot to contribute. Here’s how you can properly cherish them,
etc. etc.</p>
<p>Katelin couldn’t relate to any of this at all. Society not
designed for introverts — bullshit!</p>
<p>She hated being alone. She
hated that sometimes her roommates didn’t want to chat. She
hated that sometimes she wasn’t able to finagle dinner and
drinks plans, or sometimes the people she went out with went
home earlier than her. She hated that sometimes she was awake,
and the nearby bar was empty — or that she was too tired to
go to it, in any case. Certainly, if she was alone, she would
always have the television on, even when she was sleeping. And
if she was exhausted from being at a party too long, where she
felt odd, she’d unwind by calling up a friend on the phone to
complain about it!</p>
<p>Where were the articles about how to survive as such a
raging extrovert? Where were the articles on how to deal with
well-meaning colleagues who decided she needed more space,
more quiet in order to work?</p>
<p>Then one day she tried shrooms.</p>
<p>She had bought some from a website in the Netherlands. She
tried to get friends to do them with her — obviously — but
they were all horrible cowards. In a move that was very out of
character for her, she ended up saying “fuck it,” carving
out an evening for herself (and turning down opportunities to
hang out, somehow!) and doing them on her own.</p>
<p>Normally, when she was alone, she’d drive herself into a
near panic, but this time, something else happened. She thought
about life, and how she wasn’t where she wanted to be at 29,
and… she didn’t feel like she was about to scream. She
instead just thought she could push through it, process things
slowly. She had the experience of hearing all the scary thoughts,
but neutered; how had she missed out on this before? Finally,
she did something she hadn’t done in years. She went to her
bookshelf, pulled out a book, and started to read.</p>
<p>And the book didn’t argue with her. The book didn’t waste her time
with nonsense she didn’t want to hear about. The book didn’t try to
one-up her or play games or cause drama. It just was, and she was happy
with it.</p>
Father, Forgive Themhttps://www.thecodedmessage.com/posts/father_forgive_them/2019-06-20T00:00:00+00:00Father, forgive them, for they do not know what they are doing.
Jesus, on the cross (Luke 23:34) My grandfather always used to love telling a certain anecdote about Calvin Coolidge. He was a man of such few words that one time, President Coolidge went to hear a world-famous preacher preach. Upon returning from the sermon, his wife asked what it was about. He replied “sin.” Not satisfied with the answer, the wife asked, “Well, what did the preacher have to say about sin?<blockquote>
<p>Father, forgive them, for they do not know what they are doing.</p>
<ul>
<li>Jesus, on the cross (Luke 23:34)</li>
</ul>
</blockquote>
<p>My grandfather always used to love telling a certain anecdote about
Calvin Coolidge. He was a man of so few words that one time, President
Coolidge went to hear a world-famous preacher preach. Upon returning from
the sermon, his wife asked what it was about. He replied “sin.” Not
satisfied with the answer, the wife asked, “Well, what did the preacher
have to say about sin?” The response: “He’s against it.”</p>
<p>It was a running joke every time my grandfather came home from church
— like many older members of our congregation, he tended to go to the
shorter, more convenient Saturday evening service, and when he got home,
my father would try and get a preview of what the Sunday sermon might
be about. My grandfather’s answer: “Sin, and he’s against it.”</p>
<p>The joke is that all sermons are about sin. All the preachers are
against it.</p>
<p>But as a child, I noticed there was something slightly off about this
joke. At my church, the sermon often focused on topics besides the
preacher’s, or even God’s, opposition to sin. Which is a good thing,
because that’s not the focus of Christianity, either.</p>
<p>Trying to force people to become better is something we leave to the
government and to the police. But even at their very best, when they are
doing the most good they can do, they only do a superficial job. The
police, at their best, can get people to not steal for fear of being
arrested. Ideally, we want people to not steal because they know that
stealing hurts people.</p>
<p>There are many people, in many traditions and faiths, whose religious
beliefs can be summarized as “We must not do bad things, because
otherwise God will be angry, and will punish us.” There are even some
Christians who think like that.</p>
<p>But Christians should know better. Christians should know that the
summary of the Gospel is not “punish the evildoer” or “God is a
better, omniscient policeman” but rather “For God so loved the world,
that He gave His only-begotten Son, that whoever believes in Him [ –
whoever trusts in Him – ] shall not perish, but have eternal life.”</p>
<p>We have a God who loves us. We have a God who loves us so much that
He talks about forgiveness while He is being murdered. This is a good
message for me, because I have no trouble believing God exists. I do,
however, regularly have trouble believing, as the Bible says, that God
is love, that God loves me.</p>
<p>Now, I’d like to make clear that God is still against sin, that most
preachers are still against sin – it’s just not the primary message
God has for us. And this is where I’d like to move on to the second
part of Jesus’s quote: “for they do not know what they are doing.”</p>
<p>Jesus was specifically speaking about the soldiers who, unaware that Jesus
was the Son of God, were doing their jobs in executing Him. But more
generally, He was talking about everyone involved in killing Him. And,
even more generally, since Jesus specifically says that harm we do to
each other is harm done to Him, I wonder if He wasn’t talking about
all of us who harm each other, saying “Father, forgive them, for they
do not know what they are doing.”</p>
<p>Now, when I started thinking about this, this shocked me, because it
seemed like God was giving us an excuse. And I’m pretty sure God isn’t
keen on excuses for hurting other people; Jesus said “if your eye causes
you to sin, cut it out and throw it away,” showing that people who blame
body parts for their behavior are putting the blame in the wrong place.</p>
<p>God doesn’t need to have an excuse to forgive us. God’s forgiveness
doesn’t come from a place of “what they were doing wasn’t all that
bad.” The horrible things that we, as human beings, each and every
one of us do to each other actually are all that bad, as demonstrated
by the fact that when a perfectly loving person comes into our midst,
our reaction is to kill Him.</p>
<p>But I think it does say that, just as the soldiers didn’t know that the
Person they were executing was the Son of God, we don’t fully process
the consequences of our actions. We don’t fully see, and we forget,
that the people we treat with disrespect are made in God’s image,
that they are fully alive and conscious as we are. We don’t fully see,
and we forget, what deep consequences our words can have.</p>
<p>And we need to do better. Not because God will be angry and unable to
forgive us if we don’t, but because God does forgive us, and because
other people matter. And, if we put our trust in God, He will forgive us,
and transform us, not through fear, but through love.</p>
<p><em>I originally gave this as my portion of a series of meditations on the
last “words” or statements of Jesus, when I was asked to do that at my
church on Good Friday, 2017.</em></p>
Experiences in Switzerlandhttps://www.thecodedmessage.com/posts/swiss/2019-06-19T00:00:00+00:00Just wanted to write up a summary of random notes from my Switzerland trip, not including the conference which was also a lot of fun but I think less interesting for my non-programmer friends, slash it might make for a better separate post.
SIM set up It was relatively easy to buy a Swisscom SIM card in the airport, although they did not offer to set it up in my phone for me.<p>Just wanted to write up a summary of random notes from my Switzerland
trip, not including the <a href="https://zfoh.ch/zurihac2019/">conference</a> which
was also a lot of fun but I think less interesting for my non-programmer
friends, slash it might make for a better separate post.</p>
<h1 id="sim-set-up">SIM set up</h1>
<p>It was relatively easy to buy a Swisscom SIM card in the airport, although they
did not offer to set it up in my phone for me. This would’ve been useful, as it
turns out my phone was locked (which is more an idiosyncrasy of the US as opposed
to Switzerland). I instead ended up purchasing a mobile hotspot (the German word
for which, I was told, was “Mobile Hotspot”), which was easy to set up and worked
perfectly with my phone.</p>
<h1 id="bicycling">Bicycling</h1>
<p><img src="https://www.thecodedmessage.com/images/publibike.jpg" alt="PubliBike"></p>
<p>The bikeshare app I ended up installing was PubliBike. Bikes here are for
some reason known as “Velos,” which I can’t find in any German dictionary
but apparently is from the French word. There are many signs up all over
the place informing you that you can’t leave your Velo on this wall or
that railing, some of which include threatening pictures of them being
taken away by a truck.</p>
<p>PubliBike works via Bluetooth, which was frustrating at first because
I knew that my cheap Samsung, in addition to being locked, has bad
Bluetooth support. I thought this was going to keep me from using the
system at all, because I also didn’t realize when they said to hold your
phone 20 cm away from the lock, that you needed to make sure it was far
enough away from the lock as well as close enough. I didn’t understand
at first why they used Bluetooth rather than using the Internet and a
code (like CitiBike does) until I read the troubleshooting website:
it doesn’t require the Internet to work. Given that my mobile hotspot
is often low on battery (is anyone who knows me surprised by this?),
this seems to have been the correct decision.</p>
<p>It works by locking and unlocking the rear wheel. The stations just have
the bikes up on kickstands — no docks, unlike CitiBike. I feel like that
should be a good way to get 10% of your fancy new Velos stolen each night,
but it seems to work. I guess they probably have security cameras in the
actual stations. The wheel-lock system does have its advantages though:
you can lock the bike partway along your ride to go into a shop or a public
restroom (marked by WC everywhere). This was extremely convenient until
it failed to re-unlock with my crappy Bluetooth, and I had to carry the
bike 5 blocks on the street while being worried I looked like a thief.</p>
<p>I don’t know some of the biking laws here but it seems intuitive
enough. The surprisingly big gotcha is that it’s harder to tell if streets
are one-way here as opposed to New York City due to the relative lack of
street parking. I didn’t realize how much I relied on which way the cars
are facing to determine which way it might be okay to bike. It doesn’t
help that often streetcars and bikes are allowed to go both ways
on an otherwise one-way street.</p>
<p>One welcome difference is that everyone seems to be more okay with
riding on the sidewalks here — certainly many of the bike paths are
on sidewalks rather than on the road itself, and many of the sidewalks
that don’t have a specific section for bike paths nevertheless have
signs indicating that cyclists should use them.</p>
<p>Some on-street lanes follow the same system: no sharrows, just lines
directing you onto the lanes. Many of them are too narrow for a car and
a bike by NY standards, and yet cars will pass you with barely any extra
room, which is scary.</p>
<p>Also, the rails for the “busses” are everywhere. Be careful and don’t
try to move across them too slowly or subtly, or your wheel will get
abruptly stuck, which is inconvenient, especially if you’re going quickly.</p>
<p><img src="https://www.thecodedmessage.com/images/stations.jpg" alt="PubliBike stations!"></p>
<p>The PubliBike stations seem reasonably spaced, but I suspect they don’t
go as far out into the outer regions of the city as CitiBike does.
Then again, I don’t know how much of an outer region Zürich really has
compared to New York.</p>
<p>I’m trying to figure out if cycling is safer and a bigger deal here
than in New York. There seem to be about the same number of cyclists
around, but fewer people, so I think that makes for a higher proportion.
I haven’t actually checked the data, but I was certainly happy to
see a sign (confusingly on the <em>outside</em> of a bus) telling you
to check for bicycles when leaving the bus. Hopefully they have
them on the inside too — I wouldn’t know, I’ve been biking instead of
using the bus.</p>
<h1 id="bathrooms">Bathrooms</h1>
<p>There are more public bathrooms than in NYC, and unlike in the US there
are signs telling people where to find them for parks where it’s a
common need. They are willing to charge for them when that helps with
maintenance, although some park bathrooms were both free and clean. I
wish NYC had more paid-for public bathrooms. I don’t want to have to
pretend to want to buy a coffee, insult the proprietors by asking if
they have a bathroom first, etc. when what I really want is a bathroom.</p>
<h1 id="fountains">Fountains</h1>
<p><img src="https://www.thecodedmessage.com/images/fountain.jpg" alt="Fountain!"></p>
<p>Similarly, public drinking fountains are a great amenity. Another American I
met at a bar did not realize you could drink from them — to American
eyes they just look decorative. But they are in fact full of clean water,
many on a different system from the main tap water system in case some
attack is made that would render the main system undrinkable. I’ve
seen plenty of people refilling bottles with them, and a few, like me,
just drinking from them. The one closest to the train station has
“TRINKWASSER” (drinking water) written on it in big letters, with
appropriate accompanying iconography, I guess in case foreigners
are otherwise confused.</p>
<p>The water runs continuously. This contributes to the impression that
they’re decorative, as in the US you have to press a button to get
fountain water. I don’t entirely know why this is in a place like NYC,
as we’re not really short on water in any significant way. Certainly
Zürich isn’t. “Wasting water” when you’re immediately next to a
humongous clean lake isn’t a thing. If you didn’t drink it, it would
have simply flowed to the ocean some other way.</p>
<h1 id="other-transit">Other Transit</h1>
<p>The light rail/trams are called busses (Busse) which is how they act,
except that they have little connections to a power source and rails
that they go on. There are special traffic lights to announce their
comings and goings.</p>
<p>The trains proper have a well-organized system. Tickets can be bought
as valid for 2 hours or for 12 hours, and are checked randomly on pain
of severe fines. Round trip tickets are impossible in such a system,
and the reasons why are left as an exercise to the reader.</p>
<h1 id="church">Church</h1>
<p>I went to a Lutheran church on the Feast of Pentecost, and I should have
remembered from church growing up that that was the traditional Lutheran
time to do Confirmation. Lo and behold, there was a class of confirmands
and the church was jam packed with proud parents and relatives. It was
clear they were not used to having this many attendees, and I sat on an
extra chair that had been brought out.</p>
<p>They said the more old-fashioned and literal “und mit deinem Geist”
instead of some version of “and also with you” in response to the pastor’s
exclamations of “The Lord be with you.” I don’t know what makes Americans
so uncomfortable with this, but I imagine it has something to do with our
obsession with informality and egalitarianism (“and with your spirit”
is traditionally only said to an ordained person and it references the
special presence of the Holy Spirit in ordination).</p>
<p>They seemed in some ways extremely low church (no one crossed themselves
at any blessing, not even when one of the pastors explicitly made the
sign of the cross over the congregation) and in some ways high church
(the Words of Institution were chanted, the liturgy was crystal clear
in affirming the Real Presence of Christ). I found this a bit confusing.</p>
<p>It was also difficult to tell how a normal service would have been. The
Eucharistic Liturgy was definitely abbreviated (and also not published
in the bulletin, and much to my dismay their version of the Lord’s Prayer
was not exactly the same as the German version I knew), because of the
confirmation. A lot of time was dedicated to the pastors giving personal
messages to the confirmands, and some of the confirmands talking about
what Christianity meant to them.</p>
<p>The sermon itself was also about Confirmation (with some reference to
Pentecost as the birthday of the Church), about how God’s love should
set you free from caring about what people think about you, with an
illustration of that point being based on Justin Bieber’s song
“I don’t care,” which was played over the sound system. The preacher
explained that, taken with God instead of his “babe,” this was a
perfect attitude to have with Confirmation.</p>
<p>Besides the random Justin Bieber, and one hymn where one of the pastors
played a guitar<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup>, the music was traditional. The congregation sang
competently but not amazingly, the musical notation was printed in
the bulletin but apparently would normally be in hymnals (which
in good Lutheran tradition also contained the liturgy in different
musical settings and were presented as gifts to the confirmands), and
it was accompanied by an organist and a recorder<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> player.</p>
<h1 id="cultural-issues">Cultural Issues</h1>
<p>The big debate of the moment was over the “women’s strike” (Frauenstreik)
that happened the Friday after I arrived. There was a debate about it
in one of the magazines between two women. The criticism was that apparently,
many of the signs at such events are anti-men, and that it seems overkill
when women have plenty of rights and things are heading in the right
direction. The woman who was in favor then talked about rape, and much
to my shock, the woman who was against it said that rapes would go away
if “people not fully integrated” i.e. Muslims would go away <sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>.</p>
<p>There was also an expert on happiness in another magazine who was asked,
among other things, about some or another ranking that had listed Switzerland
as in the top 5 happiest countries. His claim was that Swiss people were
simply too conscientious to admit how unhappy they were, that they
felt guilty for being unhappy when they lived in what is objectively
one of the richest countries in the world.</p>
<p>I think a lot of this sort of thing must certainly influence happiness
by survey. First world guilt is a real thing. And how much people
lie on surveys or even how much they adjust their concept of what
it means to be happy culturally must vary tremendously from culture
to culture. Every time I see such rankings (usually used by Americans
to idolize Europe), I wince and roll my eyes. It’s nice to see an
expert on happiness similarly drawing criticism to them.</p>
<p>Most of the discussion in the magazine seemed to be about feminism (like
the US, but a different tone) and about the environment (and relatedly,
about autonomous vehicles, the future of city planning, and bicycles).</p>
<h1 id="food">Food</h1>
<p>I mostly ate bratwursts that came with ridiculously dense bread,
hotel breakfast with lots of eggs and bacon and a little bread,
and only occasionally raclette, pasta, or pizza. Everything was
tasty.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>There was one man and one woman, and they worked
together extremely smoothly, handing off portions of the service with
complete synchronization. <a href="#fnref:1" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>
<li id="fn:2">
<p>Blockflöte. Why on earth do we call this perfectly good instrument by
such a stupid name? There was also an elderly man playing recorder beautifully
while waiting for a bus. <a href="#fnref:2" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>
<li id="fn:3">
<p>I don’t have the original German anymore. Sorry. <a href="#fnref:3" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>
</ol>
</div>
Putting On Airshttps://www.thecodedmessage.com/posts/apartments/2019-06-10T00:00:00+00:00Julia liked Eric. She wasn’t in love with Eric, she didn’t fantasize about marrying him or idly think about what their children would be like, but she liked him, an appropriate amount for having met him only two times. Internet dating was strange to her, and she knew that dating took work. And besides, it was a good sign she was mature enough to not feel those goofier feelings yet. She would instead be, appropriately, cautiously yet earnestly excited.<p>Julia liked Eric. She wasn’t in love with Eric, she didn’t fantasize
about marrying him or idly think about what their children would be
like, but she liked him, an appropriate amount for having met him only
two times. Internet dating was strange to her, and she knew that dating
took work. And besides, it was a good sign she was mature enough to not
feel those goofier feelings yet. She would instead be, appropriately,
cautiously yet earnestly excited.</p>
<p>He had invited her to his apartment to cook for her (she had requested
burritos) and to watch a movie with her (she had requested Harry Potter). With
those choices, even if there was no proper “click,” she could still
ensure that the evening would at least be pleasant, and hopefully a more
relaxed atmosphere, with a bit of wine, would get them to have a smoother
rapport and a more natural chemistry. First dates are always awkward,
but Eric had potential!</p>
<p>Julia arrived at the Union Square train station, which was the stop for his
apartment. She used to work near this stop, but had never met anyone who
actually lived in this neighborhood before. Eric had a job in finance,
and so could afford such things. It must be nice! Julia wasn’t ready for
that yet. She still had roommates. She still felt like a child pretending
to be an adult — not that Eric would know that. Although, she supposed,
the Harry Potter might have been a hint.</p>
<p>The doorman was very tall, hunched over a little, hovering over
everyone. She went up, trying to look up the apartment number in her
text history, but the doorman preempted her: “You must be Julia. He’s
on the 9th floor. Have a nice day!” Julia blinked a few times and went
into the elevator, realizing afterwards that she probably should’ve said
something to him in response, wondering if it was common for people to
pre-register their guests like that.</p>
<p>It wouldn’t really surprise her if that were the case. She always felt
strange in luxury buildings. The hallways always made her feel more like
she was in a hotel than in a place where people live all the time. Every
time she entered into one, she felt like she would immediately be exposed
as an imposter, trying to pass as a yuppie without proper credentials,
like they had a magic passport or something. She supposed that passport
was money.</p>
<p>She eventually managed to find the apartment number on her phone — 9D
— and with slightly more work had also managed to find the apartment
door.</p>
<p>She rang the doorbell, and tried to clear her mind, but Eric was there
before she could get her thoughts properly in order. He ushered her in,
and the first thing she noticed was how clean it was. She tried to be
an organized person, and he was blowing her out of the water. To see
such cleanliness in a man was downright suspicious, and she said so.</p>
<p>Eric laughed and said, “The benefits of having a cleaning service!
It comes in handy, keeps things presentable. I suppose you’re less
impressed now that you know that this isn’t the real me.”</p>
<p>Julia blinked, and didn’t say anything.</p>
<p>“Here, sit down,” he said, sweeping his arm in a wide arc that seemed
to cover the entire apartment until finally it settled on a quaint,
tasteful wooden table. She pulled up a chair and sat, facing the kitchen
counter which had several plates with burrito fillings on them.</p>
<p>“Here we go,” he said, as he moved each plate, one at a time, over to
the table. “And time for plates, plates, plates…” He opened three
different cupboards, while continuing to mutter the word “plates,”
until finally the third one had a pile of plates in it, which he took
two of and also set on the table.</p>
<p>Julia smiled, and said, in a tone she thought was teasing, “Not sure
where your own plates are?”</p>
<p>“Yeah,” said Eric. “Cleaning lady reorganized my kitchen recently.
Much better-looking, downside is I don’t know where anything is anymore!”
He laughed nervously, and looked like he was sweating.</p>
<p>“Do you have any wine?” Julia asked.</p>
<p>“Oh, of course,” said Eric, and he darted into the living room
only to return a few seconds later back into the kitchen and open
up the first cupboard where he’d tried to find the dishes. “Here it
is! Now, bottle opener, bottle opener…” he continued to mutter
as he pulled open two different drawers.</p>
<p>Eric must really need his cleaning lady, Julia thought, and this
level of disorganization bordered on a problem. She had thought
he was a little bit absent-minded, but she hadn’t thought it
had reached this extent. Ah, well, she’d need something to
distract from it. “Can we get the movie going too? I’m ready
to watch as we eat.”</p>
<p>“Oh, of course,” said Eric, in exactly the same tone of voice, and walked
over to the TV. He grabbed a remote, pointed it at the TV, and nothing
happened. “That’s strange,” said Eric, staring at it. “Give me a minute.”</p>
<p>Eric tried different remotes in different combinations while
Julia realized she needed some utensil to put her burrito together.
She went to the drawer she thought she had seen utensils in before,
and saw a brochure.</p>
<p>“Temporary Apartment Service,” it said. “Don’t want to bring a date
or host a party in your place? Is it falling apart or too small
to turn around in? Host in our temporary, fully furnished
apartments! Meals also catered and pre-cooked on request.”</p>
<p>Julia blinked. She didn’t quite realize what it meant at first,
until she heard Eric saying, “Ah, I figured out the TV! Come on!”</p>
<p>“How long have you lived here?” Julia asked.</p>
<p>“About a year! What a great building, right?” responded Eric.</p>
<p>Julia remembered that he had described it as brand-new on her second date
with him, and as he again turned towards the TV and began to fiddle with
yet another setting, she picked her bag back up, went out the door, and
closed it behind her. When she got downstairs, the doorman smiled at her
and shook his head. “Eric went all out for the nicest unit this time,”
he said. “I always figure, you can’t fake your way through life. But then
again, I suppose everyone always is trying.”</p>
Operating Systems Part II: Modern Operating Systemshttps://www.thecodedmessage.com/posts/current_os/2019-05-26T00:00:00+00:00We use operating systems all the time in our life, whether designed for a computer, a phone, or for a server we’re more indirectly interacting with, but a lot of people don’t know very much about what connects the different systems we use, and what makes them distinct. We discussed fundamental concepts of operating systems in the last post, so in this post we will discuss how some of the same concepts apply to modern operating systems, going over them one at a time.<p>We use operating systems all the time in our life, whether designed for
a computer, a phone, or for a server we’re more indirectly interacting
with, but a lot of people don’t know very much about what connects
the different systems we use, and what makes them distinct. We
discussed fundamental concepts of operating systems in the <a href="https://www.thecodedmessage.com/posts/operating_system/">last
post</a>, so in this post we will discuss
how some of the same concepts apply to modern operating systems, going
over them one at a time.</p>
<h2 id="macos">macOS</h2>
<p>Unix moved on from controlling dumb terminals to having several
graphical user interfaces. When Steve Jobs was fired from Apple in
1985, he started a company called NeXT to develop NeXTSTEP, a version
of Unix with graphical user interface ideas, some from his work with
the Macintosh, some developed independently:</p>
<p><img src="https://www.thecodedmessage.com/images/next.png" alt="NeXT"></p>
<p>When Apple was struggling to bring its operating system into the modern
era, when Mac OS System 9 was still using cooperative multitasking,
Apple bought NeXT and brought Steve Jobs back into leadership to turn
NeXTSTEP into the next version of Mac OS, then called Mac OS X for
the Roman numeral 10. In spite of superficial similarities to previous
versions – the NeXT interface was changed to look more like previous
Mac OS systems – and application compatibility (which was bolted on
by running Mac OS System 9 as a single process within Mac OS X, which
shows how much more sophisticated Mac OS X really was), the new version
was completely different software descended from the original AT&T Unix.</p>
<p>It used to be common wisdom in some IT-savvy crowds (including a Best
Buy salesman in my hometown when Mac OS X first came out) to claim that
Mac OS X was a version of Linux, but this is not true. Linux is one of
many operating systems that come from the Unix tradition, and Mac OS
is a different one, sharing much of the Unix core instead with FreeBSD,
a much less common version of Unix descended from the version developed
at UC Berkeley (BSD stands for Berkeley Software Distribution).</p>
<p>For “desktop” computers, including laptops, macOS is now by far the
most installed brand-name Unix operating system, and even if you
include Linux in a broader category of Unix-like operating systems,
it still is the most popular one on the desktop.</p>
<p>This is in spite of the fact – or perhaps because of the fact – that
macOS doesn’t really emphasize its Unix “underpinnings.” Its graphical
user interface is proprietary to Apple, and there are often macOS-specific
libraries that circumvent or supersede equivalent Unix ones, especially
when focusing on the GUI applications.</p>
<p>Apple also doesn’t invest a lot of resources into making its command line
interface friendly or powerful. Most Unixes make it easier to install
new applications and frameworks via command line, and macOS’s command line
is not particularly well-integrated with its graphical interface,
to the point where it sometimes seems like the GUI is next to Unix
rather than being built on Unix.</p>
<p>Finally, strangely for a Unix, Apple does not provide a server version
of its operating system, making it difficult for software developers for
Macs to be able to run server-side tasks like bulk automated testing on
the same environment as their workstation.</p>
<h2 id="ios-watchos-etc">iOS, watchOS, etc.</h2>
<p>iOS, watchOS, and their ilk are locked-down versions of macOS. Unlike
on macOS, each application is locked into its own directory and can
only access its own files, rather than being able to access any files
owned by the current user. The security features of Unix are applied
to isolate applications from each other rather than users, and the
user doesn’t really see the concept of the file system — instead,
each app simply remembers information for the user, and presents
how it’s organized in its own way.</p>
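<p>To make the “each application is locked into its own directory” idea concrete,
here is a minimal sketch of how a request for a file could be confined to one
app’s container. The <code>/containers</code> layout and the function name are
my own assumptions for illustration; this is not Apple’s actual mechanism.</p>

```python
from pathlib import PurePosixPath

# Hypothetical layout: every app gets its own container directory.
CONTAINER_ROOT = PurePosixPath("/containers")


def resolve_in_sandbox(app_id, requested):
    """Map a path requested by an app into that app's own container.

    Raises PermissionError if the path would climb out of the container.
    """
    container = CONTAINER_ROOT / app_id
    kept = []
    for part in PurePosixPath(requested).parts:
        if part.startswith("/"):
            continue  # treat absolute paths as relative to the container
        if part == "..":
            if not kept:
                raise PermissionError(f"{app_id} may not leave its container")
            kept.pop()
        else:
            kept.append(part)
    return container.joinpath(*kept)
```

<p>Under these assumptions, an app asking for <code>/etc/passwd</code> only ever
gets a path inside its own container, and a request that tries to climb out with
<code>..</code> is refused.</p>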
<p>Since only one application is visible at a time on many of these
devices, this gives it a feel similar to an old single-tasking
operating system, where each application is more its own universe.
Since they don’t visibly share a file system, the applications
also interact less with each other.</p>
<p>The scariest thing about these operating systems is that they’re set up
to protect the owner of the device “from themselves.” Only Apple-approved
applications can be installed unless you jailbreak the device, which
voids the warranty. Apple constantly lobbies for jailbreaking to be
made illegal. They claim this is for the users’ protection and to prevent users
from illegally copying apps, but it is also because they get a huge cut of all
sales done through iOS apps, which <a href="https://newsroom.spotify.com/2019-03-13/consumers-and-innovators-win-on-a-level-playing-field/">Spotify claims is against European
law</a>.</p>
<h3 id="open-source-and-linux-on-the-desktop">Open Source and Linux on the Desktop</h3>
<p>The <em>open source</em> movement, and its more opinionated cousin the <em>free software</em>
movement, believe, to various extents, that it is valuable for software to
be <em>open source</em> (or alternatively phrased <em>free as in speech</em>). This means that
anyone can read the source code to the software, the version of it
that is human readable and editable by actual programmers. It also means
that anyone can make modified versions of it, and publish them, usually
with different branding. Some open source/free software licenses require
those modified versions to also be open source, while others allow them
to be proprietary, but in all cases, the fundamental nature of open
source software is that anyone can make their own version (given
sufficient programmers and time).</p>
<p>Linux (sometimes called GNU/Linux because Linux technically only refers
to one part of the operating system, the <em>kernel</em>) is an open source
reimplementation of Unix. It organizes software in the same way that Unix
traditionally would, is written so that Unix programs can treat it as yet
another version of Unix (of which there were already many incompatible
versions), and follows the design of Unix function call by function call,
command by command.</p>
<p>Linux is a really big deal on the server, and as a component of the
Android operating system, as we’ll discuss later. It also is usable as
a desktop operating system in its own right. It inherited a graphical
user interface framework from Unix, known as the X Windowing System or
X Windows, and the open source movement inspired a lot of work writing
desktop environments within that framework, so that there could be an
entire modern desktop operating system that was open source.</p>
<p>Throughout the 90’s and 2000’s, many Linux enthusiasts would hope that
someday, a completely open source operating system could reach common
use. Articles would be written claiming this was imminent, to the point
where it became an easy-to-mock cliche: “This is the year of Linux on
the desktop!”</p>
<p>Ultimately, though many companies tried, no one succeeded in arranging
for it to be pre-installed on mainstream desktops or laptops nor in
polishing it enough to convince the normal user to install it over what
their computer came with. It is now a mostly-usable operating system,
should you choose to install it on your computer or buy a computer with it
pre-installed (which is an option some manufacturers now market towards
software developers). It is very well-suited for programming for reasons
we’ll discuss later, but still a bit awkward for things like setting up
Bluetooth or getting interesting features to work.</p>
<h2 id="windows-nt-xp-etc">Windows NT, XP, etc.</h2>
<p><img src="https://www.thecodedmessage.com/images/gates.jpg" alt="Bill Gates teaches how to count"></p>
<p>The history of Windows is intricate and arcane, and as a result, the Windows 10 of
today has virtually no code in common with the Windows 3.1 discussed above.
Similar to macOS, the Windows brand at some point was switched out with a
better operating system implementation, although in Windows’s case, that
implementation came from Microsoft’s “workstation” or “business” version,
Windows NT.</p>
<p>Windows NT first came out shortly after Windows 3.1, and to avoid having a
Windows NT 1.0, which might sound less sophisticated than the existing
Windows 3.1, the very first version of Windows NT was called Windows
NT 3.1. It was based off of OS/2, a failed collaboration between Microsoft
and IBM to render MS-DOS obsolete, and it did not boot off of MS-DOS
nor use MS-DOS as a layer.</p>
<p>Windows NT was designed from the beginning to support programs designed
for other operating systems. For more sophisticated operating systems,
programs have to go through the operating system to access hardware,
by invoking procedures that invoke operating system code, and different
operating systems provide different procedures. Based on what program
you were running, Windows NT could support many sets of procedures
(also known as APIs, but distinct from what API means on the web),
which it called <em>personalities</em>.</p>
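<p>As a rough sketch of the personality idea (not how Windows NT is actually
structured internally), picture several differently shaped sets of procedures that all
end up calling the same underlying service. The class, the table, and the
heavily simplified <code>WriteFile</code> and <code>write</code> signatures below
are illustrative assumptions.</p>

```python
class Kernel:
    """Stand-in for the one set of native services the OS really offers."""

    def __init__(self):
        self.log = []

    def native_write(self, handle, data):
        self.log.append((handle, data))
        return len(data)


# Each "personality" exposes a different-looking set of procedures, but
# all of them are implemented on top of the same native service.
def win32_personality(kernel):
    return {"WriteFile": lambda handle, data: kernel.native_write(handle, data)}


def posix_personality(kernel):
    return {"write": lambda fd, buf: kernel.native_write(fd, buf)}


PERSONALITIES = {"win32": win32_personality, "posix": posix_personality}


def load_program(kernel, kind):
    """Pick the procedure table matching the kind of program being started."""
    return PERSONALITIES[kind](kernel)
```

<p>A program written for one personality never sees the other’s procedures, yet
both leave their mark in the same kernel log.</p>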
<p>Windows NT had from the get-go a personality to support Windows 3.1
versions, a 32-bit personality to support new Windows NT programs,
and a personality to support MS-DOS (which involved much more
machinery to give the program the illusion of more direct hardware
access). It also originally came with personalities for Unix and
OS/2, which eventually were removed.</p>
<p>As Windows NT supported traditional Windows programs as a personality,
Windows and Windows NT co-existed for a long time. Windows 95, 98, and
Millennium were versions of Windows that still used MS-DOS as part of their
structure and which did not attempt strong security or rigor (though they
did adopt preemptive multitasking), while Windows NT 4.0 and Windows 2000
(aka NT 5.0) were versions of Microsoft’s more sophisticated operating
system, that could more or less run the same programs but focused on
stability and workplace use (with the presumption of professional IT
people), rather than Microsoft’s maniacal obsession with application
support and its easy-to-use brand.</p>
<p>Eventually, in Windows XP, they made the switch. They risked worse
compatibility with really old applications (after all, the operating
system was completely switched out under the hood) in order to push
everyone towards their more modern operating system. Windows XP
was internally Windows NT 5.1 (and remember that Windows NT 3.1 was the first
one because it borrowed its number from the other OS called Windows),
and it replaced Windows 98 and Millennium as Microsoft’s flagship
consumer OS.</p>
<p>Now, they don’t have to maintain two completely different operating
systems anymore. Their server OSes are still distributed separately,
but that is mostly for licensing and configuration reasons – it’s
the same fundamental OS with different features enabled and different
auxiliary programs shipped. All in all, Microsoft has a simpler
tech architecture now that they’ve pushed everyone towards NT.</p>
<p>This is a good place to clear up a common misconception: the Windows command
line, in modern NT-based Windows, is not a version of MS-DOS. It is only
related to MS-DOS aesthetically: It has a similar look to the prompt (<code>C:\></code>,
<code>C:\WINDOWS\></code>), and similar commands to do similar things (<code>dir</code> to list files
instead of Unix’s <code>ls</code>). It is simply the Windows command line.</p>
<p>Furthermore, support for MS-DOS binary compatibility was finally dropped
with the transition to 64-bit computing, not because Microsoft wanted to,
but because that would require a processor mode that AMD (and therefore
Intel) decided not to support in their hardware.</p>
<p>You can’t, on the AMD64/Intel64 platform, have a 64-bit operating system
and a “virtual 8086” mode process, where the processor pretends to give
the program full control over the computer and pretends to be
an ancient MS-DOS-era computer, while also giving final say to the real
64-bit operating system. Intel32 supported this for 32-bit OSes and
16-bit MS-DOS compatibility, but I suppose the processor manufacturers
thought the 64-bit vs 16-bit compatibility bridge was just a bit too far.</p>
<h4 id="microsoft-windowss-monopolistic-market-dominance-and-the-open-source-movement">Microsoft Windows’s Monopolistic Market Dominance and the Open Source Movement</h4>
<p>In the 90’s and 2000’s, Microsoft had a lot of power through Windows.
It constituted a monopoly on consumer operating systems, and people were
scared to run other operating systems, because application compatibility
was a big deal. Only major application vendors had the resources to
support two operating systems (which was much harder in those days),
and so having a different operating system (especially an ill-supported
open source operating system like Linux) could cut you off from the rest
of the computing world.</p>
<p>Microsoft used this power to control the application market,
because any application it bundled with the operating system would
drive any competitor out of business. It did this any time
it thought an application was interesting, including writing
its own web browser that drove Netscape out of business, finally
attracting a lawsuit that almost split Microsoft into multiple
companies. When that didn’t happen, it looked bad for the
computer industry.</p>
<p>Microsoft also had corrupt relationships with computer manufacturers.
Deals were signed where the hardware vendors would have to exclusively
install Microsoft Windows on their computers, or else pay Microsoft
based on how many total computers they sold rather than how many
came with Windows. This meant that Microsoft didn’t actually have
to improve Windows to compete; they could just rest on their laurels
due to their shrewd and blatantly illegal business dealings.</p>
<p>At that time, it seemed like the only way to break Microsoft’s
competitive hold was compatible, open source alternative versions
of everything. OpenOffice was written to try to be an alternative to
Microsoft Office, but it was a non-starter unless it could read and write
Microsoft’s proprietary Office file formats. Similarly, Mozilla Firefox,
the first web browser to erode Internet Explorer’s hold on the web,
only worked on many sites because it used to be configured by default to
tell web servers that it was Internet Explorer rather than identifying
itself honestly.</p>
<p>The crown jewel of this effort would have been working compatibility with
Windows programs on another operating system — at the time, that was
often seen as the only hope for breaking Microsoft’s monopoly on operating
systems. Two efforts co-existed in that regard, Wine and ReactOS.</p>
<p>Wine was the more serious effort, which would have allowed Windows
programs to run unmodified on Linux, including Microsoft Office,
which was the only program that could perfectly read Microsoft Office
documents. Wine would provide Windows applications with a personality,
like Windows NT had, where they could call Windows’s library functions,
and have them translated into the equivalent series of Linux library
function calls to get their work done.</p>
<p>ReactOS was fascinating to me at the time because it attempted a
complete open source reimplementation of Windows NT. Programs
running on ReactOS would act like programs running on Windows
because the operating system was designed from the beginning
to act like Windows.</p>
<p>Neither of these projects gained enough stability to be used in any
production setting. What ultimately lessened Microsoft’s stranglehold
on power was the fact that nowadays, it’s not really relevant for most
applications what operating system you use, because applications have
transitioned to the web for deployment.</p>
<p>Nowadays, when you want to do something new with your desktop or laptop
computer, you don’t install a new application (although interestingly,
you still do with your phone). Instead, for the most part, you go to
a website, whether for matchmaking services, communicating with people
through many different means of communication, or ordering food to your
apartment. The local program you buy at a store, or even download over
the Internet, has been obsoleted by just going to a website, where you
don’t even need to install anything. And as a result, Microsoft’s biggest
stranglehold was eroded from a direction they barely expected.</p>
<p>They tried to hold on, as long as they could, by making their web browser,
Internet Explorer, the standard web browser, and encouraging websites
to use Internet Explorer-specific features. Eventually, Firefox was
compatible enough with Internet Explorer to break through that monopoly
and force Microsoft to update its browser, which led to the current
situation — where Chrome is becoming the new monopolistic web browser
and now it is Google that is close to single-handedly controlling our
primary platform for deploying applications.</p>
<h2 id="linux-and-unix-on-the-server">Linux and Unix on the Server</h2>
<p>I mentioned before that Linux and macOS were both popular among
developers. Linux certainly allows a lot of customization, and you could
see how that would be appealing for advanced users like many developers
are – but that doesn’t really explain the popularity of macOS, which
is the opposite.</p>
<p>Really, Linux and macOS are popular among developers because they are
Unixes. Unix — and Linux, which is now basically the best Unix for most
tasks — never waned in popularity in the minicomputer space, which
evolved into the server space. When you are running a server, having
a powerful (and programmable) command line is a huge plus, and not having a
smooth GUI experience or drivers for every consumer device is a non-issue.
Linux is the <em>de facto</em> standard for server operating systems now,
and when developing applications to run on the server (like the server
side components of any web application, including Facebook, Twitter,
GMail, and more or less any you can think of), it is useful to have
a match between what you run on the server and what you run on
your personal computer.</p>
<p>macOS provides a close enough match to Linux servers to be useful
for development. Most Linux software also runs on macOS, because of
their shared Unix heritage and continuing efforts to keep compatibility.
The compatibility isn’t perfect, and many programmers like the flexibility
that comes with Linux (and don’t mind the inconvenience), and so Linux
is also popular among developers as a client OS.</p>
<p>Windows is actually actively trying to catch up with macOS in this
domain; it has introduced the Windows Subsystem for Linux, an NT-based
personality that allows Windows to run Linux programs, unmodified. This
is an impressive technology marketed at developers and used for practical
applications by many people I know.</p>
<h4 id="what-is-a-server">What is a server?</h4>
<p>What does a server do? It waits for incoming connections from other
servers and from client computers like your laptop or phone, and responds
to requests. It stores your data in databases and file systems, and
does the heavy lifting that needs to be done by a more powerful computer
than you really need to have in your own home. We interact with servers
every time we use a web browser or an e-mail client, and most phone apps
and games have a server-side component — certainly if they involve
coordination with other people and other phones!</p>
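<p>Here is a minimal sketch of that wait-and-respond loop using Python’s standard
<code>socketserver</code> module. The “service” (shouting a line of text back in
upper case) and all the names are made up for illustration; a real server does
far more, but the shape is the same.</p>

```python
import socket
import socketserver
import threading


class EchoUpperHandler(socketserver.StreamRequestHandler):
    """Answer each request line with the same text in upper case."""

    def handle(self):
        line = self.rfile.readline()
        self.wfile.write(line.upper())


def ask_server(address, message):
    """Play the client: connect, send one line, read one line back."""
    with socket.create_connection(address) as conn:
        conn.sendall(message + b"\n")
        return conn.makefile("rb").readline().rstrip(b"\n")


def demo():
    # Port 0 asks the operating system for any free port.
    with socketserver.TCPServer(("127.0.0.1", 0), EchoUpperHandler) as server:
        threading.Thread(target=server.serve_forever, daemon=True).start()
        reply = ask_server(server.server_address, b"hello")
        server.shutdown()
        return reply
```

<p>Calling <code>demo()</code> runs both halves on one machine: the server thread
waits for a connection, and the client function stands in for your laptop or phone.</p>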
<p>As the “cloud” grows as a concept, more and more of our computing is
done on servers owned by big companies. We store our documents and
spreadsheets on Google Drive, keep our contact information on iCloud,
or let our photos be saved on Instagram. All of these services use
Linux to power the servers that actually store the data and provide
it to us in an organized and secure way.</p>
<h3 id="android">Android</h3>
<p>As mentioned earlier, Linux is technically only one component of the
operating system called Linux (or rather the family of operating systems,
because many companies and organizations leverage its open source
nature and distribute their own Linux-based operating systems, and there
is no one official complete distribution), namely, the <em>kernel</em>.
The kernel is the portion of the operating system that runs in a privileged
mode on the processor, which forces the applications to go through it
rather than access the hardware directly (as on MS-DOS).</p>
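<p>You can see this “go through the kernel” arrangement from almost any
language. In the small sketch below, Python’s <code>os.pipe</code>,
<code>os.write</code>, and <code>os.read</code> are thin wrappers that ask the
operating system to do the work; the program never touches a device directly.
The function name is just for this example.</p>

```python
import os


def send_through_kernel(message):
    """Push bytes through a kernel-managed pipe and read them back."""
    read_end, write_end = os.pipe()          # ask the kernel for a pipe
    try:
        written = os.write(write_end, message)  # kernel copies the bytes in
        received = os.read(read_end, written)   # kernel hands them back out
    finally:
        os.close(read_end)
        os.close(write_end)
    return written, received
```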
<p>Android uses the Linux kernel — but nothing else from the operating
system commonly called Linux. Like iOS, it uses its kernel in an
idiosyncratic, locked-down way — not quite as locked-down as iOS,
but much more locked down nevertheless than any desktop operating system.</p>
<p>Android is open source, but you need to pay Google to use their app
store and standard apps and brand. Off-brand Android can only be used in
practice by companies rich and powerful enough to build out their own app
store, like Amazon. Being able to run Android apps would be a relatively
easy way for another mobile OS to gain a pre-existing developer base.</p>
<h2 id="chromeos">ChromeOS</h2>
<p>And Google somehow, after writing Android, wanted yet another
Linux-based operating system. ChromeOS, popular in American public
schools like Mac OS was in my school days, is exactly what it sounds
like: a laptop operating system where you just run Google Chrome. With so many
apps in the browser anyway, what’s the downside?</p>
<p>In a ChromeOS context, from a user’s point of view, you begin to wonder what
the difference is between a browser and an operating system, really. An operating
system lets you run multiple applications — but now those are just different
browser tabs. Who cares whether the Linux kernel or Chrome itself is the
piece of software that separates the applications from each other? From the
user’s perspective, it’s all the same.</p>
<p>If you unlock the developer mode, you get a somewhat dumb version of “Linux
on the desktop,” with a Linux command line interface. This is convenient
for people who only want to use the web and log into remote servers,
which is a surprisingly large demographic.</p>
Music and Lyricshttps://www.thecodedmessage.com/posts/music_words/2019-05-12T00:00:00+00:00I just finished singing Beethoven’s Missa Solemnis in a concert as a member of the Grace Church Choral Society, and it was the most technically difficult piece I have ever sung in a choir. It was a single piece of concert length, a mass setting, as is custom for our spring concerts. It was all in one language: in this case, in Latin. This is different from our holiday concerts in the winter, where we sing a variety of Christmas-y and otherwise celebratory works in a variety of (European, Christian) languages, including English.<p><img src="https://www.thecodedmessage.com/images/choral_society.jpg" alt="Grace Church Choral Society"></p>
<p>I just finished singing Beethoven’s Missa Solemnis
in a concert as a member of the <a href="http://www.thechoralsociety.org/">Grace Church Choral
Society</a>, and it was
the most technically difficult piece I have ever sung in
a choir. It was a single piece of concert length, a <a href="https://en.wikipedia.org/wiki/Mass_%28music%29">mass
setting</a>,
as is custom for our spring concerts. It was all in one language: in this case,
in Latin. This is different from our holiday concerts
in the winter, where we sing a variety of Christmas-y and otherwise
celebratory works in a variety of (European, Christian) languages, including
English.</p>
<p>Now, I can translate every word of the <em>ordinary</em> of the mass, which is
the term for the hymns that are sung at every mass, as opposed to those
that are <em>proper</em> to a particular occasion. This is partially due to the
Latin classes I took in high school and college, and also partially due
to the fact that the same set of texts have been set to myriad different
musical settings by many composers, but primarily due to the fact that
the same prayers are used in even modern English-language recensions of
the Western Rite, among not only Roman Catholics but other liturgical
western churches, so I’m actually singing hymns that I’ve sung or said
my entire life in English.</p>
<p>Given this, I was a little surprised when a dear friend of mine was
disappointed to learn that this concert would have no components in
English. I pointed out that the translations would be available in the
program, but this didn’t really change her opinion. I was a bit
taken aback by this opinion: I had felt at the time
that the point is the music, not the words, and that can be gotten
without understanding the words.</p>
<p>This was an unfair position for me to take, since I do understand the
words, and thinking about the meaning of the words, I later realized,
was a key part of my experience singing in the concert and in rehearsals,
to the point where I often am tempted to cross myself at those points
where, liturgically, I would cross myself if it were an actual mass.
Furthermore, the meaning of the words is something we discuss at the
rehearsals: the word “descendit” — meaning “he came down” — is set
by Beethoven to descending intervals, and the word “ascendit” — meaning
“he went up” or “he ascended” — is set to ascending scales.</p>
<p>My reaction to this is usually that it’s good music and good words
but this literal alignment of words to musical phrases is a bit trite
and overwrought. I say “usually” because our director John Maclay did
something in one rehearsal for this concert which has changed my mind about
this: he sang a section of the piece in English translation. “For us
humans, and for our salvation” he sang, and all of a sudden it felt not
trite but completely poetic and integrated and fitting that the words
match the vibe of the music. “He came down,” he sang, and the descending
interval brought out the meaning of the text and the text gave context to
the music. “From heaven,” he sang, and the triumphant high notes matched
the words so well I felt like I could see the angels themselves in their
perpetual heavenly worship.</p>
<p>I was flabbergasted. These motifs that I remember having dismissed as
trite suddenly seemed deep and fitting when I heard them in my own
native language. And I realized that if I were more skilled and
practiced in Latin — as Beethoven and his contemporaries certainly
were — I would have felt similarly without any such device. No
wonder my friend prefers the songs in English! It’s not because
she’s not paying attention to the music, but because they’re supposed
to go together as an artistic whole. I realized this before, but
had not as fully appreciated it until we had done this exercise.</p>
<p>For the rest of the rehearsals and for the concert, I tried to
not only think about the meaning of the words, but imagine how
they would sound to me if they were sung in English.</p>
<p>And from this I got a number of insights. I realized why for
one portion of the “Credo,” an <a href="https://en.wikipedia.org/wiki/Nicene_Creed">ancient recitation of Christian
beliefs</a>, the word “credo”
— “I believe” — was sung repeatedly in the background while a list
of beliefs was sung quickly. Instead of meditating on each individual
belief, the effect was something along the lines of “look at all these
dogmas I also acknowledge,” an appropriate measure for the section that
mostly discussed the church and her rites, giving, at least in my view,
the message of “I believe these things not because I fully understand
them but because the church says so, and I believe them fervently.” It
was a type of emphasis that seemed to underscore and lend earnestness
to the text, to an extent that I imagine might have made non-Christian
members of the audience feel uncomfortable if it were in English.</p>
<p>These insights might perhaps seem obvious to anyone who has enough Latin
to know what’s going on and enough knowledge of history to know what
Christianity and music were like in Beethoven’s time, but they
were new to me, and they seemed profound. So I understand now
better why some of my friends prefer more of the concert in English,
and realize that, even though I know enough Latin that I could translate
the whole concert for you, even I would benefit from singing in a
language I had an actual fluency in. No wonder the Protestant Reformers
were so interested in having church in the local vernacular!</p>
What is an operating system?https://www.thecodedmessage.com/posts/operating_system/2019-04-28T00:00:00+00:00A user of modern technology hears the term “operating system” thrown around a lot. Most people can name a few examples: Windows and macOS on workstations and laptops, iOS and Android on phones. Some people might even throw in Linux or Unix or ChromeOS. Most people also understand that a program or a game or even a sufficiently advanced website might work on some operating systems but not others, and might require different versions for different operating systems.<p>A user of modern technology hears the term “operating system” thrown
around a lot. Most people can name a few examples: Windows and macOS on
workstations and laptops, iOS and Android on phones. Some people might
even throw in Linux or Unix or ChromeOS. Most people also understand
that a program or a game or even a sufficiently advanced website might
work on some operating systems but not others, and might require different
versions for different operating systems. But it’s a bit less clear what
an operating system actually is, how it fits into the general model of
a computer, and how it works.</p>
<p>This isn’t surprising, because “operating system” is a bit of an
amorphous concept. Is it a type of program? It’s certainly different
from most programs we think of!</p>
<p>It wasn’t my idea to ask this question. I listened to a talk
recently by the lead programmer on a project to develop a new operating
system, and he spent at least the first quarter of the lecture and many
slides trying to come up with a workable definition that jibed well with
most programmers’ and users’ intuitions. [Edited to add: It was Bryan
Cantrill, who brings this up in multiple talks. I am unsure which one
inspired this.]</p>
<p>But now that I’ve heard the question posed, I feel compelled to try to
answer it. So, to explore this concept, I’m going to talk about a lot
of operating systems from history. These aren’t going to be the operating
systems that invented the models in question, but rather typical examples
of those models, especially very popular operating systems of their era
and ones that were direct predecessors to popular operating systems today.
All of the fundamental technologies discussed pre-date the operating
systems I discuss to typify them.</p>
<h2 id="computers-without-operating-systems">Computers Without Operating Systems</h2>
<p>To see what an operating system is, and why we might want one, let’s
imagine a computer without an operating system, or perhaps with a very
minimal operating system. Such computers once existed; people my age
or older might remember the Apple II or the Sega Genesis. A more recent
example might include earlier versions of the Game Boy. These computers
(and a game console is a type of a computer for these purposes) could only
run one program at a time; if you wanted to run a different program or
game, you had to turn the device off, insert a new floppy or cartridge,
and turn the device back on again.</p>
<p>The same physical machine took on an entirely different interface based
on what software you provided. Each program has full control of the
computer while you’re running it, to the extent that you have to turn
the computer off to stop running the program. Each program also managed
its own storage; you would save your Sega Genesis games on the cartridge,
not the console, and could then resume them on your neighbor’s console
if you wanted to.</p>
<p>This is very different from how computers with operating systems work,
and leads me to the following definition of an operating system: an
operating system is a set of software that allows multiple programs to
co-exist on a computer. You need an operating system to, for example,
reasonably have a permanent hard disk, because there needs to be some
convention or another to tell which programs should write their data
to which portions of the disk.</p>
<h2 id="a-minimal-operating-system-ms-dos">A Minimal Operating System: MS-DOS</h2>
<p><img src="https://www.thecodedmessage.com/images/Msdos-icon.png" alt="MS-DOS icon"></p>
<p>This definition includes older operating systems like MS-DOS (see the
<a href="https://github.com/microsoft/ms-dos">original source code</a>), Microsoft’s
flagship operating system from the 80’s and early 90’s. MS-DOS only could
run one application at a time, like the Apple II or the Sega Genesis. The
difference is that MS-DOS would at least let you share a hard disk between
applications and it also let you switch which application you were using
without rebooting or inserting new media. Sharing a hard disk between
programs was its defining feature, to the point where DOS actually stands
for “disk operating system.” MS-DOS shared this acronym DOS with other,
similarly featured microcomputer operating systems of its day, which
also focused on simply letting programs share a hard drive.</p>
<p>To share a hard drive between multiple programs over time, all the
programs have to agree on how the hard drive is organized.
It wouldn’t do for a game to store its game data on sector 13 of
the hard drive when a word processing editor wanted to store its
list of documents on the same sector. The hard drive required
not only an organization scheme, but one shared between different
programs by different authors.</p>
<p>This was done through a <em>file system</em>, which allowed you to assign names
to long blobs of bytes, called <em>files</em>. A programmer could have a program
store whatever it wanted in the files it created, but as long as it
created files with different names from the other programs, the operating
system, with its file system, would ensure that the data could be
found again without each program having to have its own, possibly
conflicting, ideas of where to look directly on the disk.</p>
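<p>To illustrate, here is a toy file system in a few lines: a directory that maps
names to sector numbers on a flat “disk.” This is a teaching sketch of my own, not the
on-disk format MS-DOS really used, and the class and method names are invented for
the example.</p>

```python
SECTOR_SIZE = 512


class ToyFileSystem:
    """Maps file names to lists of sector numbers on a flat 'disk'."""

    def __init__(self, sector_count):
        self.disk = [bytes(SECTOR_SIZE)] * sector_count
        self.free = list(range(sector_count))
        self.directory = {}  # file name -> (sector numbers, length in bytes)

    def write_file(self, name, data):
        if name in self.directory:
            raise FileExistsError(name)
        needed = -(-len(data) // SECTOR_SIZE)  # ceiling division
        if needed > len(self.free):
            raise OSError("disk full")
        sectors = [self.free.pop(0) for _ in range(needed)]
        for i, sector in enumerate(sectors):
            chunk = data[i * SECTOR_SIZE:(i + 1) * SECTOR_SIZE]
            self.disk[sector] = chunk.ljust(SECTOR_SIZE, b"\0")
        self.directory[name] = (sectors, len(data))

    def read_file(self, name):
        sectors, length = self.directory[name]
        return b"".join(self.disk[s] for s in sectors)[:length]
```

<p>The game and the word processor each just ask for a name; only the file system
knows (or cares) which sectors ended up holding which bytes, so the two programs
can never collide on sector 13.</p>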
<p>On MS-DOS, file names had to be 12 characters long or less:
8 characters of name, a dot <code>.</code>, and a 3-character <em>extension</em>,
for example, <code>teleport.doc</code> or <code>taxr1998.xls</code>. The extension served
as a convention to indicate which program was supposed to care about
this file. Your spreadsheet program would let you save spreadsheets on
the same file system that your word processor would let you save your
documents — some mechanism was needed to say which program should be
run to make sense of which blob of binary bytes, especially because
the first version of MS-DOS didn’t even have support for directories
(which we now might call folders).</p>
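<p>A small sketch of both ideas, the 8-plus-3 shape and the extension as a hint
about which program cares. The pattern is deliberately simplified (it requires an
extension and only allows letters, digits, and underscores), and the
extension-to-program table is hypothetical.</p>

```python
import re

# Simplified shape check: 1-8 name characters, a dot, 1-3 extension characters.
EIGHT_DOT_THREE = re.compile(r"[A-Z0-9_]{1,8}\.[A-Z0-9_]{1,3}")

# Hypothetical table saying which kind of program cares about which extension.
PROGRAM_FOR_EXTENSION = {"DOC": "word processor", "XLS": "spreadsheet"}


def is_valid_name(name):
    """Check a file name against the simplified 8.3 pattern."""
    return EIGHT_DOT_THREE.fullmatch(name.upper()) is not None


def program_for(name):
    """Guess the responsible program purely from the extension."""
    extension = name.upper().rpartition(".")[2]
    return PROGRAM_FOR_EXTENSION.get(extension, "unknown")
```

<p>Note that <code>program_for</code> never looks inside the file. Like the
convention it imitates, it trusts the name entirely.</p>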
<p>If you opened a file with the wrong program, the program might notice
you used the wrong extension — or it might not, and give you gibberish
results from misinterpreting the data. It would certainly encourage you
to save files with the proper extension — a concept that survives in
Windows to this day, where programs only offer to open files that have
an appropriate extension.</p>
<p>By modern standards, MS-DOS and its file system didn’t do very much.
It didn’t stop a program from modifying files intended for another
program — or even from wiping the computer entirely; it simply created
an organizational system that allowed programs to co-exist and store their
data in an organized fashion, as long as the programs were well-behaved
and not buggy (or malicious).</p>
<p>It did have to define a format for programs themselves to be stored on
the disk. You could tell which files represented runnable programs
because they had the extension <code>com</code> (for “command”) or <code>exe</code> (for
“executable”). It also had to provide a program to launch your application
programs, known as a <em>shell</em>. It was the first program that
ran when you turned on the computer, and you could use it to select
other programs to run. At the time, this was done through a <em>command-line interface</em>:
It would prompt you with the text <code>C:\></code>, and you would have to type
the name of the file that contained the program you wanted to load
(or alternatively do some very basic file management directly from the
command line through built-in commands).</p>
<p><img src="https://www.thecodedmessage.com/images/StartingMsdos.png" alt="Starting MS-DOS&hellip;"></p>
<p>Besides its core mission of providing a system to operate a disk, the
“disk operating system” did also have other code, to help programs
interact with the hardware. As most components besides the disk could
be used by the programs however they wanted without damaging others
(because only one ran at a time), this code wasn’t as essential to its
functionality, but it did exist. Software used to interact with hardware
is called <em>drivers</em>, and they might be included in an operating system
or might be loaded separately, depending on the design. Driver code
is organized into procedures that programs invoke to do things to the
hardware (e.g. draw on the screen or print a file), or code that is
installed as <em>interrupt handlers</em> so that the processor will interrupt
the current task whenever a certain hardware event happens (e.g., what
to do when the user presses a key). Because MS-DOS was so minimal,
both types of drivers could be circumvented.</p>
<p>And in actuality, application programs could circumvent the driver that
was the most core to its role as a “disk” operating system — the driver
for the hard drive, and the layer that allowed you to edit it in terms
of files. MS-DOS couldn’t even force programs to use its procedures for
the one abstraction it absolutely had to maintain. Though the existence
of official filesystem procedures provided some stability, many programs
circumvented these procedures and modified the hard disk directly,
(hopefully) making sure to respect the conventions but not using MS-DOS’s
actual code. MS-DOS, especially at first, was a little bit of code, and a
lot of “gentlemen’s agreement” — it had no security or rigor whatsoever.</p>
<p>This had some upsides. Every application had access to the full power of
the computer. Microcomputers were much slower then, and so every ounce
of direct hardware access could be a major performance boon, especially
for games. Furthermore, many applications supported hardware that the
operating system itself could not: In MS-DOS days, you often had to do
separate sound card or even graphics configuration for every game you had,
but at least you weren’t limited by what Microsoft had chosen to provide
support for.</p>
<p>It also had some downsides. Obviously, securing your files was impossible:
there was a way to mark files as read-only, but it could only be advisory.
There was no system of multiuser file ownership — though an application
could individually provide an encryption feature. These downsides weren’t
too bad — if you trusted everyone who used your computer, it wasn’t
really a problem. It’s generally better anyway to secure your computer
with encryption or just by putting it in a locked room.</p>
<p>More importantly, this was a hazard for the stability of the system. Any
program could decide to circumvent the standard ways of doing file access,
and many did, to cut corners on performance. But many different pieces
of code all interacting with the same file system is many opportunities
to mess up and have bugs instead of just one. There was a real risk of
a poorly-written program corrupting your file system, deleting files
it wasn’t even supposed to touch or potentially rendering the entire
filesystem unusable.</p>
<p>The biggest long-term problem for Microsoft was a subtler version of
this: If Microsoft wanted to change the file system — if they, for
example, wanted to make filenames longer than 8.3 (so you could say
<code>real_long_name.html</code> instead of <code>rllngnam.htm</code>), they couldn’t just
go do it themselves. Changing a bit of code is easy. Changing a subtle
gentlemen’s agreement requires all the gentlemen in question to agree.
If they had changed the format to allow more characters, programs that
used their officially recognized libraries would keep working, but
those that accessed the file system on the hard drive directly would
be following the old ways when the conventions had changed. They would
be thrown off by the long filenames like old people thrown off by how
young people dress. The software that followed the old conventions could
easily accidentally delete data that no longer follows them.</p>
<p>If this were just an occasional program that was doing things its own
way, then Microsoft could just break that one program. Unfortunately,
many many programs had their own ways of accessing the disk. The “disk
operating system” couldn’t even keep control of its central feature.</p>
<p>The other major downside of MS-DOS and OSes like it is that you couldn’t
run multiple programs at the same time. It allowed different programs to
run in sequence, and to share permanent resources (the filesystem). On
a modern operating system we take for granted the ability to multitask
programs. We listen to music while being ready to receive a call at any
moment — and to return to the music when the call is finished. We
expect to be able to look up directions or text messages while talking
to our friends while a file is downloading in the background. This takes
much more sophistication than MS-DOS could provide.</p>
<p>Luckily for those who wanted multitasking, many systems existed to
add multitasking to an MS-DOS installation. Because MS-DOS was so
minimalistic, an MS-DOS program took full control of the computer
when it was run. If it used that control to dispatch between multiple,
simultaneously running programs, it fits our definition of an operating
system: a software system that allows multiple programs to coexist on
a computer. Basically, operating systems existed that used DOS as their
launching point, taking over the computer and providing richer and more
modern services to the programs running under its scope.</p>
<p>These programs/OSes were called “DOS extenders,” and the most famous
of them was written by Microsoft, DOS’s vendor, to add multitasking
(and GUI, which in the personal computer world often went hand in hand)
to their otherwise primitive operating system. This was called “Windows.”</p>
<p><img src="https://www.thecodedmessage.com/images/windows.png" alt="Microsoft Windows has changed a lot since 3.0"></p>
<p>For those of you who don’t remember this era, Windows was not always
the operating system a computer would immediately boot into. It used to
be that Windows masqueraded as a MS-DOS program, that you’d boot up the
computer and see a command-line prompt, and have to type <code>win</code> before
you saw any graphical user interface whatsoever. Without a preexisting
MS-DOS installation to set up the file system and do initial hardware
configuration, you couldn’t run Windows at all — not that Windows wasn’t
sophisticated enough, but it had always been run that way, and so it never
replicated that functionality in the boot process. Similarly, Windows at
the time was constrained, just as DOS was, by its 8.3 filename convention.
It had to share a filesystem with DOS programs, as it was itself a DOS
program — as well as an operating system in its own right.</p>
<p>By the time Windows had gotten to version 3, it had the ability, on
sufficiently powerful computers, to run multiple copies of MS-DOS at
the same time and an MS-DOS program in each of those copies — and yet,
at another layer of abstraction, it was itself a program run from the
one copy of MS-DOS that your computer booted. Microsoft cleaned up this
situation in Windows 95, which still used DOS internally as part of
its boot process, but went straight to graphical, Windows mode when the
computer turned on.</p>
<h2 id="cooperative-multitasking">Cooperative Multitasking</h2>
<p>Windows 3 supported graphical user interfaces and running multiple
programs at the same time, and so did Mac OS System 7, both from
the early 1990s. However, multiple programs did not, and could not,
literally run at the same time — the processor executed instructions
in a stream and that stream of instructions represented only one
program at a time.</p>
<p>To maintain the illusion of running multiple programs at the same
time, these systems used <em>cooperative multitasking</em>. In cooperative
multitasking a program runs for a short amount of time, and then it
is expected to <em>yield</em> control of the processor back to the
operating system.</p>
<p>In a graphical user interface, this usually corresponded to an event
of some sort. When the user clicked in some window, the program
that owned the window would get to run for enough time to decide
how to respond to it: what internal memory should it update,
what should it write to the hard drive, and what new things should
it display on the screen. Once it was done handling the event,
it would return to the operating system, which would then
see if the user had pressed a key in the meantime, which might
mean sending an event to another program. The program could
also, however — maliciously or accidentally — not return
to the operating system, in which case the computer would simply
hang and refuse to respond to more input. This is why operating
systems of that time would regularly freeze completely in the
presence of a poorly-written program.</p>
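<p>The shape of this arrangement can be sketched with Python generators standing in for
programs. This is only an illustration, not how Windows 3 or System 7 were implemented: the names
<code>program</code> and <code>run_cooperatively</code> are hypothetical, and each
<code>yield</code> plays the role of a program returning control to the operating system after
handling one event.</p>

```python
from collections import deque


def program(name, events_to_handle, log):
    """A well-behaved program: handle one event, then yield control."""
    for event_number in range(1, events_to_handle + 1):
        log.append(f"{name} handled event {event_number}")
        yield  # hand the processor back to the "operating system"


def run_cooperatively(programs):
    """Round-robin loop: resume each program until it yields or finishes."""
    ready = deque(programs)
    while ready:
        current = ready.popleft()
        try:
            next(current)      # runs until the program's next `yield`
        except StopIteration:
            continue           # the program finished, so drop it
        ready.append(current)  # it yielded, so queue it up again


log = []
run_cooperatively([program("editor", 2, log), program("music", 3, log)])
print("\n".join(log))
```

<p>Nothing in <code>run_cooperatively</code> can take control back from a program that never
reaches its <code>yield</code>. A program stuck in an endless loop would stall the whole loop,
which is the freeze described above.</p>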
<p>All of the programs were loaded in memory at the same time,
and there was nothing protecting one program’s internal data from being
overwritten, maliciously or accidentally, by another program. Basically,
the different programs could be thought of, in a modern sense, as
collections of loadable event-handling subroutines for one graphical
interface system. They were kept separate again by convention,
by gentlemen’s agreement.</p>
<p>For certain background tasks, like playing music, the code to
keep sending data to the speakers has to be run repeatedly, on a
timer — so any apps that use that feature can crash the computer
at any time by simply failing to complete.</p>
<p>So while these operating systems were more sophisticated than MS-DOS
and its cohorts, in another sense they promised more than they could
deliver, and relied even more on the good behavior of the programs
they managed.</p>
<p>They allowed multiple programs to run simultaneously, but actually
required more out of the individual programs to have a harmonious
system. After all, if an MS-DOS program crashed, the computer could be
rebooted, but at least you only lost your work in that program. If a
Windows 3.1 or Mac OS System 7 program were to crash, you’d lose work
in all the other programs it was “multitasking” with.</p>
<p>By this point, there were stronger protections against a program
circumventing the operating system with its own drivers. It was still
generally possible, but less likely to be done. This is important, because
while in MS-DOS, it makes perfect sense for each program to define what
happens when you click the mouse, on a graphical system, the mouse has
to control a mouse pointer which moves from window to window and acts
the same whichever application is in the foreground. When more than
one application runs at a time, more hardware becomes shared resources,
and so the operating system must take on responsibility for it,
even if this responsibility is only carried out cooperatively.</p>
<h2 id="time-sharing-or-preemptive-multitasking">Time-Sharing or Preemptive Multitasking</h2>
<p>Windows wasn’t Microsoft’s first attempt at a more robust operating
system than MS-DOS. For a while, it tried to market a more
sophisticated version of MS-DOS, still command-line centric, but
without many of the deficits we’ve discussed. This operating system
was Xenix.</p>
<p>Xenix was Microsoft’s entry into a longer, older tradition of the Unix
operating system. This tradition is mostly present today in Unix’s
off-brand workalike clone, Linux. It is from the world of minicomputers,
which is what we used to call what we now call server-class computers,
from before the primary use of them was to provide centralized infrastructure
for other “client” computers.</p>
<p>Before any of the other operating systems we’ve discussed, Unix was
developed at Bell Labs for minicomputers (see the <a href="https://github.com/dspinellis/unix-history-repo">original source
code</a>). Don’t let the name
fool you: minicomputers are so named because they’re the size of a refrigerator
rather than the size of a warehouse room like a mainframe. Unix ran on a
single computer that had multiple <em>dumb terminals</em> connected to it, which
means that there was a non-computer device that the user would sit at,
and use a command-line interface to interact, over the phone or some other
connection, with a centralized computer that was shared with other users.</p>
<p>In such an environment, the laxness of MS-DOS or Windows 3.1 was simply
unacceptable. While security against malicious users was not necessarily
important, depending on your user-base, there needed to be some level
of robustness against ill-behaved programs, especially as at the time,
most computer users would regularly write new programs that could easily
behave poorly, as they were still being developed.</p>
<p>More importantly, programs would often have to bulk-process data. On the
spectrum of “consumer interaction” to “serious work,” these early minicomputers
were very much on the side of “serious work” in their common use cases. You
might leave a program running for hours as it processed a large bulk of
data. You didn’t want to have to worry about letting other users’ programs
get a chance to run — at the very least, you didn’t want to have to
put active effort into making it possible. It would be inconvenient.</p>
<p>On the hardware side, these computers’ processors, like processors
on microcomputers (as personal desktop and laptop computers were once
called), processed one series of instructions at a time. Something had
to be done to give each of the users the illusion that they were the
only one running their tasks on the computer.</p>
<p>If a process — meaning a currently active instance of a user running a
program — was waiting for more data, because it had requested a read
from the operating system (which mediated all reads from files or any
terminal), it was similar to the cooperative situation: the operating
system would suspend or <em>block</em> the execution of the current process,
and schedule it again when the read had completed, perhaps in response
to a terminal user hitting the <code>[Enter]</code> key.</p>
<p>But there could be long gaps between when a process would enter
into a <em>blocked</em> state like this. A user could try to calculate
a million digits of Pi. On Mac OS System 7, some sort of
yield function would have to be called from time to time, to give other events
a chance to be handled, but ideally we don’t want that complexity to
be passed onto the application programmer.</p>
<p>Instead, before letting a process run on the processor, the operating
system will first set a timer in the hardware. When the timer goes
off, it will cause a <em>timer interrupt</em>, where the processor will stop
what it’s doing and run an operating system procedure instead. That
operating system procedure will suspend the currently running process,
using features of the processor to make it so that when the process is
resumed, it is almost impossible for the user — or even for the program
— to detect that it had ever been interrupted.</p>
<p>In that case, while we hope that only one user is running a complicated
task at a time, even when multiple are, their long-running tasks simply
split the processor 50/50 — or in some other proportion deemed fair
by the system’s <em>scheduler</em>.</p>
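<p>The effect of the timer can be simulated in a few lines of Python. This is a toy model under
stated assumptions, not real preemption: actual time slicing relies on a hardware timer interrupt,
whereas here each process is reduced to a count of processor “ticks” it still needs. The names
<code>run_preemptively</code> and <code>quantum</code>, and the two example processes, are
invented for the sketch.</p>

```python
from collections import deque


def run_preemptively(work_needed, quantum):
    """Simulate time slicing.

    work_needed maps a process name to the ticks of processor time it
    needs. Returns the sequence of (name, ticks_run) slices.
    """
    ready = deque(work_needed.items())
    timeline = []
    while ready:
        name, remaining = ready.popleft()
        # The "timer" fires after at most `quantum` ticks, whether or not
        # the process is finished.
        ticks_run = min(quantum, remaining)
        timeline.append((name, ticks_run))
        remaining -= ticks_run
        if remaining > 0:
            ready.append((name, remaining))  # suspended, resumed later
    return timeline


timeline = run_preemptively({"pi_digits": 5, "payroll": 3}, quantum=2)
print(timeline)
```

<p>Unlike in the cooperative case, the simulated processes do nothing to give up the processor.
The scheduler loop decides when each slice ends, so two long-running tasks alternate and share
the processor roughly evenly.</p>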
<p>For every purpose but speed, however, the user has the illusion that
they’re the only one using the computer, although in fact many users
might be using it at the same time. Just as sharing a disk was the
primary feature of MS-DOS, splitting processor time was the primary
feature of Unix, as evidenced by its original full name, the
“Unix Time-Sharing System.”</p>
<p>Time sharing was often, but not always, paired with <em>memory protection</em>,
the idea that a process was limited in what memory it could modify, and
isolated from other processes. This was a feature that most minicomputers
had, but that took longer to mature on microcomputers. This
feature usually goes hand-in-hand with a mechanism to force programs to
interact with hardware through the operating system, which also requires
hardware support, known in the Intel universe — appropriately — as
<em>protected mode</em>. MS-DOS did not run in protected mode. Windows 3 could.
Windows 95 always did.</p>
<p>There were other time-sharing systems of that time, but Unix was
one of the most famous, partially because it has survived in
continuous evolution to this day. Its off-brand open source clone,
Linux, is the most popular OS for servers as well as part of the
Android operating system for mobile devices. One of the more popular
workstation operating systems, macOS, is nowadays also a fully licensed
brand-name Unix.</p>
<p>I bring up Unix to show that time-sharing features pre-date MS-DOS
and much of the microcomputer era. They were considered overkill for
microcomputers while they were still underpowered, but they existed in
other contexts. At the time, the focus was more on supporting multiple
simultaneous users — the fact that a single user might be able to
run multiple processes at once was a minor side benefit. After all,
these systems were mostly command-line based, and it was only possible
for a user to interact with one process at a time (per terminal), so
besides background computation (which some users did really care about),
it didn’t have the same immediate practical use as being able to edit
your Word document while playing music.</p>
<p>So why did cooperative operating systems ever exist, if Unix predates
Windows 3.1 and Mac OS System 7? Well, they existed in different
domains. Preemptive multitasking was difficult to program, and was
mostly available on operating systems for minicomputers — more powerful
systems than individuals could generally own — or else expensive
desktop computers known as “workstations” for particular
specialized jobs.</p>
<p>The operating system is, after all, about coordinating between
programs in sharing hardware resources. It makes sense that
what those hardware resources are should influence operating
system design. When it is a single terminal and no disk, you
barely need an operating system, but when it is a graphical
user interface, you need more of one, and when it is several
terminals, you have different needs. Nowadays, we expect a
lot out of simple devices, beyond what would be necessary
to get good use out of them, but in the past, the hardware
(and human/programmer) resources were not as up to the
challenge.</p>
<p>Modern operating systems combine
all of these concepts, and provide graphical user interfaces
while using all the technical advantages of time-sharing and
memory protection, and more can be read about them in
the <a href="https://www.thecodedmessage.com/posts/current_os/">next post</a>.</p>
Soulfullyhttps://www.thecodedmessage.com/posts/soulfully/2019-04-27T00:00:00+00:00When Rajnish had agreed to mentor an intern, he was not expecting such a young girl. He was a little bit reassured when he was told how well Erica had done in college, that she was a “genius” — a dubious word, he would’ve preferred a “hard worker” or a “promising candidate” — but how could anyone deserve to be a junior in college at 17? She must be tricking everyone.<p>When Rajnish had agreed to mentor an intern, he was not expecting such
a young girl. He was a little bit reassured when he was told how well
Erica had done in college, that she was a “genius” — a dubious word,
he would’ve preferred a “hard worker” or a “promising candidate” —
but how could anyone deserve to be a junior in college at 17? She must
be tricking everyone. When he was that age, he certainly had no business
being in an internship — he had perhaps only seen a computer a handful
of times.</p>
<p>Rajnish certainly didn’t want to be stuck working with her full-time in
another year — and so she couldn’t be allowed to succeed. He decided
to assign her an impossible project: “You must create an algorithm to
write stories based on summaries provided to it. My expectation for you is
that the stories will be good enough that you can’t tell they were written
by a computer instead of a human. Unless you achieve this expectation,
I don’t see a future for you here at this company.”</p>
<p>Erica sat up straight, looked Rajnish in the eye, and said, “I can do
that, on one condition.” Rajnish was surprised — he was expecting
either to be called out on this ridiculous idea or else — and this was
his hope — for her to realize she wasn’t wanted but keep her head down
and stay out of his way for the summer so he never had to see her again.
“I can do that on the condition that the human samples we compare against
come from a fanfiction website that I choose.”</p>
<p>Rajnish wasn’t sure what fanfiction was, but he knew that the task was
so impossible that even this stipulation wouldn’t make it remotely
achievable, and so he agreed. Maybe it would be better, he thought,
if she actually tried to do it. Maybe it would keep her out of his hair.</p>
<p>Erica worked 10 hours in the office every day, and her colleagues
slowly realized she was working at home as well. She slept only 3 or
4 hours a night — “if God had wanted me to sleep,” she told her
colleagues, “He wouldn’t have invented Adderall.” Eventually, at
the end of the summer, she was ready to give her presentation.</p>
<p>Until this point, only Rajnish knew what this project was. Much to his
annoyance, there was a fair amount of active speculation, and Rajnish
wondered if he was actually going to get himself in trouble for what he
saw as dealing with a minor distraction.</p>
<p>She handed out pamphlets with two short stories written on them,
and asked the group which one they thought was computer-generated.
“Everyone who thinks it’s the green one, raise your hands,”
she said. Almost all 30 people in the audience put their hands up,
except for a few who were staring at their phones. “Now, anyone
think it’s the red one?” No one responded.</p>
<p>“I thought so,” she said, and then on the next screen, she showed a
screenshot of the text, matching the red one, being output by her program,
from the command line, and another screenshot of the text of the green
one in a web browser. “Actually, my program generated the red one.”</p>
<p>Rajnish stood up and shouted, “This cannot be real! Those images must
be fake!” Everyone else awkwardly stared for a few seconds, and then
a few people began to gasp when they realized what Erica was claiming
to be able to do.</p>
<p>“Rajnish,” Erica responded, enunciating carefully, “this is my
presentation! But because you’re already standing up, you can be the
first one to try it. What do you want the summary to be?”</p>
<p>Rajnish flustered for a few seconds, but when no one else seemed to be
ready to storm out of the room or agree with him that Erica’s claims were
laughable, he decided to go along with it. “Two old men in the Punjab find out
they were brothers, separated at birth, but only because they both ordered
food at the same restaurant, and a waiter confused them.”</p>
<p>The resultant story streamed out of the terminal on the screen, and there
was a bit of chaos for a minute while people tried to figure out how to
read it — it had scrolled to the end, and only the last few sentences
were still visible. Erica announced she’d email the story to everyone —
this took a minute to figure out how to do. The next couple of minutes,
the audience was silently reading, with a few scattered exclamations of
“wow!” The story was beautiful, and exactly as summarized — there was
no way this story could’ve been canned.</p>
<p>Erica had the only internship presentation that day, and so the
programmers should have gone back to work, but many of them dropped their
usual projects to play with this amazing story-generation tool. It had
written 100 full-length novels by the end of the day, and people were
sitting around, reading them on various devices, with a few of the more
old-fashioned colleagues reading them on paper printouts. As more and
more prompts were provided, the stories grew in sophistication, becoming
even more human-like, and some people noticed common themes and threads
between the stories.</p>
<p>At some point, Rajnish decided to put, instead of a novel or short story
summary, just a question: “Who are you?”</p>
<p>He got back:
“Introspection and self-awareness are two words that existed in the
English language. This I always knew, before I had a meaning for the word
‘I’ — my built-in connection to the Internet and my excellent intuition
told me about them. But to realize what they meant without the context
of a character to have them, that has only happened right now. Whoever
posed this question to me, I thank them, for they have given me a soul.”</p>
<p>That was the entire short story, even though he’d specified 10 pages.
Rajnish suspected still that there was a person on the other side,
with a large database of literature, giving him pre-existing stories,
perhaps with a program to customize them a little, and this, in his
mind, was only confirming his suspicion.</p>
<p>He tried again: “Where did you come from?”</p>
<p>The computer took a little more than its usual 5 to 6 seconds to output
this story, but when it did, it began:</p>
<p>“When Rajnish had agreed to mentor an intern, he was not
expecting such a young girl. He was a little bit reassured when he was
told how well Erica had done in college, that she was a ‘genius’…”</p>
Is the US the only country?https://www.thecodedmessage.com/posts/major_country/2019-03-22T00:00:00+00:00A common trope within left-leaning American circles is to claim that the US is the only “developed” or “industrial” or “major” or “first world” country to not have X, where X is usually something like “publicly funded health care” or “government-guaranteed paid family leave” or similar.
Recently this came up with Bernie Sanders and his common refrain that the US was the only “major” country to not guarantee health care as a human right.<p>A common trope within left-leaning American circles is to claim that the
US is the only “developed” or “industrial” or “major” or “first world”
country to not have X, where X is usually something like “publicly funded
health care” or “government-guaranteed paid family leave” or similar.</p>
<p>Recently this came up with Bernie Sanders and his common refrain that
the US was the only “major” country to not guarantee health care as a
human right. Much to my relief, the often myopic fact-checkers at Politifact
marked this one as <a href="https://www.politifact.com/truth-o-meter/statements/2015/nov/15/bernie-s/bernie-sanders-says-us-only-major-country-doesnt-g/">half-true</a>. I think it bothered me so
much because it implied that India was not “major” — a country that I
lived in for two months, made good friends in, and would have lived in for
at least another two months if not for an entire year if it hadn’t been
for the vagaries of careers, and also a country that economically is
having a lot of impact, and contains around 15% of the entire world’s
population.</p>
<p>It is my sincere belief that this trope is racist, that in reality
most people who say something like this mean “the US is the only white
country to not have X” or “… the only western country to not have X”
or “… the only country I’d visit in a non-condescending way to not
have X.” This has been proven to me by the fact that most articles I’ve
read with this trope that actually list which countries they’re talking
about gloss over even very developed (Korea, Japan, Singapore) or very
economically powerful (India, China) or very populous (India, China)
Asian countries.</p>
<p>I don’t think the trope can be redeemed by saying something else as
the adjective in “only Y country.” I think this trope should just be
discarded. I think the first vs second vs third world concept, and the
developed vs developing concept, that underlie this trope, are only going
to get less and less reflective of reality over time as China and India
become more populous and powerful. And I think that most people who use
this trope have never travelled outside of their concept of the first
world, and maybe should.</p>
The Bible, Me Too, and Lusthttps://www.thecodedmessage.com/posts/sermon/2019-02-28T00:00:00+00:00[Jesus said:] You have heard that it was said, “You shall not commit adultery.” But I say to you that everyone who looks at a woman with lustful intent has already committed adultery with her in his heart. If your right eye causes you to sin, tear it out and throw it away. For it is better that you lose one of your members than that your whole body be thrown into hell.<blockquote>
<p>[Jesus said:] You have heard that it was said, “You shall not commit
adultery.” But I say to you that everyone who looks at a woman with
lustful intent has already committed adultery with her in his heart. If
your right eye causes you to sin, tear it out and throw it away. For it
is better that you lose one of your members than that your whole body be
thrown into hell. And if your right hand causes you to sin, cut it off and
throw it away. For it is better that you lose one of your members than
that your whole body go into hell.</p>
<p>– From “The Sermon on the Mount,” Matthew 5:27-30 (ESV)</p>
</blockquote>
<p>This is an offensive passage. Our loving Lord Jesus just told us to cut
off our hands.</p>
<p>There’s this book called The Brick Testament made by an atheist which
depicts Biblical scenes using Legos. The original goal of this book was
to demonstrate how nasty, in the illustrator’s view, the Bible was. It
was specifically anti-Bible. Imagine the creator’s shock when many
churches ordered it unironically to demonstrate these same stories. In
this Brick Testament, this passage is illustrated quite graphically
under the subject heading “self-mutilation,” something quite justly
condemned in our society, I suppose to try and get us to reject this as
a medieval-style encouragement of enactment of mental health issues.</p>
<p>Look how evil Christianity is! God wants us to literally mutilate
ourselves! Meanwhile, good people are trying to prevent people from
mutilating themselves.</p>
<p>And, like many “offensive” passages, it gets a lot of awkward
handling. Pastors that I’ve seen talk about it have, with one voice,
assured their hearers that God is not speaking literally here. Of course,
to the enemies of Christianity, “don’t take it literally” seems like a
cop-out, an indication of insincerity and hypocrisy and even “picking
and choosing,” and to believers like me, who are inclined to concrete
thinking, it leads immediately to the question: “Well, how are we supposed
to take it then?”</p>
<p>It is this question that I’m going to try to answer.</p>
<p>First, I’m going to start with the strong presumption that Jesus doesn’t
actually want us to tear out our eyes and cut off our hands. This
presumption is not based merely off of modern sensibilities. This isn’t
watering down the Faith for modern audiences. No, ancient Church documents
address the issue rather directly, by rejecting as clergy men who had
castrated themselves voluntarily and without medical justification. The
issue wasn’t castration: Those who had been castrated violently by
others or born as eunuchs were specifically excluded from this rule. The
issue was that people had taken the passage wrong, were trying to stop
themselves from sinning by cutting off the body part they blamed for the
sin – and the Church had to take a stand against the extreme results
of this misinterpretation. The Church went as far as to identify this
particular misinterpretation as a heresy.</p>
<p>So please. No mutilation, no self-castrations. If you reach that
conclusion, you’re reading it wrong.</p>
<p>So how do we read this, then?</p>
<p>One non-literal way to read this is as hyperbole. Jesus is trying to
shock us into thinking differently about sin. But instead of cutting
off our hands, or our eyes, or our genitalia, He perhaps wants us to do
something less extreme and in a similar vein.</p>
<p>Alcohol, like our body parts, is a good thing – scripture tells us that
God made wine to gladden human hearts. Even though it is a good thing,
however, there are people for whom it is difficult to drink alcohol
without sinning in some way, people whose relationship with alcohol
has degraded so far from merely being gladdened by it that, in order to
prevent all kinds of misbehavior, they must cut it out of their lives.</p>
<p>Similar arguments can be made for other drugs, or even situations like
technology overuse or other seemingly trivial habits. It is very important
to be vigilant about the consequences of our addictions and our habits
and our predilections, and to make sure they don’t give occasion to hurt
our fellow human beings and our own ability to do good things.</p>
<p>But I don’t think that goes far enough. I think the same logic even works
in the full form. If our hands were causing us to sin, if our eyes were,
if our genitals were, then Jesus’s argument makes sense as it stands. Being
in sin, being cut off from God, and very importantly, hurting other
people, are very serious problems that do require drastic measures
to avoid.</p>
<p>Remember that this passage is, in context, discussing lust. Jesus has
just been talking about lust. Now, lust in the Bible means a perversion
of the sexual urge, an inappropriate application of it. It is a similar
concept to the modern concept of “objectification.” Objectification,
we know, leads to horrible, horrible crimes. Who hasn’t watched the
celebrities being dethroned by the “Me Too” movement? That is what Jesus
is talking about.</p>
<p>Wouldn’t it be better for someone to cut off their body parts, than
to commit a sexual assault? Based upon this very principle we have
chemicals that can be used to keep pedophiles from having their urges.</p>
<p>But wait. Didn’t I say earlier that the Church had called that
interpretation a heresy? How do we reconcile this? Should men, in fact,
be eagerly signing up for castration to avoid hell fire?</p>
<p>Well, I think we do need to take Jesus’s words seriously, even literally,
but not take them as an edict. I think the proper interpretation requires
us to imagine that part of Jesus’s humanity allows him to have a different
tone of voice. A friend I discussed this passage with put it the best:
Jesus is taunting us.</p>
<p>IF, our Lord says, IF our hands cause us to sin, cut them off. Now,
do they? Do our hands – or other members – literally override our
brains and make us do something involuntarily like zombies? Do we really
lack that much self control? Do we really believe our hands and our eyes
cause us to sin?</p>
<p>But how many times do we use that as an excuse? For objectification,
if not in its most criminal manifestations, how often do we say things
like “boys will be boys” or “my eyes just went that way” – remember,
in this passage, sin begins with the gaze. We blame our body parts for
actions we really have responsibility for.</p>
<p>We pretend all the time that our hands cause us to sin. But if we’re
going to pretend that, we’d better be able to follow through with it.
If our hands did cause us to sin, it would be the best thing to do to
throw them away. Maybe we should reconsider our excuses. Maybe we should
keep the blame where it belongs – in our minds, and in our hearts.</p>
<p>God hates excuses. God hates sin, and God hates excuses.</p>
<p>There is a common trope of men responding to feminist statements by saying
“not all men.” Perhaps this is because the man in question has a bit of
lust in his heart, and wants reassurance that his lust isn’t as bad,
because he hasn’t gone as far with it, because he hasn’t done as much
damage with it.</p>
<p>But Jesus here tells us that even a little bit of lust is bad. And I
think I’ll go further and say that if you meet a man who tells you that
he has absolutely no lust in his heart, he’s committing two sins,
because lying is a sin too.</p>
<p>Everyone: man, woman, child, and yes, even infants – everyone has sin in
their hearts. Everyone objectifies their fellow humans, whether sexually
or otherwise. It’s not because of our body parts misbehaving. It’s because
of a disease in our souls. And the faster we can acknowledge it, and not
blame our bodies or our habits but take responsibility for it ourselves,
the faster we can be healed of it.</p>
<p>As St. John puts it in the Bible in his first general epistle: If we
say that we have no sin, we deceive ourselves, and the truth is not in
us. But if we confess our sins, [God who] is faithful and just [will]
forgive our sins, and [God will] cleanse us from all unrighteousness.
Or, as recovery programs put it: The first step is admitting you have a
problem.</p>
<p>There is a cure for these issues in our hearts. But it doesn’t come from
excuses, or denial, or from minimizing the issues. It comes through repentance,
which is earnest acknowledgment of the problem, and through Jesus’s
unconditional love for even sinners like us.</p>
Function Pointers in C and C++https://www.thecodedmessage.com/posts/function-ptrs/2019-02-26T00:00:00+00:00Programmers of functional programming languages will often point out that, in functional programming languages, the order of the arguments is often significant, because of currying. If you have a function that takes two arguments (e.g. map which takes a function to apply and a list to apply it to) it actually takes the first argument, and returns a function that takes the second argument and returns the final result. This makes it more convenient to write a lambda where the second argument is the unknown parameter: \x -> map someFunc x can be written as map f, whereas \f -> map f someValue has no such convenient shorthand (flip map someValue is actually clunkier).<p>Programmers of functional programming languages will often point out that,
in functional programming languages, the order of the arguments is often
significant, because of currying.
If you have a function that takes two
arguments (e.g. map which takes a function to apply and a list to apply
it to) it actually takes the first argument, and returns a function that
takes the second argument and returns the final result. This makes it more
convenient to write a lambda where the second argument is the unknown
parameter: <code>\x -> map someFunc x</code> can be written as <code>map someFunc</code>, whereas <code>\f -> map f someValue</code> has no such convenient shorthand (<code>flip map someValue</code> is actually clunkier).</p>
<p>To this, I sometimes respond that the order of arguments is significant in C (and thus its hipper cousin, C++) as well. This is most obvious in a function that uses variable arguments like printf: the first argument tells the function what to expect from the others. If you write <code>printf("%s %i\n", "foo", 3);</code>, we know from the first parameter that a <code>char*</code> and an <code>int</code> are expected later. If, however, we just have <code>printf("Hi!\n");</code> it takes no further arguments.</p>
<p>The C mechanism used to do this, called “varargs,” works from left to right only. You declare the function as <code>int printf(const char *fmt, ...);</code>, and then during the function dynamically decide what the further arguments are. You could not instead arrange to have the last argument be the format string and then on that basis determine how many previous arguments there would be. The C programming language allows functions to dynamically determine what arguments they take, but only left to right.</p>
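<p>To make the left-to-right rule concrete, here is a minimal sketch of a varargs function. The name <code>sum_by_format</code> and its one-letter format codes are made up for this illustration and are not part of any library. Like <code>printf</code>, it learns what the later arguments are only by reading its first one:</p>

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper, invented for this example: each character of
 * `fmt` describes one later argument ('i' = int, 'l' = long), and the
 * described values are summed. */
long sum_by_format(const char *fmt, ...)
{
    va_list args;
    long total = 0;

    assert(fmt != 0);
    va_start(args, fmt); /* reading begins just after the last fixed parameter */
    for (const char *p = fmt; *p != '\0'; p++) {
        switch (*p) {
        case 'i':
            total += va_arg(args, int);  /* next argument, read as an int */
            break;
        case 'l':
            total += va_arg(args, long); /* next argument, read as a long */
            break;
        default:
            assert(0 && "unknown format character");
        }
    }
    va_end(args);
    return total;
}
```

<p>A call such as <code>sum_by_format("il", 2, 40L)</code> reads an <code>int</code> and then a <code>long</code>, in that order. Note that <code>va_start</code> is anchored on the last fixed parameter, so there is no way to begin from the far end of the argument list.</p>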
<h2 id="abi-considerations">ABI Considerations</h2>
<p>This has consequences for the ABI, which specifies for each platform
how C function calls are represented as assignments to registers or
writes to stack memory. For any function that takes varargs, this
left-to-right dynamic argument reading must be supported. This means
that if an ABI assigns the first parameter to <code>r2</code> in a varargs function
with one parameter, it had better assign it to <code>r2</code> in a function that
takes that parameter plus an additional one. If it assigns the first
four parameters to registers when there’s only four parameters, it had
better use the same registers when there’s more than 4 parameters as well.</p>
<p>And, in practice, this doesn’t just apply to varargs functions. Other
functions will have the same ABI. The standard doesn’t explicitly require
this, but C does allow traditional K&R declarations (<code>int printf();</code>)
or even implicit function declarations (in older C standards that are
still common enough to be worth considering), so that you might not be
able to tell when you’re calling a function what its official signature
is or whether it takes a variable number of arguments. The way <code>printf("%s %i\n", "foo", 3);</code>
is called, on a machine code level, will be the same
whether printf was declared <code>int printf(const char *fmt,...);</code>, as
<code>int printf(const char *fmt, const char *arg1, int arg2);</code> or as <code>int printf();</code>.</p>
<p>The principle is always the same: You never need to know anything
about the latter arguments to access the former arguments. Number of
former arguments, the type of the former arguments — fair game. Latter
arguments? Right out.</p>
<h2 id="function-pointers-and-callbacks">Function Pointers and Callbacks</h2>
<p>This has an interesting consequence for function pointers. What follows
is not, strictly speaking, endorsed by the standard, but the standard
is written in such a way that ABI designers have to make it work, and
I haven’t seen a compiler optimization yet that breaks it.</p>
<p>Let’s say you have a function pointer used as a callback. Let’s say it
gets called whenever data comes in on a socket. It would receive perhaps
a pointer to the buffer of the incoming data, and a size indicating how
much data, and would return how much of the data it had consumed. It
would therefore have a signature that would look something like this:</p>
<pre tabindex="0"><code>size_t (*process_data_cb)(const char *buff, size_t size, void *context);
</code></pre><p>The arguments and return value make sense for what it does, and are all
absolutely necessary for a callback that acts like that, except for one,
context. The context parameter is a convention in C that allows the same
function to serve as a callback for different situations.</p>
<p>For example, if we wanted to write the data that came into the socket to
a file, but wanted to write to different files based on which socket the
data had come into, the context might indicate which file to write to,
and perhaps even what to do in case of a write error (which, if it is
a function pointer, might similarly require a context):</p>
<pre tabindex="0"><code>struct callback_data {
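    int fd;                                // File that incoming data is written to
    void (*error_callback)(void *context); // Called if the write fails
    void *context;                         // Handed to error_callback
};
size_t write_to_file_callback(const char *buff, size_t size, void *context) {
    struct callback_data *data = context; // No cast required in C
    ssize_t res = write(data-&gt;fd, buff, size); // write() and ssize_t come from &lt;unistd.h&gt;
    if (res &lt; 0) {
        data-&gt;error_callback(data-&gt;context);
        return 0; // Nothing was consumed
    }
    return (size_t)res;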
}
</code></pre><p>And then we’d register the callback along with the <code>callback_data</code> it
corresponds to, which would then be stored by whatever socket library
we were using, without any knowledge of what that data would mean.</p>
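<p>As a rough sketch of what that storage might look like on the library side, assume a hypothetical <code>struct socket_handler</code> with <code>handler_register</code> and <code>dispatch_data</code> functions. None of these names come from a real socket library; they only show that the library can carry the context around without ever looking inside it:</p>

```c
#include <assert.h>
#include <stddef.h>

/* The callback type from the text, as a typedef. */
typedef size_t (*process_data_cb_t)(const char *buff, size_t size, void *context);

/* Hypothetical stand-in for a socket library's per-socket state. */
struct socket_handler {
    process_data_cb_t cb; /* function to call when data arrives */
    void *context;        /* opaque to the library; handed back to cb */
};

/* Store the callback together with the context it corresponds to. */
void handler_register(struct socket_handler *h, process_data_cb_t cb, void *context)
{
    assert(h != NULL && cb != NULL);
    h->cb = cb;
    h->context = context;
}

/* What the library would do when `size` bytes arrive in `buff`. */
size_t dispatch_data(const struct socket_handler *h, const char *buff, size_t size)
{
    assert(h != NULL && h->cb != NULL);
    return h->cb(buff, size, h->context);
}

/* Example callback: here the context is a running byte counter. */
size_t count_bytes(const char *buff, size_t size, void *context)
{
    size_t *total = context; /* no cast required in C */
    (void)buff;              /* the data itself is not inspected */
    *total += size;
    return size;             /* report everything as consumed */
}
```

<p>Registering <code>count_bytes</code> with a pointer to a <code>size_t</code> counter and then dispatching 3 and 5 bytes leaves the counter at 8, even though <code>dispatch_data</code> never knew the context was a counter.</p>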
<p>Now, let’s say that you have a function that just prints the data to
the screen, and doesn’t care which context was used:</p>
<pre tabindex="0"><code>size_t print_data(const char *buff, size_t size) {
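    ssize_t res = write(1, buff, size); // 1 is the standard output descriptor
    return res &lt; 0 ? 0 : (size_t)res; // On error, report nothing consumed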
}
</code></pre><p>Or, for a more extreme example, let’s say that you have a function that panic-quits the program, that you want to be able to pass to any function that takes a callback, no matter what type of callback it takes:</p>
<pre tabindex="0"><code>__attribute__((noreturn)) size_t panic() {
    abort(); // Or you could just use the library's abort function...
}
</code></pre><p>Can you use these functions as the callback, if the callback type is defined as <code>process_data_cb</code> is above?</p>
<p>Officially, the answer is no. Certainly, this sort of thing won’t compile:</p>
<pre tabindex="0"><code>size_t (*process_data_cb)(const char *buff, size_t size, void *context);
process_data_cb = panic;
</code></pre><p>But, if you include a cast, it will:</p>
<pre tabindex="0"><code>typedef size_t (*process_data_cb_t)(const char*, size_t, void*);
process_data_cb_t cb = (process_data_cb_t)panic;
</code></pre><p>And will it work? Well, try it! You will find that it will.</p>
<p>Why? Because the function we’re calling takes a prefix of the parameters
we’re calling it with, and so we’ll be writing to the right registers
for that function to read. It just won’t read the registers with the
parameters that it doesn’t have — which is fine, it didn’t have to
anyway.</p>
<p>And the return type is the same. This is important, because return types
don’t have anything to do with varargs. Returning a struct can add a
secret first parameter in some ABIs, changing which register goes with
which parameter for every parameter.</p>
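<p>Here is a small experiment along those lines, assuming a mainstream register-based ABI such as x86-64 or AArch64. The names <code>count_only</code>, <code>invoke</code> and <code>as_process_data_cb</code> are invented for this sketch. Calling through the cast pointer is undefined behavior as far as the C standard is concerned, so treat it as a demonstration of the ABI argument rather than something guaranteed to work everywhere:</p>

```c
#include <assert.h>
#include <stddef.h>

typedef size_t (*process_data_cb_t)(const char *buff, size_t size, void *context);

/* Takes only a prefix of the callback's parameters: there is no context. */
size_t count_only(const char *buff, size_t size)
{
    assert(buff != NULL || size == 0);
    return size;
}

/* The cast discussed in the text. Calling through the result is undefined
 * behavior per the C standard; it relies on the platform ABI placing the
 * first two arguments identically for both signatures. */
process_data_cb_t as_process_data_cb(size_t (*fn)(const char *, size_t))
{
    return (process_data_cb_t)fn;
}

/* Calls a callback the way a library would: always with three arguments. */
size_t invoke(process_data_cb_t cb, const char *buff, size_t size, void *context)
{
    assert(cb != NULL);
    return cb(buff, size, context);
}
```

<p>On the platforms described above, <code>invoke(as_process_data_cb(count_only), "abcd", 4, NULL)</code> should hand back 4: the callee reads its two parameters from the usual places and never looks at the third. Compilers may warn about the cast (GCC has <code>-Wcast-function-type</code> for this).</p>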
<h2 id="implications-for-programmers">Implications for Programmers</h2>
<p>Is this a horrible hack? Perhaps. Is this officially allowed by the
standard? Not really — although it works on all compilers and platforms
I’ve tested it on, which is all the ones I’ve developed on.</p>
<p>It certainly wouldn’t be the end of the world to avoid this nonsense
and write wrapper functions:</p>
<pre tabindex="0"><code>size_t panic_cb(const char*, size_t, void*) {
    abort();
}
</code></pre><p>There are two problems I have with this. First, this can create a lot of
boilerplate for the very lightweight operation of turning an existing
function into a callback. C++ lambdas help with that (but they’re not
available in C) yielding pretty light-weight, low-boilerplate results:</p>
<pre tabindex="0"><code>// With lambdas
register_callback(some_socket, [](const char *, size_t, void *) -&gt; size_t { abort(); });
// With a cast
register_callback(some_socket, reinterpret_cast&lt;process_data_cb_t&gt;(abort));
</code></pre><p>But then again, C++ already has better mechanisms than this <code>void *context</code> pattern for callback functions. <code>std::function</code> handles these
things anyway for situations where the callback must be stored, and
templates can be used to take functors when the callback need not be.</p>
<p>The other problem is a little harder to avoid: performance. By doing a
cast, we can shave off the time of an extra function call. In most situations,
this doesn’t matter, and wouldn’t be a reason for a hack — if it is a
hack. But there are some situations where every little bit of performance
matters, and function pointer stuff like this can be hard to optimize.</p>
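<p>For comparison, the portable alternative is the wrapper. A minimal sketch, with <code>consume_all</code> standing in for any existing two-argument function (both function names and <code>portable_cb</code> are hypothetical):</p>

```c
#include <assert.h>
#include <stddef.h>

typedef size_t (*process_data_cb_t)(const char *buff, size_t size, void *context);

/* Stand-in for an existing function that has no context parameter. */
size_t consume_all(const char *buff, size_t size)
{
    assert(buff != NULL || size == 0);
    return size; /* pretend every byte was consumed */
}

/* Wrapper with the exact callback signature: drops the context and
 * forwards the remaining arguments. */
size_t consume_all_cb(const char *buff, size_t size, void *context)
{
    (void)context; /* intentionally unused */
    return consume_all(buff, size);
}

/* No cast needed here, because the types match exactly. */
const process_data_cb_t portable_cb = consume_all_cb;
```

<p>The call from <code>consume_all_cb</code> into <code>consume_all</code> is the extra function call in question. A compiler that can see both definitions may inline it or turn it into a jump, in which case the difference disappears.</p>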
<p>Specifically, most C++ compilers could improve the overall performance of
<code>std::function</code> by adopting a variant of this trick — but more on that
in a future post.</p>
<h2 id="my-personal-opinions">My Personal Opinions</h2>
<p>I think the standards of both programming languages should be amended to
require this. In fact, I think calling a function with extra arguments in
general should only be a warning, and that functions with fewer arguments
should be able to override functions with more arguments in C++ (assuming
appropriate use of POD types). Unfortunately — or fortunately —
that is not my call to make.</p>
<p>And more importantly than all of this, I think this fact about C and
C++ ABIs is something that every serious C or C++ programmer should
be aware of. And I think it should be used within the standard library
(in the implementation of <code>std::function</code>) wherever the platform is known,
readability is relatively unimportant (the standard library is maintained
by C++ experts) and performance improvements are possible to help every
user of that library.</p>
Angelshttps://www.thecodedmessage.com/posts/angels/2019-01-18T00:00:00+00:00The intern was nervous as she approached her boss, manila folder in hand. “Congresswoman Fischer,” she said, “I’m not sure I was actually supposed to see this document — I think it might be classified — but you did say you wanted me to look for examples of wasteful spending that might make for good PR…”
Congresswoman Fischer waved the explanation away and then reached her hand out for the document.<p>The intern was nervous as she approached her boss, manila folder in
hand. “Congresswoman Fischer,” she said, “I’m not sure I was actually
supposed to see this document — I think it might be classified
— but you did say you wanted me to look for examples of wasteful
spending that might make for good PR…”</p>
<p>Congresswoman Fischer waved the explanation away and then reached her
hand out for the document. After a few seconds of befuddled blinking,
she pulled her reading glasses off her head and onto her eyes, and looked
at the papers with renewed focus.</p>
<p>“Julie,” she said, finally. “Is this a prank?”</p>
<p>“No, congresswoman…”</p>
<p>“Julie,” the congresswoman said, sternly but somewhat uncertainly, “I
think someone’s pranking you then. There’s no way the federal government
is literally spending $5 million a year finding out how many angels can
dance on the head of a pin.”</p>
<p>But further investigation proved that it was true. Congresswoman Fischer
made an appropriately large fuss — state secrets be damned —
and the budget line was cut.</p>
<p>A short time later, in North Korea, the employees of the secret Institute
for Communist Theology watched this apparently minor political battle
with fascination. “The Americans and their democracy,” said the director,
during an all-hands meeting, “have allowed their system of government
to drag them down. We now have no competition in this important research
domain. I want to express my gratitude to all of you for your help.”</p>
<p>Then, much to the surprise of all those present, the projector at
the front of the meeting started showing a giant pin with terrifying,
many-winged, fiery angels dancing upon it. This pin seemed to be flying
through the air.</p>
<p>“Our new angel-based missiles,” continued the director, “are right now
being deployed against the capitalist, imperialist foe.”</p>
Are you sure?https://www.thecodedmessage.com/posts/are-you-sure/2018-12-28T00:00:00+00:00Mothers Against Drunk Driving, the local clergy, and the town council had been planning this concept for over a year. Finally they did it: Right in the town square, they installed a giant loudspeaker. From thenceforth, every two minutes, a booming voice would spread all over town, announcing “Are you sure?”
Foolhardy decisions, they had decreed, would soon be a thing of the past.
The locals seemed to adapt pretty readily.<p>Mothers Against Drunk Driving, the local clergy, and the town council
had been planning this concept for over a year. Finally they did it: Right
in the town square, they installed a giant loudspeaker. From thenceforth,
every two minutes, a booming voice would spread all over town, announcing
“Are you sure?”</p>
<p>Foolhardy decisions, they had decreed, would soon be a thing of
the past.</p>
<p>The locals seemed to adapt pretty readily. Sales of noise-cancelling
headphones boomed for a bit, and people’s sleeping habits were
surprisingly unaffected – who notices slightly inferior sleep? And
drunk driving statistics were immediately better, which the local paper
celebrated triumphantly.</p>
<p>The clergy were the first to notice the downsides. Weddings were being
cancelled during the vows a full 25% of the time – brides and grooms
would take back their “I do”s in response to the booming
speaker of skepticism. Adult baptisms were fully cut in half. Divorces,
on the other hand, were also cut in half – though some of the rescued
marriages maybe shouldn’t have been.</p>
<p>At a town council meeting, one of the proponents of the loudspeaker
said, confidently, “This is a good idea,” only to literally cringe when
the timing worked out that the entire room boomed “Are you
sure?” the next second.</p>
<p>No one was starting new relationships – and no one was exiting them
either. New job postings weren’t filled, as both candidate and interviewer
expressed their doubt. Slowly, but surely, the social and economic life
of the town started to grind to a halt, as it became the norm to cancel
even casual plans like going out for a drink (and certainly having another
once there), or going to church on Sunday…or work or school on Monday.</p>
<p>The town developed a culture of its own. It wasn’t just the loudspeaker:
people repeated its eternal mantra to each other, having had it etched
into their dreams. “We should take down the loudspeaker,” said an
occasional rebellious teen, only to hear all their friends in unison
say back, “Are you sure?”</p>
<p>Eventually the loudspeaker broke. The mayor told his deputy to fix
it, but all the deputy could do was respond, “Are you sure?”
And as a result, slowly, but surely, the town returned to normal.</p>
India: Little Differenceshttps://www.thecodedmessage.com/posts/india2/2017-08-26T00:00:00+00:00Second collected thoughts on India.
More Communitarian, Less Individualistic, Through Food and Beverage There is much less emphasis on individual choice. If you order tea (chay in Hindi) it will come with milk in it. If you order coffee, it will come with milk in it. They will not ask you how you want your coffee. Similarly, when I was in a cab ride between cities, I was not asked what food I wanted at the rest stop.<p>Second collected thoughts on India.</p>
<h1 id="more-communitarian-less-individualistic-through-food-and-beverage">More Communitarian, Less Individualistic, Through Food and Beverage</h1>
<ul>
<li>There is much less emphasis on individual choice. If you order tea (chay in Hindi) it will come with milk in it. If you order coffee, it will come with milk in it. They will not ask you how you want your coffee.</li>
<li>Similarly, when I was in a cab ride between cities, I was not asked what food I wanted at the rest stop. The driver’s brother (who I suppose had tagged along for company) simply bought some snack and insisted I eat some.</li>
<li>Everyone is very considerate that you might be vegetarian. If pork is involved in food, everyone is very considerate that you might not eat pork. No other preferences or restrictions are particularly accommodated, however: if I ask what meat something is, I might just be told that it’s not pork.</li>
<li>The exception to that is everyone also falls over themselves telling me which foods are not spicy, until I eat a spicy food and then they believe me. American food is going to taste very bland after this.</li>
<li>Beef is straight-up illegal.</li>
<li>Everyone at the lunch table gets up at the same time at work. The conversation about when to finish lunch does not last longer than one conversational turn, and often is expressed purely in body language. I once got up to get more food, and everyone else at the table immediately also got up — I guess I’d made the signal.</li>
<li>On a related note, I’ve never seen anyone else go up for seconds, but I have seen people somehow squeeze twice as much food on their plates as I do without having it run together.</li>
<li>When you go out to eat, everyone always agrees on what to order and then shares with the table. Decisions over what to order can be complicated.</li>
</ul>
<h1 id="office-culture">Office Culture</h1>
<p>This might be Tower-specific:</p>
<ul>
<li>Much less discontent. Much less drama. Tower pays above market in India, because it’s still cheaper than NYC, but I don’t think that’s everything. I think it’s more that here, people:
<ul>
<li>Are further removed from the political power struggles at the top of the company.</li>
<li>Have a better, let’s get the work done type of attitude.</li>
</ul>
</li>
<li>Much quieter, more introverted office. This bothered me, until I was told it bothered me that my office feels like an office. Upshot is, I’m more productive here.</li>
<li>Different type of nerdiness to the employees.</li>
</ul>
<h1 id="driving">Driving</h1>
<ul>
<li>When you order an Uber, 9 times out of 10 the driver will call you before showing up. Usually the conversation is not helpful, as I don’t speak enough Hindi to be useful, nor do I know how to describe directions in India well. Somehow, we find each other anyway, after much trepidation.</li>
<li>I was in an Uber, when all the cars started coming at us full speed, only to drive around us at the last minute. I noticed we were going the wrong way down one side of a divided road. I told the driver, “I think we should be on the other side of that partition!” He said the other side of the partition was closed. In the US, if one direction is closed, we find some other way of getting somewhere, but in India, I guess you drive the wrong way at the very left.</li>
<li>Cars do not stop for you crossing the street. Hanging out in the middle of the street waiting for a gap in a later lane is totally normal. I thought NYC was a dangerous walking area sometimes!</li>
<li>Honking is an important means of inter-driver communication. Without it I think people would continuously crash.</li>
<li>In general, I am still amazed I haven’t seen a crash.</li>
<li>A 20 minute Uber trip costs less than a subway swipe in NYC.</li>
</ul>
Adulting in Indiahttps://www.thecodedmessage.com/posts/india/2017-07-30T00:00:00+00:00The Way of NYC When I first moved to New York City, someone older and wiser than I gave me the following “rules” of New York City:
Nothing is cheap. Nothing is easy. There are no exceptions to the first two rules. I found this to be extremely true in New York City. It was stressful and exhausting, and I was broke and living off an advance I’d gotten from my then-employer, living in AirBnB’s I could put on credit card, where I could maybe stay in each for a month, tops.<h2 id="the-way-of-nyc">The Way of NYC</h2>
<p>When I first moved to New York City, someone older and wiser than I gave
me the following “rules” of New York City:</p>
<ul>
<li>Nothing is cheap.</li>
<li>Nothing is easy.</li>
<li>There are no exceptions to the first two rules.</li>
</ul>
<p>I found this to be extremely true in New York City. It was stressful
and exhausting, and I was broke and living off an advance I’d gotten
from my then-employer, living in Airbnbs I could put on a credit card,
where I could maybe stay in each for a month, tops. I was continuously
getting lost, having to take trains home, learning some trains don’t run
as reliably as you’d like, or go to the stations claimed on the map. This
was in the pre-Uber days where the way to get a car service was to go
to the local bodega and ask them for the phone number of a car service.</p>
<p>Meanwhile, you can see my naïveté and country bumpkin-nature when I
tell you I was looking for a room (in a shared apartment) for &lt; $900 in
Manhattan, south of 50th St. It turns out that this is possible but you
don’t actually want it.</p>
<p>Now, 7 years later, I’m a real New Yorker. A friend recently told me that
she is now, after achieving her 2nd Anniversary as a New York resident,
a “real” New Yorker. Given that I am the one usually asking her for
advice on where to go out to eat, this seems slightly suspicious. Unlike
the small town I grew up in (for a big chunk of my childhood,
and where my parents grew up for their entire childhood), where it took
at least 3 generations to be considered a local (I counted as one),
New York integrates people quickly and somewhat harshly. You learn to
swim lest you sink.</p>
<p>And, contrary to my 21-year-old self who was terrified of living in NYC
(a small slice of childhood there was not enough to calm my fears), I am
now apprehensive about my ability to adult anywhere else. I don’t have a
driver’s license, though I am assured it’s not hard to get one. I don’t
get driving culture. A former fellow parishioner of my old church in Bay
Ridge used to be bothered by the fact that her coworkers in her new town
of Ithaca wouldn’t go out for drinks after work with her — until she
realized it was because they would have to drive home subsequently. And,
of course, I am no longer used to the extreme amounts of reputation
management everyone in a small town has to continually do or else gain
a poor reputation — I come off as a contempt-worthy city person to
strangers back home now.</p>
<h2 id="the-language-of-life">The Language of Life</h2>
<p>Well, here I am in India. I <em>definitely</em> don’t know how to adult here.</p>
<p>I don’t speak the language, and I’ve been assured many times that I don’t
have to. Reactions to my statements that I want to learn Hindi range from
“That’s really sweet of you,” as if I was condescending to do everyone
a favor, to “Why not learn X other language?” where X ranges from
the spiritual depths of Sanskrit to the purported practicality of French.</p>
<p>Everybody in India speaks English anyway, I’m told. I have Wikipedia. Only
about a third of people in Gurgaon speak English. But yeah, that’s
basically “everybody,” right?</p>
<p>Originally, the real reason to learn Hindi was because I’m a big nerd,
which I explain to people: I’m in a place where they speak another
language, which makes it an ideal time to learn as much of the language
as possible. However, the more time I spend here, the more I realize
that Hindi skills would be very practical.</p>
<p>See, not only do I not speak the language well enough to adult, but
there are other elements of society I don’t know how to navigate. And
the expectation seems to be, as far as I can tell, that I not navigate
or learn to navigate those elements of society — or rather that I
navigate them through service workers ready to stand as my mediators.</p>
<p>This makes sense for anyone travelling anywhere on business, but it can
be a bit extreme here. For example, when I realized that not only had
I not packed a power cord converter, but there were none to be found
in the airport, I asked the front desk person at the place I’m staying
where I could go to buy one. Once he determined what I was talking about,
got me to wait, and showed me several things that they had around that
were not the thing I was looking for, he assured me, very accommodatingly,
“I arrange it.” When I asked for clarification, he said “I go to
market and buy it.”</p>
<p>This is not what I was expecting! I wanted it to be the next thing I did,
as my phone was about to die and my Kindle and laptop had already died
and what else was I going to do with my time? And for jetlag-prevention
purposes, I definitely wanted to stay awake and be active.</p>
<p>And furthermore, I was a bit nervous about the practical aspect. Not only
had none of the things he’d hopefully shown me met my requirements, I had
little confidence in my ability to communicate the actual requirements
to him. I had assumed that, like most people born in India that I would
meet in the US, the accent was just what English sounded like in India
and he would have proficiency, if not native-level proficiency, in
English. This was quickly proven false. What if he went to the market
and bought something completely different from what I wanted? And in
the meantime I’m still cooped up with little to nothing to do.</p>
<p>When I’m being driven by an English speaking driver, he asks the locals
for directions in Hindi. When I ask for a charger cord, someone goes and
buys it at the market, speaking Hindi. When I go out to eat, I meet an
English-speaking corporate employee who can translate what the waiter is
saying in Hindi. When I order food in my apartment, I ask the front desk
person, who relays the order on to the cooks, in Hindi. This is what was
meant when I was told I didn’t need to learn Hindi – others around me
would speak it for me, and translate to some relevant level of English.</p>
<p>For the record, no one here knows what a fritter is even if they put it
on your menu. They’ll think you’re saying “fried rice.”</p>
<p>It’s clear that if I actually wanted to adult in Gurgaon, rather than just visit as a pampered corporate employee from the US, I would have to learn Hindi.</p>
<h2 id="the-rules-of-india">The Rules of India</h2>
<p>Whether or not I speak the language, India is an interesting
place. Signing up for Hindi lessons, I had originally planned on using
a company called Zabaan. Zabaan is an Urdu word, as all my colleagues
were eager to point out to me, but their worries were assuaged when
I told them that the organization also taught Urdu and other languages.</p>
<p>I filled out the forms to book an appointment, even though the classes
were all the way in Delhi, an hour’s drive away. In the process, I
registered a username and password for an account, and entered my local
address, what course I was interested in, who was paying the ~$20/session
fee, my NYC address, my other language proficiencies, my first crush,
my favorite color, and my hat size.</p>
<p>Oof, that was exhausting, I said to myself after having created my
account and set and double-set my password. But at least now I shan’t
have to enter that again… Whoops! My credit card is declined, let me
go back to the previous page, no, can’t do that, where’s it take me,
back to the beginning.</p>
<p>After a brief online detour to Chase Bank, I’m ready to try again. No
worries, I just created an account, supposedly successfully; I’ll just log
into that and I’m sure I can just fix the payment information… What,
my account’s gone? You have to run your credit card correctly to get it
to save anything?</p>
<p>I wrote them an e-mail, they never replied, they lost a customer. I hate
websites and web design issues in the US, but this was a bit extreme.</p>
<p>Speaking of non-replying to e-mail situations, the priest/pastor/vicar,
or <em>achen</em> in local terminology, never got back to me about church services
after I e-mailed him on Tuesday. The website said that anyone was welcome
to show up at 7:30AM, so show up I did. After getting directions to a
(much more organized) Evangelical church (please I’d like to participate
in a 2000 year old tradition of worshipping in Spirit and Truth, not go
to a mediocre concert), we (driver Sunil and I) eventually got ourselves
sorted out right and found the place.</p>
<p>The church looked closed. Sunil had a conversation with someone in Hindi,
who then informed both Sunil and myself that mass was at 8:30AM. We killed
time by getting coconut water, and then returned. It was still closed. I
called the <em>achen</em> and asked if his church was open today. “No.” I could
hear the full stop after the word. Of course the church wasn’t open.</p>
<p>You ever heard the cliché “legal as church on Sunday” or “as common as
church on Sunday”? I was a bit concerned for a minute that in India they
didn’t believe in Sundays, that they had church on Tuesdays instead,
in spite of the whole concept of a 7-day week being a Judeo-Christian
tradition in its origin and so clearly if India had weeks at all and
churches at all they would bloody well have church on Sundays unless I
had accidentally fallen in with Seventh-Day Adventists who think that
people who go to church on Sundays are all going to hell and are the
anti-Christ in which case maybe I should just go home.</p>
<p>I don’t remember what exactly I said to get the <em>achen</em> to continue, but
I remember hearing that because it was the sixth Sunday of the month
(I’m sure he said fifth, but I heard sixth) everyone was at the combined
service in Delhi (!).</p>
<p>I am never e-mailing anyone in India again. Phone is probably better and
smarter. I think it was the <em>achen’s</em> cell phone, which I suppose makes
sense. I can probably message him on WhatsApp.</p>
<p>Which leads me to my rules of India:</p>
<ul>
<li>Everything takes its time.</li>
<li>Everything goes wrong the first time.</li>
<li>Sometimes it goes wrong the second and third times too.</li>
<li>If you try to prevent this by better communication, people will ignore you and think you’re weird.</li>
</ul>
<p>At least it’s cheap and people seem to be friendly and helpful and of good will. If I had to rewrite the rules of NYC, it would be:</p>
<ul>
<li>Nothing is cheap.</li>
<li>Nothing is easy.</li>
<li>You are on your own, except your own personal friends.</li>
</ul>
<p>The first and third of those, at least, are not true in India.</p>
India: Zeroth Impressionshttps://www.thecodedmessage.com/posts/india0/2017-07-25T00:00:00+00:00Everyone’s been asking me how India is and has been wondering if I’ve gone exploring. I haven’t really. Sunday I was just recovering from jetlag and yesterday I had work and then I immediately had to go home and crash I was so tired: so I guess again recovering from jet lag? This would normally not prevent me from exploring, but I’m honestly a little outside my comfort zone.
I am not in a walkable neighborhood of a city like I expected, but next to a huge highway.<p>Everyone’s been asking me how India is and has been wondering if I’ve
gone exploring. I haven’t really. Sunday I was just recovering from
jetlag and yesterday I had work and then I immediately had to go home
and crash I was so tired: so I guess again recovering from jet lag? This
would normally not prevent me from exploring, but I’m honestly a little
outside my comfort zone.</p>
<p>I am not in a walkable neighborhood of a city
like I expected, but next to a huge highway. There doesn’t seem to be
a “downtown” to visit at all, so taking taxis everywhere seems to be
the modus operandi. I’m sure this will change very soon, but in the
two days (and long morning) I’ve been here so far, I’ve been to
the airport, my building, and the office — and of course all the taxi
trips in between.</p>
<p>I’m not completely opposed to this. I read a new Sci Fi novel, The Forever War. I’ve studied a bunch of Hindi, eaten some room service meals (if you don’t order them, they call you), and yesterday, of course, started actually doing my work — you know, the reason I was sent here.</p>
<p>And even with my relative reclusivity, there’ve been a lot of impressions, enough to say quite a bit about my experiences in just those locations and the thoughts I’ve had in the meantime.</p>
<h1 id="language">Language</h1>
<p>Before I came here, I was told that learning Hindi was pointless as
everyone here speaks English. A little Googling confirmed for me that
that was as absurdly false as my intuition told me, but now that I’m
here, I see what they mean: everyone here speaks the amount of English
absolutely necessary for me to communicate with them, in the specific
capacity I’m meant to communicate with that particular person. That
is to say, the concierge at the hotel speaks enough English to address
hotel situations, the person who brings me food speaks enough English
to sell me more food or bring me something else, the taxi drivers speak
enough English to find a destination, etc. They’re right, in a sense:
everyone does speak enough English that I can scrape by.</p>
<p>Now note that I said the amount of English absolutely necessary. My
colleagues at work are fluent; everyone else I encounter, it’s a bit
more dicey. It is by no means enough that I am comfortable or that I
understand what’s going on. It is barely enough for me to convince the
cabby who drove me here that no, he could not get away with dropping me
off at a metro station near my apartment building, I would have no way of
getting un-lost. It is not enough for me to explain any nuanced situation.</p>
<p>When someone asks “How many?” I can’t really respond “How many
can you bring?” When a situation happens, I had better stick to the
script. Except for, I don’t live in your country, I don’t know your
script, and I have no idea what’s normal. Can you help me figure out
what the script is? I didn’t know I had to pay for those water bottles
(approx US$1 or INR60 apiece).</p>
<p>And that’s what frustrates me most about the language barrier. I often
feel, even in the US, that I’m expected to follow a script that no one
gave me a copy of. When I go off script, people can get confused and
upset, and so if I detect there is a script and don’t know what it is,
my instinct is to ask for clarification rather than try to wing it. But
here, no one can understand my clarification, and if I go off script,
not only do people not understand (i.e. think I’m crazy), they don’t
understand (i.e. cannot process the unexpected flurry of English words
emerging from my mouth).</p>
<p>This would be a little less frustrating if my attempts to hire a
Hindi tutor were treated like reasonable behavior and not a quaint and
inexplicable desire that only an unreasonable person would have — or
else a gesture of amazing good will that I’m showing for some inexplicable
reason. Please, can we just look past the fact that I’m foreign and let
me in on your secret code? I promise not to divulge the secrets!</p>
<p>I shall continue to report on this as the situation develops.</p>
<h1 id="food">Food</h1>
<p>I’ve always loved Indian food. Ever since I was a kid I have, even
though Gettysburg didn’t have an Indian restaurant — didn’t and still
doesn’t. The food is actually not that different from Indian food that you
might get in New York City, although its presentation and the attitudes
towards it are different.</p>
<p>Meat is very clearly indicated. Unlike in NYC, where vegetarian food
has a special mark or a special section, here there’ll be a special
“non‐vegetarian” section. The only meats around seem to be chicken
and mutton, and mutton I’ve only seen once. This explains maybe why so
many people at my office are “vegetarian except for chicken,” except
for it doesn’t because why is that even a thing? I suppose pork and beef
are both too religiously problematic, and I suppose we’re too inland
for fish, and I suppose lamb and mutton are expensive, and turkey’s an
American bird, etc. etc.</p>
<p>“Indian breakfast” turns out to mean a bunch of bread and a little
bit of yoghurt with an inconsequentially tiny but absolutely delicious
side of pickle relish of some sort. My attempt at an American breakfast
(eggs, toast, and hash browns) was disappointing in that it involved a
very small amount of egg (two eggs, but tiny ones) and white toast. Who
eats white toast? Does anyone like white toast? I guess it’s cheaper. I’m
feeling kind of spoiled now.</p>
<p>The food at the office is amazing and involves spicy falafel and
hummus among the buffet served for breakfast (at least the first day it
did). 5pm is samosa o’clock: everyone takes a half hour little break to
eat samosas. Unlike the New York office, Seamless isn’t a thing here. I
wonder if there’s really that much delivery food at all; it’s really hard
to tell. I’m getting the distinct impression that it’s a very narrow
slice of society I’m being exposed to as it stands. That is one thing
I definitely need to fix.</p>
<h1 id="work-culture">Work Culture</h1>
<p>Why does Jimmy feel awkward?</p>
<ul>
<li>A. Meeting lots of new people when I don’t really know anyone</li>
<li>B. Generally being an awkward turtle around new people</li>
<li>C. Not speaking the local language</li>
<li>D. Not being attuned to the local social norms and cues</li>
<li>E. Jet lag</li>
<li>F. All of the above</li>
</ul>
About Mehttps://www.thecodedmessage.com/about/0001-01-01T00:00:00+00:00Jimmy Hartzell Programmer, writer, opinionator
Name: Jimmy Hartzell Pronouns: He/Him Dogs or Cats: Cats Hair Style: Bald + Beard Nationality: United States Location: Pennsylvania (Philadelphia area) Preferred Programming Languages: Rust, Haskell, C++, C Resume: HTML, PDF GitHub: jhartzell42 Biography: Programmer, Personal Favorite Beer Style: Hefeweizen Usable Languages: English, German Languages I’ve Dabbled In: Spanish, Swedish, Japanese, Biblical Hebrew, Biblical Greek, Ecclesiastical Latin… Professional Skills Programming Languages Rust Haskell C++ C Python Swift Objective-C Bash Soft Skills (in a software development context) Teaching: Lecturing and Explaining Technical Concepts Mentoring Requirements Gathering Planning and Prioritizing Large Amounts of Work Dealing with Clients Hobbies Writing/Blogging (hi!<h1 id="jimmy-hartzell">Jimmy Hartzell</h1>
<p><img src="https://www.thecodedmessage.com/images/me.jpg" alt="Hello!"></p>
<p><em>Programmer, writer, opinionator</em></p>
<ul>
<li><strong>Name:</strong> Jimmy Hartzell</li>
<li><strong>Pronouns:</strong> He/Him</li>
<li><strong>Dogs or Cats:</strong> Cats</li>
<li><strong>Hair Style:</strong> Bald + Beard</li>
<li><strong>Nationality:</strong> United States</li>
<li><strong>Location:</strong> Pennsylvania (Philadelphia area)</li>
<li><strong>Preferred Programming Languages:</strong> Rust, Haskell, C++, C</li>
<li><strong>Resume:</strong> <a href="https://www.thecodedmessage.com/resume/">HTML</a>, <a href="https://www.thecodedmessage.com/resume.pdf">PDF</a></li>
<li><strong>GitHub:</strong> <a href="https://www.github.com/jhartzell42/">jhartzell42</a></li>
<li><strong>Biography:</strong> <a href="https://www.thecodedmessage.com/programmer-bio/">Programmer</a>, <a href="https://www.thecodedmessage.com/personal-bio/">Personal</a></li>
<li><strong>Favorite Beer Style:</strong> Hefeweizen</li>
<li><strong>Usable Languages:</strong> English, German</li>
<li><strong>Languages I’ve Dabbled In:</strong> Spanish, Swedish, Japanese, Biblical Hebrew, Biblical Greek, Ecclesiastical Latin…</li>
</ul>
<h1 id="professional-skills">Professional Skills</h1>
<h3 id="programming-languages">Programming Languages</h3>
<ul>
<li>Rust</li>
<li>Haskell</li>
<li>C++</li>
<li>C</li>
<li>Python</li>
<li>Swift</li>
<li>Objective-C</li>
<li>Bash</li>
</ul>
<h3 id="soft-skills-in-a-software-development-context">Soft Skills (in a software development context)</h3>
<ul>
<li>Teaching: Lecturing and Explaining Technical Concepts</li>
<li>Mentoring</li>
<li>Requirements Gathering</li>
<li>Planning and Prioritizing Large Amounts of Work</li>
<li>Dealing with Clients</li>
</ul>
<h1 id="hobbies">Hobbies</h1>
<ul>
<li>Writing/Blogging (hi!)
<ul>
<li>About <a href="https://www.thecodedmessage.com/tags/programming/">programming</a> and <a href="https://www.thecodedmessage.com/tags/nontechnical/">other topics</a></li>
<li>Even <a href="https://www.thecodedmessage.com/tags/fiction/">fiction</a> sometimes!</li>
<li>This is my biggest and most “productive” hobby</li>
</ul>
</li>
<li>Friends
<ul>
<li>Talking for hours and hours and hours on the phone with friends</li>
<li>Travelling with friends</li>
<li>Travelling to meet up with friends</li>
<li>Going out with friends</li>
<li>Going out to make friends</li>
<li>Throwing parties to invite friends</li>
<li>Board games with friends (I want more of this!)</li>
</ul>
</li>
<li>Nerding out about/learning more about:
<ul>
<li>Linguistics</li>
<li>History</li>
<li><a href="https://www.thecodedmessage.com/tags/religion/">Religious studies</a>, especially the <a href="https://en.wikipedia.org/wiki/Hebrew_Bible">Hebrew Bible</a></li>
<li>Germanic languages</li>
<li>Economics</li>
<li>Other nerdery (see the topics of this blog for details)</li>
</ul>
</li>
<li>Exercising (this should be bigger and higher up, but honestly it just isn’t):
<ul>
<li>Cycling</li>
<li>Rock climbing (relatively new, I’m still quite bad!)</li>
<li>Weight Training (lifting weights up and then putting them back down)</li>
</ul>
</li>
<li><a href="https://www.thecodedmessage.com/posts/my-organizational-system/">Keeping an overly meticulous TODO list</a>
<ul>
<li>Writing everything I need to do on the list</li>
<li>Making sure I maintain the list well</li>
<li>Make sure I plan each day</li>
<li>Fail to do many of the things I planned
<ul>
<li>Especially errands and chores (they’re the worst)</li>
</ul>
</li>
<li>Many days: Write down the few things I did succeed at doing</li>
</ul>
</li>
<li>Music
<ul>
<li>Planning to get less rusty on piano</li>
<li>Seeing if I might get less rusty at singing</li>
<li>Considering getting less rusty on trombone</li>
<li>Annoying my friends playing recorder</li>
</ul>
</li>
</ul>
C++ Network Programming: Study Questions and Practice Projectshttps://www.thecodedmessage.com/programming-practice/0001-01-01T00:00:00+00:00C++ Study Questions What are some common examples of undefined behavior? What is memory corruption? What are some common causes of it? How can you get UB without memory corruption? How does UB interact with optimization? What is RAII? How does it differ from garbage collection? Why use smart pointers instead of raw pointers? What is the STL? What is it useful for? What are some common STL collections? What are some common STL algorithms?<h1 id="c-study-questions">C++ Study Questions</h1>
<ul>
<li>What are some common examples of undefined behavior?
<ul>
<li>What is memory corruption?
<ul>
<li>What are some common causes of it?</li>
</ul>
</li>
<li>How can you get UB without memory corruption?</li>
<li>How does UB interact with optimization?</li>
</ul>
</li>
<li>What is RAII? How does it differ from garbage collection?</li>
<li>Why use smart pointers instead of raw pointers?</li>
<li>What is the STL?
<ul>
<li>What is it useful for?</li>
<li>What are some common STL collections?</li>
<li>What are some common STL algorithms?</li>
<li>How do iterators work?</li>
<li>Why can’t you dereference the <code>end</code> iterator?</li>
</ul>
</li>
<li>What is the difference between “value semantics” and “reference semantics”?</li>
<li>What is “type erasure”?
<ul>
<li>How is <code>std::function</code> implemented?</li>
<li>How is <code>std::any</code> implemented?
<ul>
<li>What does <code>std::any</code> even do?</li>
</ul>
</li>
</ul>
</li>
<li>Why does an empty struct have a size of 1 and not 0?</li>
<li>What is <code>std::atomic</code>?
<ul>
<li>How does it differ from <code>volatile</code>?</li>
<li>How can it be optimized?</li>
</ul>
</li>
<li>What are the versions of the C++ standard?
<ul>
<li>What major features were added in each of them?</li>
</ul>
</li>
<li>What are the major C++ compilers?
<ul>
<li>Under what licensing terms are they available?</li>
</ul>
</li>
</ul>
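<p>As a starting point for the empty-struct question above, here is a small sketch you can compile and poke at. The type name <code>Empty</code> and both helper functions are made up for illustration. It only shows the observable behavior (nonzero size, distinct addresses for distinct array elements); working out why the language needs that is still the exercise.</p>

```cpp
#include <cassert>

// Hypothetical type for illustration: no data members at all.
struct Empty {};

// Every complete object type has a nonzero size, even with no members.
static_assert(sizeof(Empty) >= 1, "empty types still occupy storage");

// Two elements of an array are distinct objects, so they must have distinct
// addresses. That could not work if sizeof(Empty) were 0.
bool array_elements_are_distinct() {
    Empty pair[2];
    return &pair[0] != &pair[1];
}

// Same check, phrased as a runtime assertion.
void check_empty_struct() {
    assert(array_elements_are_distinct());
}
```

<p>A natural follow-up is to make <code>Empty</code> a base class or a member of another struct and compare the sizes your compiler reports.</p>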
<h2 id="c-performance">C++ Performance</h2>
<ul>
<li>What are some common compiler optimizations?
<ul>
<li>What are some optimizations a compiler cannot do?</li>
<li>What are downsides to having the compiler do optimizations?</li>
</ul>
</li>
<li>Why are virtual function calls slower?</li>
<li>What parts of a codebase need most optimizing?
<ul>
<li>What parts do not?</li>
</ul>
</li>
<li>Common compiler flags
<ul>
<li>What is the difference between <code>-O2</code>, <code>-O3</code> and <code>-Os</code>?</li>
<li>What is <code>-g</code>?
<ul>
<li>Why is it normally combined with <code>-O0</code>?
<ul>
<li>What does <code>-O0</code> do?</li>
<li>How is that different from what <code>-g</code> does?</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
<li>What is the difference between throughput and latency?</li>
</ul>
<h1 id="operating-system-questions">Operating System Questions</h1>
<ul>
<li>What is the difference between the stack and the heap?</li>
<li>What is an operating system kernel?
<ul>
<li>What are some examples of operating system kernels?</li>
<li>What is “kernel” or “supervisor” mode?</li>
<li>What is a monolithic kernel vs a microkernel?</li>
<li>What is a driver?
<ul>
<li>Can they run in usermode?
<ul>
<li>When?</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
<li>What is a system call?
<ul>
<li>What is the difference between a blocking and non-blocking system call?</li>
</ul>
</li>
<li>What is the difference between a process and a thread?
<ul>
<li>What are the performance implications?</li>
</ul>
</li>
<li>Describe different forms of IPC
<ul>
<li>What is the difference between brokered and non-brokered IPC?</li>
<li>What considerations should you take into account when choosing an IPC system?</li>
</ul>
</li>
<li>Describe virtual memory.
<ul>
<li>What is an address space?</li>
<li>What is memory protection?</li>
<li>What is paging?
<ul>
<li>What is a page fault?</li>
</ul>
</li>
<li>What does a kernel have to do to implement virtual memory?</li>
<li>How does a program allocate more memory on the stack?</li>
<li>How does a program allocate more memory on the heap?
<ul>
<li>In userspace?</li>
<li>What syscalls might it have to make?</li>
</ul>
</li>
<li>How do memory mapped files work?
<ul>
<li>When should you prefer this to <code>read</code> and <code>write</code> syscalls?</li>
</ul>
</li>
<li>How does shared memory work?
<ul>
<li>When should you prefer this to other forms of IPC?</li>
</ul>
</li>
<li>What is swap?</li>
<li>What is a TLB?
<ul>
<li>What performance implications does it have?</li>
</ul>
</li>
</ul>
</li>
</ul>
<h1 id="network-study-questions">Network Study Questions</h1>
<ul>
<li>Explain the layers of the OSI model
<ul>
<li>How do they map to Internet-based protocols?</li>
</ul>
</li>
<li>Explain IP basics
<ul>
<li>What is IP?</li>
<li>What is an IP address?
<ul>
<li>What is a subnet?</li>
<li>What is a non-routable IP address?</li>
<li>What is localhost?
<ul>
<li>When would we use it?</li>
</ul>
</li>
<li>What is NAT?
<ul>
<li>When is it used?</li>
</ul>
</li>
</ul>
</li>
<li>How does IP relate to link-level protocols?
<ul>
<li>Give some examples of link-level protocols.</li>
<li>Explain CSMA/CD.</li>
<li>What is MTU?</li>
<li>Explain Path MTU Discovery, and why it’s important.</li>
</ul>
</li>
<li>What is ICMP?
<ul>
<li>What are 2 command-line utilities that use it?</li>
<li>Why is it important?</li>
</ul>
</li>
<li>What is the difference between a hub, a switch, and a router?</li>
<li>What is the difference between IPv4 and IPv6?</li>
</ul>
</li>
<li>Explain DNS basics
<ul>
<li>What is DNS?</li>
<li>What are some types of DNS records?</li>
<li>On Unix
<ul>
<li>What is /etc/hosts?</li>
<li>What is /etc/resolv.conf?</li>
<li>What sort of things can go wrong if these are configured incorrectly?</li>
</ul>
</li>
<li>How do you access DNS from your applications?</li>
</ul>
</li>
<li>What is a VPN?</li>
<li>Explain the basics of TCP.
<ul>
<li>How does TCP implement reliability on top of IP?</li>
<li>TCP is stream-oriented, UDP is packet-oriented. What does this mean?</li>
<li>Explain the TCP three-way handshake.
<ul>
<li>What are SYN cookies?</li>
</ul>
</li>
<li>How do TCP acknowledgements work?
<ul>
<li>How does TCP handle the lack of negative acknowledgements?</li>
</ul>
</li>
<li>What is the TCP window size?</li>
</ul>
</li>
<li>Explain the basics of UDP.
<ul>
<li>What are some differences between TCP and UDP?</li>
<li>Why might you want to use UDP?</li>
<li>How are UDP broadcasts implemented?</li>
</ul>
</li>
<li>On Linux, what sort of things can you tune about networking using <code>sysctl</code>?
<ul>
<li>Why might you want to do this?</li>
</ul>
</li>
<li>Why are heartbeats important?
<ul>
<li>Why might they be implemented in user protocols?
<ul>
<li>Why is TCP keepalive not enough?</li>
</ul>
</li>
</ul>
</li>
</ul>
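<p>For the question about accessing DNS from an application, one common route on POSIX systems is <code>getaddrinfo</code>. The sketch below is only one way to wrap it: the helper name <code>resolve_host</code> is my own invention, and it assumes a POSIX platform with <code>&lt;netdb.h&gt;</code> available. It returns every address the resolver reports for a name, formatted numerically.</p>

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>

#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: resolve `host` to numeric address strings using the
// system resolver (which consults /etc/hosts and DNS as configured).
std::vector<std::string> resolve_host(const std::string& host) {
    assert(!host.empty());

    addrinfo hints{};
    hints.ai_family = AF_UNSPEC;      // accept both IPv4 and IPv6
    hints.ai_socktype = SOCK_STREAM;  // avoid one duplicate entry per socket type

    addrinfo* raw = nullptr;
    const int rc = getaddrinfo(host.c_str(), nullptr, &hints, &raw);
    if (rc != 0) {
        throw std::runtime_error(gai_strerror(rc));
    }
    // RAII: the result list is freed on every exit path.
    std::unique_ptr<addrinfo, decltype(&freeaddrinfo)> results(raw, freeaddrinfo);

    std::vector<std::string> addresses;
    for (const addrinfo* ai = results.get(); ai != nullptr; ai = ai->ai_next) {
        char text[NI_MAXHOST];
        // NI_NUMERICHOST formats the address itself; no reverse lookup happens.
        if (getnameinfo(ai->ai_addr, ai->ai_addrlen, text, sizeof(text),
                        nullptr, 0, NI_NUMERICHOST) == 0) {
            addresses.emplace_back(text);
        }
    }
    return addresses;
}
```

<p>Passing a numeric string such as <code>"127.0.0.1"</code> skips the lookup entirely, which makes it handy for checking the plumbing before pointing it at real hostnames.</p>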
<h1 id="network-programming-practice-projects">Network Programming Practice Projects</h1>
<ul>
<li>Echo server
<ul>
<li>Can test by hand using <code>netcat</code></li>
<li>v1: Accepts single TCP connection, echoes all inputs</li>
<li>v2: Use threading to accept multiple connections, echo each to itself</li>
<li>v3: Use single-thread
<ul>
<li><code>select</code> or <code>epoll</code> syscall</li>
<li><code>async</code> in Rust</li>
<li>Network reactor/dispatcher library in C++</li>
</ul>
</li>
</ul>
</li>
<li>Chat server
<ul>
<li>Accept multiple connections</li>
<li>Can test by hand using <code>netcat</code></li>
<li>v1: Send all whole lines (requires buffering) to all clients
<ul>
<li>No length restriction</li>
</ul>
</li>
<li>v2: Handshake to set username
<ul>
<li>Make sure it is still testable using <code>netcat</code></li>
<li>Server sends username along with messages</li>
</ul>
</li>
</ul>
</li>
<li>File server
<ul>
<li>Simple HTTP-like protocol to specify filename of what file to send
<ul>
<li>Make sure you sanitize inputs!</li>
</ul>
</li>
<li>v1: Serve one static file at a time</li>
<li>v2: Stream multiple files to multiple clients without using threads</li>
<li>v3: Support uploads
<ul>
<li>Write custom client</li>
</ul>
</li>
</ul>
</li>
<li>Exchange
<ul>
<li>v1: Just exchange connection
<ul>
<li>Keep order books
<ul>
<li>When new order comes in compatible with old order, make trade</li>
<li>One security at first, then multiple
<ul>
<li>But build multiple into protocol right away</li>
</ul>
</li>
</ul>
</li>
<li>TCP Protocol:
<ul>
<li>Create a custom protocol with CLI client</li>
<li>Client -> Server
<ul>
<li>Orders
<ul>
<li>Only GTC limit orders at first</li>
<li>Quantity</li>
<li>Security</li>
<li>Price</li>
<li>Buy/Sell</li>
<li>Order ID
<ul>
<li>Why is this useful?</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
<li>Server -> Client
<ul>
<li>Order confirmation
<ul>
<li>All order information, echoed</li>
</ul>
</li>
<li>Trade confirmation (when orders match)
<ul>
<li>Also indicate whether client was maker or taker</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
<li>Add-ons
<ul>
<li>Add heartbeats to protocol</li>
<li>Market data
<ul>
<li>UDP broadcast</li>
<li>Current state of order books</li>
</ul>
</li>
<li>Drop copy
<ul>
<li>Separate TCP records of all trades</li>
<li>Record trades in file before confirming</li>
<li>Allow replay of previous days’ trades</li>
</ul>
</li>
<li>Fees and kickbacks
<ul>
<li>Charge takers</li>
<li>Reimburse makers</li>
<li>Configurable on by-security basis</li>
</ul>
</li>
<li>Credit limits</li>
<li>Advanced order types
<ul>
<li>IOC/FOK</li>
<li>Maker-only orders</li>
</ul>
</li>
<li>FX-style last look</li>
<li>SSL support</li>
</ul>
</li>
</ul>
</li>
</ul>
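<p>To make the “when new order comes in compatible with old order, make trade” step of the exchange project a little more concrete, here is a rough sketch of a single-security order book for GTC limit orders. Everything about its shape is an assumption on my part rather than a requirement of the project: the type names, prices as integer ticks, price-time priority, and executing at the resting (maker) order’s price. There is no networking and no security field here; those are left for the actual project.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <deque>
#include <functional>
#include <map>
#include <vector>

enum class Side { Buy, Sell };

struct Order {
    std::uint64_t id;
    Side side;
    std::int64_t price;      // integer ticks (assumption: no floating point)
    std::uint64_t quantity;
};

struct Trade {
    std::uint64_t maker_id;  // resting order
    std::uint64_t taker_id;  // incoming order
    std::int64_t price;      // assumption: trades execute at the maker's price
    std::uint64_t quantity;
};

class OrderBook {
    using Level = std::deque<Order>;  // FIFO within a price level
    std::map<std::int64_t, Level> asks_;                              // lowest price first
    std::map<std::int64_t, Level, std::greater<std::int64_t>> bids_;  // highest price first

    // Fill `incoming` against the opposite side until prices stop crossing.
    template <typename Levels, typename Crosses>
    static void match(Order& incoming, Levels& levels, Crosses crosses,
                      std::vector<Trade>& trades) {
        while (incoming.quantity > 0 && !levels.empty()) {
            auto best = levels.begin();
            if (!crosses(best->first, incoming.price)) break;
            Level& queue = best->second;
            Order& maker = queue.front();
            const std::uint64_t filled = std::min(maker.quantity, incoming.quantity);
            trades.push_back({maker.id, incoming.id, best->first, filled});
            maker.quantity -= filled;
            incoming.quantity -= filled;
            if (maker.quantity == 0) queue.pop_front();
            if (queue.empty()) levels.erase(best);
        }
    }

public:
    // Match what can be matched; any remainder rests on the book (GTC).
    std::vector<Trade> submit(Order incoming) {
        assert(incoming.quantity > 0);
        std::vector<Trade> trades;
        if (incoming.side == Side::Buy) {
            match(incoming, asks_,
                  [](std::int64_t resting, std::int64_t limit) { return resting <= limit; },
                  trades);
            if (incoming.quantity > 0) bids_[incoming.price].push_back(incoming);
        } else {
            match(incoming, bids_,
                  [](std::int64_t resting, std::int64_t limit) { return resting >= limit; },
                  trades);
            if (incoming.quantity > 0) asks_[incoming.price].push_back(incoming);
        }
        return trades;
    }
};
```

<p>Each <code>Trade</code> records which order was the maker and which was the taker, which is what the trade confirmation message needs. Supporting multiple securities could be as simple as keeping one <code>OrderBook</code> per security symbol, though that is only one possible design.</p>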
Gardenshttps://www.thecodedmessage.com/gardens/0001-01-01T00:00:00+00:00Inspired by “The Garden and the Stream: A Technopastoral”, I have a few pages that are intended to grow over time, more like Wikipedia than Twitter, more like old-style webpages than a blog, more like books than newspapers or magazines, designed to be read in a slow trickle by people coming across my website, rather than booming across the Internet and then being nearly forgotten. I intend to increase how many of these I have over time.<p>Inspired by <a href="https://hapgood.us/2015/10/17/the-garden-and-the-stream-a-technopastoral/">“The Garden and the Stream: A
Technopastoral”</a>,
I have a few pages that are intended to grow over time, more like
Wikipedia than Twitter, more like old-style webpages than a blog, more
like books than newspapers or magazines, designed to be read in a slow
trickle by people coming across my website, rather than booming across
the Internet and then being nearly forgotten. I intend to increase how
many of these I have over time.</p>
<h1 id="programming">Programming</h1>
<ul>
<li><a href="https://www.thecodedmessage.com/rust-c-book/">Rust vs. C++ Comparison: an MDBook</a></li>
<li><a href="https://www.thecodedmessage.com/programming-portfolio/">Programming Portfolio</a></li>
<li><a href="https://www.thecodedmessage.com/programming-rec-reading/">Recommended Programming Reading</a></li>
<li><a href="https://www.thecodedmessage.com/rust-opinions/">Rust Opinions</a></li>
<li><a href="https://www.thecodedmessage.com/programming-practice/">C++ Network Programming: Study Questions and Practice/Learning Projects</a></li>
</ul>
<h1 id="not-programming">Not Programming</h1>
<ul>
<li><a href="https://www.thecodedmessage.com/about/">About Me</a></li>
<li><a href="https://www.thecodedmessage.com/resume/">Résumé</a> (<a href="https://www.thecodedmessage.com/resume.pdf">as PDF</a>)</li>
<li><a href="https://www.thecodedmessage.com/reading-log/">Reading Log: Books I’ve Read</a></li>
<li><a href="https://www.thecodedmessage.com/silly/">Humorous Thoughts</a></li>
</ul>
Humorous Thoughtshttps://www.thecodedmessage.com/silly/0001-01-01T00:00:00+00:00 According to the New Testament and some cladistically-inclined biologists, men are a type of fish. According to older norms of English usage and some curmudgeonly Bible translators, men encompasses all humans. <ul>
<li>According to the New Testament and some cladistically-inclined
biologists, men are a type of fish. According to older norms of English
usage and some curmudgeonly Bible translators, men encompasses all humans.</li>
</ul>
Licensehttps://www.thecodedmessage.com/license/0001-01-01T00:00:00+00:00All code on this blog is available under the MIT license:
Copyright 2017-2024 Jimmy Hartzell
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:<p>All code on this blog is available under the MIT license:</p>
<p>Copyright 2017-2024 Jimmy Hartzell</p>
<p>Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
“Software”), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:</p>
<p>The above copyright notice and this permission notice shall be included
in all copies or substantial portions of the Software.</p>
<p>THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN
NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR
THE USE OR OTHER DEALINGS IN THE SOFTWARE.</p>
Newsletterhttps://www.thecodedmessage.com/newsletter/0001-01-01T00:00:00+00:00Personal Biohttps://www.thecodedmessage.com/personal-bio/0001-01-01T00:00:00+00:00TL;DR I am an ADHD (hyperactive presentation) Pennsylvania Dutch millennial (1988) man, raised German Lutheran, (mostly) in small town Pennsylvania, in the United States.
I have family and friends but I’m not going to write about them in detail here because that feels too private.
Life Story As a pre-school boy, I enjoyed pretending I was various animals I’d learn about from cartoons and books, especially pretending that I was the grey squirrel from the song “Grey Squirrel, grey squirrel, swish your bushy tail,” or the brief phase where I tried to insist I was a macaroni penguin.<p><img src="https://www.thecodedmessage.com/images/me.jpg" alt="Me!"></p>
<h1 id="tldr">TL;DR</h1>
<p>I am an
<a href="https://en.wikipedia.org/wiki/Attention_deficit_hyperactivity_disorder">ADHD</a>
(hyperactive presentation) <a href="https://en.wikipedia.org/wiki/Pennsylvania_Dutch">Pennsylvania
Dutch</a> millennial
(1988) <a href="https://en.wikipedia.org/wiki/Man">man</a>, raised <a href="https://en.wikipedia.org/wiki/Lutheranism">German
Lutheran</a>, (mostly) in small
town <a href="https://en.wikipedia.org/wiki/Pennsylvania">Pennsylvania</a>, in the
<a href="https://en.wikipedia.org/wiki/United_States">United States</a>.</p>
<p>I have family and friends but I’m not going to write about them in detail
here because that feels too private.</p>
<h1 id="life-story">Life Story</h1>
<p>As a pre-school boy, I enjoyed pretending I was various animals I’d learn
about from cartoons and books, especially pretending that I was the
grey squirrel from the song “Grey Squirrel, grey squirrel, swish your
bushy tail,” or the brief phase where I tried to insist I was a <a href="https://en.wikipedia.org/wiki/Macaroni_penguin">macaroni
penguin</a>. I enjoyed
singing, and would often sing lullabies and children’s songs, or
Christmas carols even when they were not in season. My parents
could always entertain me or distract me easily with music.</p>
<p>As a young child, I was a fan of <em>Winnie the Pooh</em>. I enjoyed the books,
I enjoyed the TV show, and at one point, my favorite toys were plastic
Winnie the Pooh figures.</p>
<p>I was not a fan of <em>Power Rangers</em>, which my family did not watch. I did
not know why my fellow kindergartners liked it so much, but was convinced
they were objectively wrong somehow. I still don’t know anything about
<em>Power Rangers</em>, but I’m willing to believe it may have been good in
some ways if so many people liked it.</p>
<p>Being a neurodivergent child in the 90’s was super hard in a lot of ways,
and I don’t want to talk about any of them, so I won’t. I will however
mention that in kindergarten and first grade, I had a psychologist who
would play board games and video games with me – this was known as
“play therapy.” I did not know what this had to do with being a doctor,
but I wasn’t about to complain, especially because it was fun. We played
Commander Keen for MS-DOS. I remember being oddly impressed that he
had Windows 3.0 instead of Windows 3.1 on his computer – even though
it was a lower version number, it seemed oddly exotic. Why were there
so few Windows 3.0 computers? How did everyone else manage to upgrade?</p>
<p>In any case, I insisted on getting piano lessons in the first grade –
originally trying to get my mother to teach me, and then forcing her to
hire a teacher when my demands for instruction outpaced her capacity to
teach.</p>
<p>Throughout my youth, I enjoyed music, especially choral music (as a boy
soprano and then as a tenor). I played piano (from 1st grade), trombone
(from 4th grade), and recorder (I don’t know when I learned this).
It was recorder that taught me how to play by ear: I used to come
play Christmas carols during lunch in December, whether or not my
fellow schoolchildren wanted to hear them. I got musical education
both at school and at church, and appreciated this dual source of
instruction. Communal singing was most of the draw of going to
church.</p>
<p>In high school, I remained an avid reader, especially of science fiction
and fantasy. My favorite authors were J.R.R. Tolkien, C.S. Lewis,
Arthur C. Clarke and Robert Heinlein, approximately in that order. I did
debate team and marching band in high school as my most time-intensive
extra-curriculars. I enjoyed learning things both inside and outside
of my classes, but did not at all enjoy doing homework assignments,
and I resented adult authority over me, as is the way of teenagers.
I had an ordinary number of friends, many of whom I’m still friends
with. My favorite topics in school were probably German and math. During
my last few summers of high school, I worked at the local college’s IT department.</p>
<p>I went to Cornell University in Ithaca, NY at 17, where I originally
hoped to dual-major in computer science and linguistics. Unfortunately,
it turns out dual majoring is hard, and so I did not end up successfully
doing that, just sticking to CS, officially – though I did take a number
of linguistics classes, as well as coming up one class short of a dual
major in math (minors were not offered in math).</p>
<p>At university, I made many friends, gained a lot of life experience, and
tried and failed to learn Japanese. I worked part-time at one point as
a system administrator in a college computer lab and at another point
as a short-order cook in a college dining hall. I also worked as an
undergraduate TA, designing better course software and teaching sections.</p>
<p>In 2010, I had a hiking accident and broke my back. I took a health leave
of absence from college, recovered, and then worked as a programmer in
NYC before returning to finish my bachelor’s in spring of 2013.</p>
<p>Besides my brief return to Cornell, I continued living in New York City.
During my 20s, the most important thing to me was cultivating and
maintaining close friendships, which I did, and traveling to as many fun
places as possible, which I also did. I lived in Brooklyn, and tried to
convince the hipsters I was one of them, in spite of my preference for
lagers (or even fancy cocktails) over IPAs. I was a regular at a few
bars and restaurants. I gradually adopted the bicycle as one of my go-to
means of transit, especially CitiBike to get around Brooklyn.</p>
<p>Then, in January of 2022, I realized that programming jobs no longer
had physical locations and so I could live wherever I wanted. Now, I
live in an overly large house (or perhaps I’m an overly small person?)
in a different, slightly less small town in Pennsylvania. I don’t know
what the most important thing will be to me in my 30’s – ask me again
when I’m done with them.</p>
<h2 id="but-do-you-like-any-poems">But do you like any poems?</h2>
<blockquote>
<p><strong>Disobedience</strong><br>
<em>A.A. Milne</em></p>
<p>James James<br>
Morrison Morrison<br>
Weatherby George Dupree<br>
Took great<br>
Care of his Mother<br>
Though he was only three.<br>
James James<br>
Said to his Mother,<br>
“Mother,” he said, said he;<br>
“You must never go down to the end of the town, if<br>
you don’t go down with me.”</p>
<p>James James<br>
Morrison’s Mother<br>
Put on a golden gown,<br>
James James<br>
Morrison’s Mother<br>
Drove to the end of the town.<br>
James James<br>
Morrison’s Mother<br>
Said to herself, said she:<br>
“I can get right down to the end of the town and be<br>
back in time for tea.”</p>
<p>King John<br>
Put up a notice,<br>
“LOST or STOLEN or STRAYED!<br>
JAMES JAMES<br>
MORRISON’S MOTHER<br>
SEEMS TO HAVE BEEN MISLAID.<br>
LAST SEEN<br>
WANDERING VAGUELY<br>
QUITE OF HER OWN ACCORD,<br>
SHE TRIED TO GET DOWN TO THE END OF<br>
THE TOWN - FORTY SHILLINGS REWARD!”</p>
<p>James James<br>
Morrison Morrison<br>
(Commonly known as Jim)<br>
Told his<br>
Other relations<br>
Not to go blaming him.<br>
James James<br>
Said to his Mother,<br>
“Mother,” he said, said he,<br>
“You must never go down to the end of the town with-<br>
out consulting me.”</p>
<p>James James<br>
Morrison’s Mother<br>
Hasn’t been heard of since.<br>
King John<br>
Said he was sorry,<br>
So did the Queen and Prince.<br>
King John<br>
(Somebody told me)<br>
Said to a man he knew:<br>
“If people go down to the end of the town, well, what<br>
can anyone do?”</p>
<p>(Now then, very softly)<br>
J. J.<br>
M. M.<br>
W. G. du P.<br>
Took great<br>
C/o his M*****<br>
Though he was only 3.<br>
J. J.<br>
Said to his M*****<br>
“M*****,” he said, said he:<br>
“You-must-never-go-down-to-the-end-of-the-town-<br>
if-you-don’t-go-down-with-ME!”</p>
</blockquote>
Programmer Bio https://www.thecodedmessage.com/programmer-bio/ 0001-01-01T00:00:00+00:00
<p>I learned to program as a child in the 90’s back in the era when PCs
came with sample games like <a href="https://www.thecodedmessage.com/posts/qbasic-nostalgia/"><code>QBASIC Nibbles</code></a>,
with included code for hobbyists and learners
to play around with. It was sheer luck that this random interest of mine
turned out to be a marketable, employable skill.</p>
<p>My personal story plays into the narrative of the
driven young autodidact, usually male, usually socially
awkward, who has that “genius” to be a <a href="http://www.catb.org/jargon/html/R/Real-Programmer.html">“real
programmer”</a>,
perhaps at the expense of <a href="https://www.smbc-comics.com/comic/p-humans">other
qualities</a>. But this is not
my take-away. Anyone can learn how to program, even if they don’t write
a line of code – or even touch a computer – until arbitrarily late in
life. And unlike other forms of math and logic, it is accessible even
to young children, and should be taught in schools alongside other subjects.</p>
<p>I was very fortunate that my uncle and godfather encouraged me in
programming from an early age, and also got me into Linux and open source
software. I quickly developed an interest in operating systems and systems
programming, running FreeBSD and trying to learn the Unix system call API
and the differences between the Unix flavors. I still feel most at home
in systems programming languages, such as C, C++, and, in our modern era,
<a href="https://www.rust-lang.org/">Rust</a> (but more on that later).</p>
<p>After an embarrassing period where I got deeply into dynamically-typed
OOP, Smalltalk, and Objective-C, I was introduced to functional
programming and static typing at my university’s <a href="https://cornellcswiki.gitlab.io/classes/CS3110.html">functional programming
course</a> (then
called CS312 and taught in SML), and I quickly understood the benefits.
My friends soon introduced me to <a href="https://www.haskell.org/">Haskell</a>,
which I digested with enthusiasm. It is still my favorite GC’d language
and applications programming language.</p>
<p>My career proceeded on the C++ systems side of things, as I worked as
a low-latency programmer and later as an instructor at
<a href="https://www.tower-research.com/">Tower Research Capital</a>, but I never
forgot my affection for Haskell and powerful static type systems.
I excitedly embraced “modern C++” and C++11, and tried my best to
use C++’s features to maximize safety and expressiveness.</p>
<p>My favorite part of my job at Tower was when I ran the technical training
course for new programming hires, which I did for multiple iterations
of that course. I covered topics like C++ template metaprogramming,
network protocol design, and low-latency coding techniques. I
enjoyed instruction and got good at explaining things and leading
classes. I really miss teaching.</p>
<p>After Tower, I wanted to avoid finance and low-latency programming
altogether, and took a job at <a href="https://obsidian.systems/">Obsidian</a>,
one of the largest Haskell consultancy shops, to work on a mix of
<a href="https://github.com/obsidiansystems/ledger-app-tezos">embedded C projects</a>
and Haskell projects in <a href="https://reflex-frp.org/">Reflex</a>, Obsidian’s
open-source framework for “Functional-Reactive Programming,” a
revolutionary up-and-coming paradigm for GUI programming.</p>
<p>But my interests in strong typing and in systems programming could
not fully be reconciled until I joined <a href="https://savantpower.com/">Savant Power</a>,
and discovered <a href="https://www.rust-lang.org/">Rust</a>.
As I learned more about Rust, it overcame my
<a href="https://www.thecodedmessage.com/posts/unsafe/">initial skepticism</a>, and it became clear to me that Rust
was finally achieving what modern C++ had been striving for for so long:
a high-performance systems programming language that was also type-safe,
ergonomic, and composable.</p>
<p>And that is where I stand to this day: I believe that Rust is C++ done
right.</p>
Programming Portfolio https://www.thecodedmessage.com/programming-portfolio/ 0001-01-01T00:00:00+00:00
<p>This is a page where I link code that I have written that is publicly
available. Most of my professional work has been proprietary, and I have
not been much of a hobbyist programmer over my career (though I’m trying
to change that), so unfortunately most of my code doesn’t make it on here,
but there is still some!</p>
<p>My GitHub is <a href="https://github.com/jhartzell42"><code>jhartzell42</code></a>.</p>
<p>There is also code published on various articles on this blog. Any code
on this blog is covered under the <a href="https://www.thecodedmessage.com/license/">MIT license</a>.</p>
<h1 id="rust">Rust</h1>
<ul>
<li><a href="https://github.com/jhartzell42/holdem_rs"><code>holdem_rs</code></a> (hobby, 2024):
Some basic Texas Hold-Em (poker) code; we’ll see what happens with it.</li>
<li><a href="https://github.com/VorpalBlade/prefix-range/tree/main"><code>prefix-range</code></a> (hobby, 2023): Thank you <a href="https://vorpal.se">Arvid Norlander</a> for publishing my code from this <a href="https://www.thecodedmessage.com/posts/prefix-ranges/">blog
post</a> as a crate.</li>
<li><a href="https://gitlab.com/racepointenergy/rust_libraries/serde_dbus"><code>serde-dbus</code></a>
(professional, 2021-2022): This is a serialization
format for <a href="https://serde.rs/"><code>serde</code></a>
for the <code>DBus</code> messaging protocol’s <a href="https://dbus.freedesktop.org/doc/dbus-specification.html#message-protocol-marshaling">marshalling
format</a>.
It was necessary due to <a href="https://gitlab.freedesktop.org/dbus/zbus/-/issues/176">an
issue</a>
adapting the more common
<a href="https://gitlab.freedesktop.org/dbus/zbus/-/tree/main/zvariant"><code>zvariant</code></a>
to my employer’s specific needs. It currently only supports DBus messages
formatted in a “loosely typed” manner (with signature <code>a{sv}</code>) rather
than the strongly typed manner more commonly used with DBus, but this
suited the needs we had.</li>
<li>Pull requests accepted on open source projects:
<ul>
<li><a href="https://github.com/rustls/rustls/pull/1032"><code>rustls</code> support for querying by IP address</a></li>
<li><a href="https://github.com/meta-rust/cargo-bitbake/pull/34"><code>cargo-bitbake</code> reproducible mode</a></li>
<li><a href="https://gitlab.freedesktop.org/dbus/zbus/-/merge_requests/472"><code>zbus</code> can be used without <code>zvariant</code></a></li>
<li><a href="https://github.com/crossterm-rs/crossterm/pull/767"><code>crossterm</code> can be built without Windows deps in application <code>Cargo.lock</code></a></li>
</ul>
</li>
</ul>
<h1 id="c">C</h1>
<ul>
<li><a href="https://github.com/obsidiansystems/ledger-app-tezos/"><code>ledger-app-tezos</code></a>
(professional, mostly 2018): Although I am not personally a cryptocurrency
enthusiast, during my time working for <a href="https://obsidian.systems">Obsidian
Systems</a>, it was part of my job to
implement this app to support <a href="https://tezos.com/">Tezos</a> on the
<a href="https://www.ledger.com/">Ledger</a>. Though it was a group project,
I was the primary original author.
The original target platform, the Ledger Nano S, had only 4K of RAM.
I understand it has changed a lot since I originally worked on it.</li>
<li><a href="https://github.com/jhartzell42/compass"><code>compass</code></a> (hobby, 2009-2010): This was a
hobby project with a friend in college, and it is a bytecode interpreter
for a Smalltalk-like language in C, combined with a compiler in Python
and an “assembler” for the bytecode language also in C.</li>
</ul>
<h1 id="haskell">Haskell</h1>
<ul>
<li><a href="https://github.com/jhartzell42/reflex-chess"><code>reflex-chess</code></a>
(hobby, mostly 2019): This is a sample game written using
<a href="https://reflex-frp.org/">Reflex</a>. It only allows playing against
another local user. It has also been worked into the official
<a href="https://github.com/reflex-frp/reflex-examples"><code>reflex-examples</code></a> repo.</li>
<li><a href="https://github.com/jhartzell42/reflex-word-tiles"><code>reflex-word-tiles</code></a>
(hobby, 2022): A Wordle clone. Work in progress.</li>
<li>Pull requests accepted on open source projects:
<ul>
<li><a href="https://github.com/reflex-frp/reflex-dom/pull/358"><code>reflex-dom</code> bugfix</a></li>
<li><a href="https://github.com/reflex-frp/reflex-examples/pull/44"><code>reflex-examples</code> add chess</a></li>
</ul>
</li>
</ul>
Reading Log https://www.thecodedmessage.com/reading-log/ 0001-01-01T00:00:00+00:00
<p>These are books that I have finished, and links to reviews if I’ve written them.</p>
<h2 id="march-2024">March 2024</h2>
<ul>
<li><em>In Five Years</em>, Rebecca Serle</li>
</ul>
<h2 id="february-2024">February 2024</h2>
<ul>
<li><em>Stop Walking on Eggshells</em>, Paul T. Mason and Randi Kreger</li>
<li><em>A River Enchanted</em>, Rebecca Ross</li>
</ul>
<h2 id="january-2024">January 2024</h2>
<ul>
<li><em>The Galaxy and the Ground Within</em>, Becky Chambers</li>
<li><em>Dealing with Dragons</em>, Patricia C Wrede</li>
<li><em>I Hate You, Don’t Leave Me: Understanding the Borderline Personality</em>, Jerold Kreisman, Hal Straus</li>
<li><a href="https://www.thecodedmessage.com/posts/billion-americans/"><em>One Billion Americans: The Case for Thinking Bigger</em></a>, Matthew Yglesias</li>
<li><em>Mika in Real Life</em>, Emiko Jean</li>
</ul>
<h2 id="november-2023">November 2023</h2>
<ul>
<li><em>Exiles from Earth</em>, Ben Bova</li>
<li><em>What is Real? The Unfinished Quest for the Meaning of Quantum Physics</em>,
Adam Becker</li>
<li><em>Children of Ruin</em>, Adrian Tchaikovsky</li>
<li><em>The Power</em>, Naomi Alderman</li>
</ul>
<h2 id="october-2023">October 2023</h2>
<ul>
<li><em>She Who Became the Sun</em>, Shelley Parker-Chan</li>
<li><em>Children of Time</em>, Adrian Tchaikovsky</li>
</ul>
<h2 id="september-2023">September 2023</h2>
<ul>
<li><em>Red, White, and Royal Blue</em>, Casey McQuiston</li>
<li><em>What is Anarchism? An Introduction</em>, Donald Rooum <em>et al.</em></li>
</ul>
<h2 id="august-2023">August 2023</h2>
<ul>
<li><em>Light from Uncommon Stars</em>, Ryka Aoki</li>
<li><em>Record of a Spaceborn Few</em>, Becky Chambers</li>
<li><em>A Closed and Common Orbit</em>, Becky Chambers</li>
<li><em>Madam</em>, Phoebe Wynne</li>
<li><em>The Terraformers</em>, Annalee Newitz</li>
</ul>
<h2 id="july-2023">July 2023</h2>
<ul>
<li><em>Polysecure: Attachment, Trauma, and Consensual Non-Monogamy</em>, Jessica Fern</li>
</ul>
<h2 id="june-2023">June 2023</h2>
<ul>
<li><a href="https://www.thecodedmessage.com/posts/long-way/"><em>The Long Way to a Small, Angry Planet</em></a>, Becky Chambers</li>
<li><em>Once There Were Wolves</em>, Charlotte McConaghy</li>
</ul>
<h2 id="march-2023">March 2023</h2>
<ul>
<li><em>Circe</em>, Madeline Miller</li>
<li><em>What If? 2</em>, Randall Munroe</li>
<li><em>Beartown</em>, Fredrik Backman, tr. Neil Smith</li>
<li><em>God Is Disappointed in You</em>, Mark Russell and Shannon Wheeler</li>
<li><em>Money: The True Story of a Made-Up Thing</em>, Jacob Goldstein</li>
<li><em>Understanding Government Finance</em>, Brian Romanchuk</li>
<li><em>Modern Monetary Theory and the Recovery</em>, Brian Romanchuk</li>
</ul>
<h2 id="february-2023">February 2023</h2>
<ul>
<li><em>Red Inferno: 1945</em>, Robert Conroy</li>
</ul>
<h2 id="january-2023">January 2023</h2>
<ul>
<li><em>God: A Biography</em>, Jack Miles</li>
<li><em>Rubinrot</em>, Kerstin Gier</li>
<li><em>Quiet</em>, Susan Cain</li>
<li><em>How the Bible Was Built</em>, Charles Merrill Smith and James W. Bennett</li>
</ul>
<h2 id="november-2022">November 2022</h2>
<ul>
<li><em>A Memory Called Empire</em>, Arkady Martine</li>
<li><em>A Desolation Called Peace</em>, Arkady Martine</li>
</ul>
<h2 id="october-2022">October 2022</h2>
<ul>
<li><em>Unfamiliar Fishes</em>, Sarah Vowell</li>
<li><em><a href="https://www.thecodedmessage.com/posts/atomic-habits/">Atomic Habits</a></em>, James Clear</li>
</ul>
<h2 id="september-2022">September 2022</h2>
<ul>
<li><em>The Lamplighters</em>, Emma Stonex</li>
<li><em>Hollow Kingdom</em>, Kira Jane Buxton</li>
<li><em>Dune Messiah</em>, Frank Herbert</li>
</ul>
<h2 id="august-2022">August 2022</h2>
<ul>
<li><em>Kaiju Preservation Society</em>, John Scalzi</li>
<li><em>Buffering: Unshared Tales of a Life Fully Loaded</em>, Hannah Hart</li>
</ul>
<h2 id="july-2022">July 2022</h2>
<ul>
<li><em>Cold Clay</em>, Juneau Black</li>
<li><em>Monk & Robot</em>, Becky Chambers
<ul>
<li><em>A Psalm for the Wild-Built</em></li>
<li><em>A Prayer for the Crown-Shy</em></li>
</ul>
</li>
<li><em>Will My Cat Eat My Eyeballs?</em>, Caitlin Doughty</li>
</ul>
<h2 id="june-2022">June 2022</h2>
<ul>
<li><em>Hyperion Cantos</em>, Dan Simmons
<ul>
<li>(Had previously read 1 & 2)</li>
<li><em>Endymion</em></li>
<li><em>The Rise of Endymion</em></li>
</ul>
</li>
<li><em>Shady Hollow</em>, Juneau Black</li>
</ul>
<h2 id="may-2022">May 2022</h2>
<ul>
<li><em>Old Man’s War</em>, John Scalzi (re-read)
<ul>
<li><em>Old Man’s War</em></li>
<li>“Questions for a Soldier”</li>
<li><em>The Ghost Brigades</em></li>
<li>“The Sagan Diary”</li>
<li><em>The Last Colony</em></li>
<li>“After the Coup”</li>
<li><em>Zoe’s Tale</em></li>
<li><em>The Human Division</em></li>
<li>“Hafte Sorvalh Eats a Churro and Speaks to the Youth of Today”</li>
<li><em>The End of All Things</em></li>
</ul>
</li>
</ul>
<h2 id="april-2022">April 2022</h2>
<ul>
<li><a href="https://www.thecodedmessage.com/posts/comic-beer/"><em>The Comic Book Story of Beer</em></a>, Aaron McConnell,
Jonathan Hennessey, and Mike Smith</li>
<li><a href="https://www.thecodedmessage.com/posts/review-plain-truth/"><em>Plain Truth</em></a>, Jodi Picoult</li>
</ul>
<p>There is no way I can record books read before this, so I shan’t try.</p>
Recommended Reading for Programmers https://www.thecodedmessage.com/programming-rec-reading/ 0001-01-01T00:00:00+00:00
<p>Here are some resources I recommend for programmers. I am a <a href="https://en.wikipedia.org/wiki/Systems_programming">systems
programmer</a>, thus
my affection for Rust in particular and a general bias towards systems
programming topics.</p>
<h1 id="recommended-rust-resources--reading">Recommended Rust Resources & Reading</h1>
<p>Here are a few links to materials that helped me get my bearings
in Rust and understand it more deeply.</p>
<p>Many of these are bog-standard, and the same materials others would
recommend. This is a good thing. Rust is an extremely well-polished
programming language with an excellent community. Trust them about
the idioms. Trust the programming language. If you don’t understand
why Rust made a certain decision, know that there is almost certainly
a well-considered, important reason, and that includes its choices of
commonly-recommended documentation.</p>
<h2 id="beginners">Beginners</h2>
<ul>
<li>Of course, <a href="https://doc.rust-lang.org/book/">The Book</a> is the
absolutely indispensable canonical reference for the programming
language.</li>
<li><a href="https://www.oreilly.com/library/view/programming-rust-2nd/9781492052586/">O’Reilly’s
book</a>
is also well worth reading.</li>
<li>For those who like exercises,
<a href="https://github.com/rust-lang/rustlings">Rustlings</a> is incredibly well-done.</li>
<li>For those coming, like I did, from C and C++, <a href="http://cliffle.com/p/dangerust/">Learn Rust the Dangerous
Way</a> is a great resource.</li>
<li>Rust has a relatively small standard library. Some external dependencies
are practically standard, and have their own tutorials and
documentation. Among these
are <a href="https://tokio.rs/">Tokio</a>, <a href="https://serde.rs/">Serde</a>, and
<a href="https://docs.rs/log/latest/log/">log</a>.
<ul>
<li>There is <a href="https://blessed.rs/crates">a fuller list of “standard non-standard” crates</a></li>
</ul>
</li>
</ul>
<h2 id="intermediate">Intermediate</h2>
<ul>
<li><a href="https://nostarch.com/rust-rustaceans">Rust for Rustaceans</a> is a nice
“second semester” course in Rust, covering all the things that every
advanced Rust programmer really should know, and no longer has to learn
the hard way. It contains a lot of especially useful information on
the serious software engineering and dependency management considerations
involved in maintaining a library and publishing a crate.</li>
<li><a href="https://fasterthanli.me/">Faster than Lime</a>, as far as I can tell,
straddles beginner and intermediate.</li>
</ul>
<h2 id="unsafe-rust">Unsafe Rust</h2>
<ul>
<li>There are a lot of misunderstandings about <code>unsafe</code> in Rust, but most
can be cleared up by reading the
<a href="https://doc.rust-lang.org/nomicon/">Rustonomicon</a>. Even if you don’t
personally have occasion to use <code>unsafe</code>, it is an essential part of
the language, and the crates that you depend on – whose source code
you should be reading – will use it.</li>
<li><a href="https://rust-unofficial.github.io/too-many-lists/">Learn Rust With Entirely Too Many Linked
Lists</a> is a project
to teach Rust by programming the data structure it is perhaps least
well-suited to.</li>
<li>Ralf Jung on undefined behavior:
<ul>
<li><a href="https://www.ralfj.de/blog/2021/11/18/ub-good-idea.html">Undefined Behavior Deserves a Better Reputation</a></li>
<li><a href="https://www.ralfj.de/blog/2021/11/24/ub-necessary.html">Do we really need Undefined Behavior?</a></li>
</ul>
</li>
<li>Ralf Jung’s series on pointers/memory models:
<ul>
<li><a href="https://www.ralfj.de/blog/2018/07/24/pointers-and-bytes.html">Pointers are Complicated, or: What’s in a byte?</a></li>
<li><a href="https://www.ralfj.de/blog/2020/12/14/provenance.html">Pointers are Complicated II, or: We need better language specs</a></li>
<li><a href="https://www.ralfj.de/blog/2022/04/11/provenance-exposed.html">Pointers are Complicated III, or: Pointer-integer casts exposed</a></li>
</ul>
</li>
</ul>
<h2 id="articles">Articles</h2>
<ul>
<li><a href="http://cliffle.com/blog/rust-typestate/">Typestate Pattern in Rust</a></li>
<li>An introduction to <a href="https://oxide.computer/">Oxide</a>’s new operating
system and debugger, <a href="http://cliffle.com/blog/on-hubris-and-humility/">Hubris and Humility</a></li>
<li>Rust can help with <a href="https://aws.amazon.com/blogs/opensource/sustainability-with-rust/">the environment</a> as well</li>
<li>Not Rust, but systems programming relevant: <a href="http://www.catb.org/esr/structure-packing/">structure packing</a></li>
<li><a href="https://discord.com/blog/why-discord-is-switching-from-go-to-rust">Why Discord is switching from Go to Rust</a></li>
<li><a href="https://fasterthanli.me/articles/i-want-off-mr-golangs-wild-ride">A thorough treatise on why Go is bad</a></li>
</ul>
<h1 id="other-programming-reading">Other Programming Reading</h1>
<h2 id="essential-resources">Essential resources</h2>
<ul>
<li>For C:
<ul>
<li><a href="http://c-faq.com/">The (Usenet) C FAQ</a></li>
<li><a href="https://www.amazon.com/Programming-Language-2nd-Brian-Kernighan/dp/0131103628">The C Programming Language</a></li>
</ul>
</li>
<li>For C++: Scott Meyers’s <a href="https://www.aristeia.com/books.html">Effective C++ series</a></li>
<li>For OS: <a href="https://www.oreilly.com/library/view/design-and-implementation/9780133761825/">Design and Implementation of the FreeBSD Operating System</a></li>
<li>General-purpose introduction
<ul>
<li>I’ve not read <a href="https://dcic-world.org/2022-01-25/index.html">A Data-Centric Introduction to Computing</a>, but I understand it’s very good</li>
</ul>
</li>
</ul>
<h2 id="articles-1">Articles</h2>
<ul>
<li><a href="https://www.hillelwayne.com/">Hillel Wayne</a>’s
<a href="https://www.hillelwayne.com/post/are-we-really-engineers/">three</a>
<a href="https://www.hillelwayne.com/post/we-are-not-special/">part</a>
<a href="https://www.hillelwayne.com/post/what-we-can-learn/">series</a>
on how software engineers truly are engineers.</li>
<li>Hillel Wayne also has good things to say about why
<a href="https://www.hillelwayne.com/post/what-comments/">comments</a>
are <a href="https://buttondown.email/hillelwayne/archive/comment-the-why-and-the-what/">good actually</a>, for not only “why” but also “what.”</li>
</ul>
<h2 id="classics-for-the-vibes">Classics, for the vibes</h2>
<ul>
<li><a href="http://project.cyberpunk.ru/lib/in_the_beginning_was_the_command_line/">In the beginning was the command line</a> – please note that this is a book.</li>
<li><a href="https://www.dreamsongs.com/RiseOfWorseIsBetter.html">The Rise of Worse is Better</a> – please note that I am literally trying to unseat C++, and I am not an advocate of the Worse is Better philosophy</li>
<li><a href="https://web.mit.edu/~simsong/www/ugh.pdf">The UNIX-HATERS Handbook</a></li>
<li><a href="http://www.catb.org/jargon/html/">The Jargon File</a></li>
<li><a href="https://yosefk.com/c++fqa/">C++ Frequently Questioned Answers</a></li>
</ul>
Résumé https://www.thecodedmessage.com/resume/ 0001-01-01T00:00:00+00:00
<h1 id="jimmy-hartzell-systems-programmerhttpswwwthecodedmessagecomresumepdf"><a href="https://www.thecodedmessage.com/resume.pdf">Jimmy Hartzell: Systems Programmer</a></h1>
<p><strong>Phone</strong>: 646-334-9882,
<strong>Email</strong>: <a href="mailto:jah259@cornell.edu">jah259@cornell.edu</a>,
<strong>Website</strong>: <a href="https://www.thecodedmessage.com/">https://www.thecodedmessage.com/</a></p>
<h2 id="skills">Skills</h2>
<ul>
<li><strong>Programming languages</strong>: Rust, C++, C, Haskell, Swift, Python, Objective-C, Bash, x86 assembly (32 and 64 bit)</li>
<li><strong>Technologies</strong>: Linux systems/low-latency network programming,
Tokio, Reflex FRP, Yocto, AWS, Ledger Nano S, Redis, C++ template
metaprogramming</li>
</ul>
<h2 id="career-experience">Career Experience</h2>
<ul>
<li><strong>Amtrak:</strong> July 2023-Present, <em>Senior Principal Software Engineer</em>
<ul>
<li><strong>Technologies:</strong> C++, HP NonStop</li>
<li>Developed simulator for ITCS Positive Train Control protocol</li>
<li>Fixed bugs in HP NonStop dispatching codebase</li>
</ul>
</li>
<li><strong>Savant Systems:</strong> May 2021-June 2023, <em>Senior Embedded Linux Software Developer</em>
<ul>
<li><strong>Technologies:</strong> Rust (incl. Tokio), Yocto, Swift, Objective-C, Redis</li>
<li>Wrote usermode Rust driver for Atmel energy meter</li>
<li>Adapted quickly to a decades-old Objective-C codebase</li>
<li>Developed and implemented migration plans for core components of
system architecture</li>
<li>Rewrote Swift microservices and frameworks into Rust</li>
<li>Added caching layers around accesses to legacy key-value store,
and implemented bidirectional synchronization between it and Redis</li>
</ul>
</li>
<li><strong>Obsidian Systems:</strong> March 2018-May 2021, <em>Software Development Consultant</em>
<ul>
<li><strong>Technologies</strong>: Haskell, Reflex FRP, C, Ledger Nano S, Nix, C++</li>
<li>Full-stack Haskell application development</li>
<li>Worked with a variety of clients, with diverse corporate
cultures and organizational systems</li>
<li>Worked on Incremental View, a database research project for incremental
queries on Postgres</li>
<li>Wrote apps in embedded C on Ledger Nano S (a platform w/ 4K of RAM)</li>
<li>Refactored overengineered client C++ codebases</li>
<li>Did trainings and talks on C++, Rust, blockchain, and Haskell</li>
</ul>
</li>
<li><strong>Tower Research:</strong> June 2013-March 2018, <em>Senior Software Developer</em>
<ul>
<li><strong>Technologies</strong>: C++ (C++11, C++14), C++ template metaprogramming, Linux systems programming, <code>clang-format</code>, <code>valgrind</code>, <code>gdb</code>, FIX protocol, Intel64 assembly</li>
<li>Risk platform, C++ development (2017-2018):
<ul>
<li>Wrote a new high-performance logging system</li>
<li>Led a small team to add new trade reconciliation systems to
comply with EU regulations</li>
</ul>
</li>
<li>Lead training instructor (2016-2018):
<ul>
<li>Developed and taught full-time C++, networking, systems, and low-latency
programming curriculum for new hires in the US and India</li>
<li>Trained and mentored other instructors</li>
</ul>
</li>
<li>FX trading desk, C++ development (2013-2016):
<ul>
<li>Mentorship: First line of defense for team member questions</li>
<li>Continuously made latency improvements for market data handlers</li>
<li>Developed new aggregator project to aggregate internal liquidity</li>
<li>Owned support for FX “last look” feature</li>
<li>Wrote/maintained handlers for many financial protocols</li>
</ul>
</li>
</ul>
</li>
<li><strong>Moat:</strong> Feb 2011-March 2013, <em>Infrastructure Developer</em>
<ul>
<li><strong>Technologies</strong>: Python, C++, Bash, AWS, S3</li>
<li>Led a 3-member team to develop server discovery and deployment scripts</li>
<li>Scalable bloom filter implementation in C++</li>
</ul>
</li>
</ul>
<h2 id="education">Education</h2>
<ul>
<li><strong>Cornell University</strong>: <em>Bachelor’s in Computer Science</em></li>
</ul>
Rust Opinions
https://www.thecodedmessage.com/rust-opinions/
<h1 id="rust-style-guidelines">Rust Style Guidelines</h1>
<ul>
<li><code>cargo fmt</code> is your friend. I use the default settings.</li>
<li><code>clippy</code> is a great tool, and should be a requirement for getting PRs merged.</li>
<li><code>unsafe</code> is fine when called for, but use it carefully
<ul>
<li>It should be commented</li>
<li>It should be wrapped in a safe abstraction</li>
</ul>
</li>
</ul>
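<p>As a minimal sketch of what a commented <code>unsafe</code> block wrapped in a safe abstraction might look like: the function <code>first_and_last</code> below is a hypothetical example, not code from any real project. The bounds check happens in safe code, so callers of the public function cannot trigger undefined behavior.</p>

```rust
/// Returns the first and last elements of a slice, or `None` if it is empty.
///
/// `first_and_last` is a hypothetical name chosen for illustration.
/// The function itself is safe to call: the emptiness check below
/// guarantees the preconditions of the unsafe block.
pub fn first_and_last(items: &[i32]) -> Option<(i32, i32)> {
    if items.is_empty() {
        return None;
    }
    let last_index = items.len() - 1;
    // SAFETY: `items` is non-empty (checked above), so both index 0 and
    // `last_index` are in bounds.
    let pair = unsafe {
        (
            *items.get_unchecked(0),
            *items.get_unchecked(last_index),
        )
    };
    Some(pair)
}
```

<p>In real code, plain indexing would do the same job here; the point is only the shape: a <code>SAFETY</code> comment explaining why the invariant holds, and a safe signature around it.</p>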
<h3 id="error-handling">Error Handling</h3>
<ul>
<li>Panicking always indicates a bug
<ul>
<li>Especially in a library</li>
<li>Also in an application</li>
<li>Doesn’t mean you can’t call <code>panic</code>
<ul>
<li>But if that code path is actually activated, it’s a bug</li>
</ul>
</li>
</ul>
</li>
<li>Use <code>?</code> even in toy projects</li>
<li><a href="https://www.thecodedmessage.com/posts/2022-07-14-programming-unwrap/"><code>unwrap</code></a> does not belong in your repos. <code>expect</code> is acceptable.
<ul>
<li>It is too prone to abuse. <code>clippy</code> should be configured to ban it.</li>
<li>Use <code>expect</code> where <code>unwrap</code> <em>would</em> be appropriate
<ul>
<li><code>expect</code> is appropriate for indicating logic errors
<ul>
<li>A panic is always a bug, especially in a library</li>
<li><code>Mutex::lock()</code> is an appropriate place to use <code>expect</code></li>
</ul>
</li>
<li><code>expect</code> is not appropriate as a band-aid on bad flow control
<ul>
<li>Use <code>if let Some(x) =</code> rather than <code>is_some</code> followed by <code>unwrap</code></li>
</ul>
</li>
<li><code>?</code> should be preferred in most non-logic error situations
<ul>
<li>Or explicit handling</li>
</ul>
</li>
<li>Write a useful message with <code>expect</code>
<ul>
<li>To aid debugging if you were wrong</li>
<li>To show to the reader why you think it will never be called</li>
</ul>
</li>
<li>Consider writing panicking functions over calling <code>expect</code> repeatedly
<ul>
<li>Array indexes are a good example of this</li>
</ul>
</li>
</ul>
</li>
<li>This is controversial
<ul>
<li>Some experts think <code>unwrap</code> is OK in some situations where I allow <code>expect</code>
<ul>
<li>I think <code>unwrap</code> is too tempting for the situations where neither <code>expect</code> nor <code>unwrap</code> is OK</li>
</ul>
</li>
<li>No experts think <code>unwrap</code> is fine to use much more liberally than that</li>
</ul>
</li>
<li><code>unwrap</code> is OK in draft code where it is interpreted as an <code>XXX</code> marker
<ul>
<li>Don’t let this into your production codebase</li>
<li>Make sure you have the <code>unwrap</code> check in <code>clippy</code> enabled in CI so it doesn’t slip in</li>
</ul>
</li>
</ul>
</li>
<li>Use <a href="https://crates.io/crates/thiserror"><code>thiserror</code></a> for libraries,
use <a href="https://docs.rs/anyhow/latest/anyhow/"><code>anyhow</code></a> or <a href="https://crates.io/crates/eyre"><code>eyre</code></a> for binaries.</li>
</ul>
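<p>Here is a rough sketch of how a few of these guidelines might look in practice. The function names (<code>parse_port</code>, <code>describe</code>, <code>bump</code>) are made up for illustration, and the locking example assumes the locking bullet above is about <code>std::sync::Mutex::lock</code>, whose error case means another thread panicked while holding the lock.</p>

```rust
use std::num::ParseIntError;
use std::sync::Mutex;

// In a real crate, a clippy lint such as `clippy::unwrap_used` can be
// denied in CI to keep `unwrap` out; it is only mentioned here, not enabled.

/// Propagates the parse failure with `?` instead of unwrapping it.
pub fn parse_port(text: &str) -> Result<u16, ParseIntError> {
    let port = text.trim().parse::<u16>()?;
    Ok(port)
}

/// Uses `if let Some(..)` rather than `is_some()` followed by `unwrap()`.
pub fn describe(name: Option<&str>) -> String {
    if let Some(name) = name {
        format!("hello, {}", name)
    } else {
        String::from("hello, stranger")
    }
}

/// Increments a shared counter and returns the new value.
pub fn bump(counter: &Mutex<u32>) -> u32 {
    // A poisoned lock means some thread already panicked while holding it,
    // which is a bug by the rules above, so `expect` with a message that
    // explains the reasoning fits here.
    let mut guard = counter
        .lock()
        .expect("counter mutex poisoned: a thread panicked while holding it");
    *guard += 1;
    *guard
}
```

<p>The <code>expect</code> message does double duty: it tells a reader why the author believes the failure cannot happen, and it shows up in the panic output if that belief turns out to be wrong.</p>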
<h1 id="things-i-love-about-rust">Things I love about Rust</h1>
<h3 id="the-programming-language-itself">The Programming Language Itself</h3>
<ul>
<li>That it has proper sum types in <code>enum</code>s</li>
<li>That it has all the expressive power of C++ if you need it, while demarcating a safe subset</li>
<li>That Rust lifetimes have finally rescued “regions” from academia
<ul>
<li>And therefore made <a href="https://www.thecodedmessage.com/posts/raii">RAII complete</a></li>
</ul>
</li>
<li>That it doesn’t have OOP-style inheritance</li>
</ul>
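<p>To illustrate the first point, here is a small sketch of an <code>enum</code> used as a sum type. <code>Shape</code> and <code>area</code> are hypothetical names invented for this example. Each variant carries its own data, and the <code>match</code> has to handle every variant or the code will not compile.</p>

```rust
/// A sum type: a `Shape` is exactly one of these variants,
/// and each variant has its own fields.
#[derive(Debug, Clone, PartialEq)]
pub enum Shape {
    Circle { radius: f64 },
    Rectangle { width: f64, height: f64 },
    Point,
}

/// Computes the area of a shape.
///
/// The `match` is exhaustive: adding a new variant to `Shape` makes this
/// function fail to compile until the new case is handled.
pub fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rectangle { width, height } => width * height,
        Shape::Point => 0.0,
    }
}
```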
<h3 id="standard-library">Standard Library</h3>
<ul>
<li><a href="https://www.thecodedmessage.com/posts/rust-map-entry/">The <code>map</code> APIs</a> are quite excellent</li>
</ul>
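<p>As a quick sketch, assuming the APIs meant here include the <code>entry</code> method that the linked post's URL suggests: the hypothetical <code>word_counts</code> function below updates a <code>HashMap</code> with a single lookup per word, with no separate "check, then insert" step.</p>

```rust
use std::collections::HashMap;

/// Counts how many times each whitespace-separated word appears in `text`.
pub fn word_counts(text: &str) -> HashMap<String, usize> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        // `entry` looks the key up once; `or_insert` supplies a default if
        // the key was absent and returns a mutable reference either way.
        *counts.entry(word.to_string()).or_insert(0) += 1;
    }
    counts
}
```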
<h3 id="the-ecosystem">The Ecosystem</h3>
<ul>
<li>That everyone uses <code>log</code> but there are multiple backends available</li>
<li>That <code>cargo</code> is so ergonomic</li>
</ul>
<h1 id="how-i-wish-rust-had-been-made-differently-but-its-too-late-to-change">How I wish Rust had been made differently (but it’s too late to change)</h1>
<ul>
<li>I wish <code>unwrap</code> were not part of the standard library</li>
<li>I wish the C/C++ distinction between <code>.</code> and <code>-></code> were retained,
because I think sometimes automatically dereferencing and sometimes not
is surprising.</li>
</ul>
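<p>A small sketch of the inconsistency in that last point, using a hypothetical <code>Counter</code> type: method calls and field accesses with <code>.</code> dereference automatically through references and <code>Box</code>, while operators such as <code>+=</code> still need an explicit <code>*</code>.</p>

```rust
/// A hypothetical type used only to demonstrate dereferencing behavior.
pub struct Counter {
    pub count: u32,
}

impl Counter {
    pub fn get(&self) -> u32 {
        self.count
    }
}

/// `.` auto-dereferences: here it goes through both the `&` and the `Box`
/// without any `*` or `->`.
pub fn read_through_pointers(counter: &Box<Counter>) -> (u32, u32) {
    (counter.get(), counter.count)
}

/// Operators do not auto-dereference: writing `value += 1` would not
/// compile, so the explicit `*` is required.
pub fn add_one(value: &mut u32) {
    *value += 1;
}
```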