Polymorphism is a powerful programming language feature. In polymorphism, we have generic functions that don’t know exactly what type of data they will be operating on. Often, the data types won’t even all have been designed yet when the generic function is written. The generic function provides the general outline of the work, but the details of some parts of the work, some specific operations, must be tailored to the specific types being used. The generic code needs some way of accessing these specific operations, and the users of the generic code need some way of specifying them.

There are many use cases for polymorphism. When sorting an array, the algorithm will need to be adapted to the specific element type, so it knows how to compare elements. When drawing virtual objects on a screen, an algorithm might choose where to put each object and which objects to draw, whereas each type of object might have its own specialized implementation of how to draw it.

These are just two examples among many. Most complicated projects have many polymorphic functions. Even in languages that don’t support polymorphism directly, there are usually ways of building it out of existing primitives.

The example I’ve chosen is sorting, specifically sorting an array or vector. It’s just an example; a lot of what I say applies generally to how polymorphism works in each programming language.

This is a good example, as sorting is a function where it’s really obvious why polymorphism is required to get a properly generalizable algorithm. A lot of discussions of polymorphism invent contrived situations where polymorphism seems like overkill, and I think that’s fundamentally confusing.

On the other hand, it’s a bad example in some ways, because it only makes sense in the context of a homogeneous array or list, where every element is the same type. Heterogeneous containers, where each element may have a different type and the polymorphic function has to look up as many function implementations as there are elements, present a very different set of problems to solve.

This is especially important as Rust and C++ both provide two types of polymorphism, compile-time and run-time, also known as static and dynamic. The question of which to use is complicated, but for sorting, compile-time or static polymorphism is clearly the appropriate choice, with run-time or dynamic polymorphism feeling very awkward and forced. Heterogeneous containers generally must use some form of dynamic polymorphism (whether through virtual functions in C++ or through type erasure).

So, while I think this example will be illustrative, it won’t allow us to explore run-time, dynamic polymorphism on its home turf, if you will. Hopefully, I can make up this deficit in future blog posts.

Sorting: A Polymorphic Function#

Sorting algorithms are a true use case for polymorphism: rather than switching between a small, fixed set of cases, the algorithm works with arbitrarily many types that support the operations necessary for sorting, and it is agnostic to how those operations are implemented. Quick sort, insertion sort, and merge sort apply equally well to sorting integers, sorting floating point values, or alphabetizing strings – any algorithm can be combined freely with any type, or at least any type for which a concept of “ordering” exists.

Here are the operations or properties (or dare I say, traits) that a type needs in order to be sortable, and that a generic sorting algorithm might need to find out about. The first one is obvious to OOP programmers, but the other two are more subtle, and are implicit in many OOP programming languages:

  • Ordering or comparison: Given two values a and b, this operation answers which is greater, or determines that they are equal. Some types have the additional possibility that they are incomparable – arrays of those types cannot be sorted by most algorithms.
  • Swapping or moving: The data has to be able to be moved around to turn the unsorted array into a sorted one. This is automatic in many OOP languages for object types due to ubiquitous use of indirection. It is also automatic in Rust, where every type can be moved by just copying all the bytes.
  • Striding the array or size: Given a pointer to one element, how do you get to the next one? By how many bytes must you increment the pointer? Most sorting algorithms require this to be constant. If you use indirection for the values, this is also trivial. If you do not, it is key information.

These operations – or more generally, traits of a type – can then be combined with a sorting algorithm to create a concrete procedure to sort an array for a given concrete type.

So let’s see how various programming languages handle this.

Programming Language #0: Sorting in C#

I will start our tour of programming languages with C. C – the non-OOP, non-C++ programming language; the classic “portable assembly language” from 1972 – doesn’t have many polymorphic algorithms, algorithms that accept any type, because you have to implement polymorphism by hand. But sorting is an important enough one that standard C does have a generic sorting function: qsort, for quicksort (and on many systems, heapsort and mergesort are also available). Because polymorphism is implemented by hand, we can look at this function to see how one might specifically tailor polymorphism to the problem of sorting.

Here is the function signature for qsort:

void qsort(void *base, size_t nmemb, size_t size,
           int (*compar)(const void *, const void *));

It can be used to sort blocks of memory containing a sequence of integers, floating point values, or (pointers to) strings – any comparable and (trivially) movable fixed-size type.

C function signatures can be hard to read, so I’ll break it down argument by argument:

  • void *base: This is an untyped pointer (void *) to the beginning of the block of memory to be sorted.
  • size_t nmemb: This is the number of members (elements) contained in the block of memory. C often represents aggregates by two values: a base pointer and a count of the members.
  • size_t size: How big is each member? On a typical 64-bit system, an int is 4 bytes, a double is 8 bytes, and a char * pointing to a string is 8 bytes. Custom types might be any size. qsort should work for all of these types, without indirection.
  • int (*compar)(const void *, const void *): This is the interesting part. This is a function pointer for the comparison operation as discussed above. You write a function that takes two pointers to two elements, and returns a value that encodes their relationship.

Swapping is assumed to be byte-by-byte, and so size covers the last two attributes of the type listed above. The key one here is compar, a bit of code that qsort has to call to do an operation specific to your type, a small policy injection that adapts a generic algorithm to your particular type.

The return value of compar is an int, but it is interpreted according to a C convention, shared with (for example) the string comparison function strcmp. For a ? b, a return value r is interpreted thus:

  • if r < 0, a < b
  • if r > 0, a > b
  • if r == 0, a == b

So, here’s a complete C program that sorts its command line arguments – including the program name:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int compare_strings(const void *a, const void *b) {
    // `a` and `b` are pointers to the element type, which in
    // this case is `char *`. Thus they are `char **`.
    //
    // Nothing is stopping you from getting this wrong and putting
    // `char *` instead -- it will just silently not work. The
    // compiler can and will make you write `const` in the right
    // place, though.

    char * const* a_str_ptr = a;
    char * const* b_str_ptr = b;

    // `strcmp` uses the same convention as `qsort` for comparison.
    return strcmp(*a_str_ptr, *b_str_ptr);
}

int main(int argc, char **argv) {
    qsort(argv, argc, sizeof(char *), &compare_strings);

    for (int i = 0; i < argc; i++) {
        printf("%s\n", argv[i]);
    }

    return 0;
}

But the same qsort function can also be used to sort integers, if given different parameters and a different comparison function:

#include <stdlib.h>
#include <stdio.h>

int compare_ints(const void *a_vp, const void *b_vp) {
    const int *a_ip = a_vp;
    const int *b_ip = b_vp;

    int a = *a_ip;
    int b = *b_ip;

    if (a < b) {
        return -1;
    } else if (a == b) {
        return 0;
    } else { // a > b
        return 1;
    }
}

int main(int argc, char **argv) {
    int intary[10] = { 10, 9, 8, 7, 6, 5, 4, 3, 2, 1 };
    qsort(intary, 10, sizeof(int), &compare_ints);

    for (int i = 0; i < 10; i++) {
        printf("%d\n", intary[i]);
    }

    return 0;
}

qsort implements a form of manual run-time polymorphism, in a programming language with no built-in support for polymorphism. It behaves differently based on the element type, as passed to it via a variety of arguments. One of the traits – the comparison operator – differs between types in a way that requires custom code, and this is passed in via pointer. qsort then invokes the operation via indirect function call, the same mechanism that is used for polymorphism in OOP. But unlike OOP-style runtime polymorphism, there is just one function pointer for all the items, rather than each item coming with its own “vtable.”

Note that the optimizer is generally not able to eliminate this indirect call, particularly in the qsort case, where the sorting function is in the standard library while the function calling it and the comparison function are both in application code. This comes at a performance cost, which means that if you’re programming C and the performance of this particular sort is essential to your program, it might easily make sense to write custom sorting code that is not polymorphic.

Programming Language #1: Sorting in Java#

Java is about as far from C as you can get in this matter. C provides no abstraction or language features specifically for polymorphism, and in qsort we use a low-level tool it does provide – function pointers – to build it ourselves. In Java, however, the programming language is explicitly object-oriented, and so the whole programming language is designed to encourage you to leverage polymorphism, as that is one of the pillars of object-oriented programming.

The version of polymorphism available in Java is dynamic, run-time, “late binding” polymorphism, the kind of polymorphism that OOP favors. It is based on the idea of overriding methods, either from base classes or from interfaces that a custom type (a “class”) can implement.

As I mentioned before, this is not the best match for the problem of sorting, at least not the type of sorting we’re talking about. Run-time polymorphism means that every individual element could potentially have a different comparison procedure, which is unlikely to be what you want. The mere possibility of such a thing happening increases the cognitive load.

Nevertheless, Java does support polymorphic sorting, and it’s useful to discuss specifically because it does show how OOP-style polymorphism works when applied to such a problem.

There are many methods that do sorting in Java. Some of them take an explicit argument to convey how to do comparisons, just like the qsort example. But more commonly, we sort according to what Java refers to as the “natural order” of the elements, as (for example) in this overload of Collections.sort, with the following signature:

public static <T extends Comparable<? super T>>
void sort(List<T> list)

This sorts a list of elements of type T, where “list” in Java can refer to any of a number of collections that store data in order, such as a single allocated array (ArrayList) or a linked list (LinkedList). Therefore, it is not only polymorphic in how to compare the elements, but also in how to navigate through the list.

It needs to know about the same traits of type T that qsort does. Some are not polymorphic: for this method to make sense, we know that T must be a reference type, that it must be boxed (that is, it must use indirection), and that therefore the size of an element is always the natural pointer size of the platform, and swapping the element only involves swapping the pointers.

But there’s no getting around the polymorphism of comparisons, and so we see this strange annotation on the function signature: <T extends Comparable<? super T>>. This indicates that T must implement the interface Comparable (in this context, “implements” is spelled extends). Specifically, it must implement that interface in such a way that it can be applied to other elements of type T (which means that it uses T or some “supertype” of T).

The notation is complicated, because the semantics are complicated. Technically, T could be comparable to a parent type of T, and that would still work. In fact, T could refer to an entire class hierarchy of types derived from some base class, all of them comparable in different ways to objects elsewhere in the hierarchy and to objects derived from a yet further base class. Objects of type T could even be comparable to any arbitrary object – and all of this is covered by <T extends Comparable<? super T>>, which tries to express at compile-time what makes T a reasonable type to use for sorting.

But this is all just an extra check that the compiler can do at compile-time to prevent run-time errors, because all of the information on how to do the comparisons is available at run-time. In fact, other methods don’t use such formal prerequisites at all, preferring to query at run-time for appropriate interfaces, throwing an exception if they are not present.

In all of these cases, the comparison is the “natural ordering,” which is defined to mean that comparison is done through a Java interface. Specifically, these methods use the Comparable interface, which specifies a method, compareTo, that takes an implicit this parameter and an explicit parameter of the type being compared to and, like the comparison functions passed to qsort, returns an integer whose sign indicates whether the first value or the second was greater (with zero indicating equality).

This natural ordering is defined on a per-type basis. Each type can only implement Comparable once. Fortunately, the regular built-in types, all the ones we are likely to use, all come with good natural orderings. For example, this code all works:

import java.util.*;

public class Sort {
    public static void main(String[] args) {
        List<String> argList = Arrays.asList(args);
        Collections.sort(argList);
        for (String arg : argList) {
            System.out.println(arg);
        }

        List<Integer> list = new ArrayList<Integer>();
        list.add(1);
        list.add(3);
        list.add(2);
        list.add(4);
        Collections.sort(list);
        for (int i : list) {
            System.out.println(i);
        }
    }
}

See it in use:

$ java Sort b c a
a
b
c
1
2
3
4
$

It gets a little less coherent when we mix different types of object in the same list, which Java lets us represent in the type system by using Object, which is a type that can store a reference to any non-primitive (including boxed primitives):

import java.util.*;

public class Sort {
    public static void main(String[] args) {
        List<Object> list = new ArrayList<Object>();
        list.add(1);
        list.add("Hi");

        Collections.sort(list);

        for (Object i : list) {
            System.out.println(i);
        }
    }
}

While the Java runtime allows us to create such a collection, the type system does not allow us to use Collections.sort to sort it, as Object does not give us enough information to ensure these elements can properly be compared to each other (which in fact they cannot, as comparing strings to integers is not defined in Java’s “natural ordering”):

$ javac Sort.java
Sort.java:9: error: no suitable method found for sort(List<Object>)
        Collections.sort(list);
                   ^
    method Collections.<T#1>sort(List<T#1>) is not applicable
      (inference variable T#1 has incompatible bounds
        equality constraints: Object
        lower bounds: Comparable<? super T#1>)
    method Collections.<T#2>sort(List<T#2>,Comparator<? super T#2>) is not applicable
      (cannot infer type-variable(s) T#2
        (actual and formal argument lists differ in length))
  where T#1,T#2 are type-variables:
    T#1 extends Comparable<? super T#1> declared in method <T#1>sort(List<T#1>)
    T#2 extends Object declared in method <T#2>sort(List<T#2>,Comparator<? super T#2>)
1 error
$

So how does this work? What is a Java interface? What are its advantages or disadvantages?

Well, Java has two types of values: primitives on the one hand, and object references on the other. In order to use interfaces, or polymorphism at all, we must be dealing with objects. For primitives, there are separate methods for sorting various types of arrays in the Arrays class. As primitives cannot be stored directly in collections, Collections doesn’t have to deal with them.

So, to use this polymorphism through interfaces, we must be dealing with objects. Objects in Java are a rich, standardized data structure, which is why it’s possible to query at run-time which interfaces an object supports. Objects contain not just the fields that the Java programmer specifies, but additional metadata that includes implementations of any supported interfaces, including Comparable. That metadata can be used to find the right version of the compareTo method to use to sort objects of type T. Once we have a T, we can query it at run-time to find the compareTo method. Theoretically, Java might query every object separately as it sorts, with a separate query for each comparison, although I trust that modern Java will in many cases realize that the method will be the same for each object, and figure out a way to optimize it out.

As the programmer of a type, we simply declare at the top of the class that our type Foo, for example, implements Comparable<Foo>, and then lower down include our implementation of compareTo among our methods, marked with the @Override annotation. Based on that, Foo objects will be created with the correct metadata such that Java will know to use that method for comparison when sorting, whether the type is known at compile-time or only at run-time. We can even give compareTo a different ordering than the typical “natural ordering” one would expect from the state that is contained in a Foo:

import java.util.*;

public class Sort {
    private static class Foo implements Comparable<Foo> {
        int inner;

        public Foo(int inner) {
            this.inner = inner;
        }

        @Override public int compareTo(Foo foo) {
            // Less and greater are swapped by this compared to int
            // comparison
            if (foo.inner > this.inner) {
                return 1;
            } else if (foo.inner < this.inner) {
                return -1;
            } else {
                return 0;
            }
        }

        public String toString() {
            return "" + inner;
        }
    }

    public static void main(String[] args) {
        List<Foo> list = new ArrayList<Foo>();
        list.add(new Foo(3));
        list.add(new Foo(4));
        list.add(new Foo(1));
        list.add(new Foo(2));

        Collections.sort(list);

        for (Object i : list) {
            System.out.println(i);
        }
    }
}

Here is the output:

$ java Sort
4
3
2
1
$

Built-in types such as String and Integer already provide their own compareTo implementations, corresponding to the comparisons you would typically expect. Only the author of a type can define its natural ordering in this way. To get around this, you can use a wrapper type for each element (like Foo), or you have to fall back on passing in the comparison function the old-fashioned way, like in qsort – though in Java, passing in a function is accomplished through yet another interface, Comparator, as in this alternative overload:

public static <T> void sort(List<T> list,
                            Comparator<? super T> c)

Here, Comparator is effectively a function pointer with context, but it’s expressed as an interface so that you can write a concrete class that implements the desired function. Fundamentally, Rust and C++ do something similar.

So, how are we to evaluate this system? It’s not particularly well suited to situations like sorting. The run-time machinery is built for heterogeneous containers, where each individual element of a collection might have a different opinion on how to compare itself to the others. That amount of run-time flexibility is overkill for this situation.

Rather than providing one sorting function pointer, as in the C example, each object comes with its own infrastructure for finding out not only how to sort it, but how to do every other thing that Java might want to do polymorphically with that object, such as convert it to a string or hash it. While the infrastructure is well-optimized and performant under the assumption of heavy use of OOP-style polymorphism, it clearly doesn’t hold to the C++ or Rust performance ideal of not paying for what you don’t use, instead opting to pay an up-front cost under the assumption that any and all objects will regularly be used polymorphically, in OOP style.

The type system in Java is conceptualized as a way of preventing errors, a layer of safety on top of a more Smalltalk-like natural OOP state. In Smalltalk, any method can be invoked on any object, and it’s simply a run-time error if that method isn’t available. In Java, the types form a more rigorous layer of checks to make sure our method calls have correct semantics, allowing errors to be caught earlier, at compile-time (although Java type errors are also sometimes caught at run-time). The power of the more ideologically pure form of OOP is still available in Java, as evidenced by the signature of the Arrays.sort method alluded to above (and documented here). It is discouraged in favor of the type-checked overloads, but still possible:

public static void sort(Object[] a)

Here is a use case that succeeds:

import java.util.*;

public class Sort {
    public static void main(String[] args) {
        Arrays.sort(args);
        for (String arg : args) {
            System.out.println(arg);
        }
    }
}

Here is the output:

$ java Sort a c b
a
b
c
$

Here is a use that fails:

import java.util.*;

public class Sort {
    public static void main(String[] args) {
        Object [] array = new Object[2];
        array[0] = new Integer(0);
        array[1] = "Hi";
        Arrays.sort(array);
        for (Object obj : array) {
            System.out.println(obj);
        }
    }
}

It outputs:

Exception in thread "main" java.lang.ClassCastException: class java.lang.Integer cannot be cast to class java.lang.String (java.lang.Integer and java.lang.String are in module java.base of loader 'bootstrap')
	at java.base/java.lang.String.compareTo(String.java:125)
	at java.base/java.util.ComparableTimSort.countRunAndMakeAscending(ComparableTimSort.java:320)
	at java.base/java.util.ComparableTimSort.sort(ComparableTimSort.java:188)
	at java.base/java.util.Arrays.sort(Arrays.java:1249)
	at Sort.main(Sort.java:8)

The cost of this is acceptable in Java but not in Rust or C++, or C for that matter. Every object must contain individual metadata if it is to be sortable through a polymorphic function, and it must be boxed. In C++ or Rust, we must be able to sort arbitrary unboxed data, without extra metadata included directly within it. But in Java, all types except for primitives are boxed, only boxed types support polymorphism, and they support it at the cost of additional data in each heap allocation. And it works, given Java’s goals of being a garbage-collected OOP language with a layer of types to expose errors at compile-time.

As the C example shows, this cost isn’t intrinsic to run-time polymorphism in general, but it is intrinsic to OOP-style polymorphism. OOP uses run-time polymorphism at an individual object level as one of its core features, even when the function does not need to be conveyed on a per-element basis, but only once.

Programming Language #2: Sorting in C++#

C++, of course, supports this type of run-time polymorphism. We could, if we wanted, build a system like Java’s, where we had an abstract class Comparable that we could use to add run-time data to show every object of a type how to be compared with every other object. We could require that collections to be sorted contain classes that inherit from – in C++, inheritance and interface implementation are the same – Comparable. C++’s run-time polymorphism could be used to implement sorting in the exact same way as Java.

But that’s not how sorting is implemented in C++. Sorting, in C++, uses a completely unrelated mechanism of templates. Templates are C++’s mechanism for static, compile-time polymorphism, just as virtual functions and inheritance are C++’s mechanism for dynamic, run-time polymorphism (of a classical OOP variety that closely resembles Java). In spite of them both being forms of polymorphism, and having many overlapping use cases, templates and virtual functions are completely unrelated features.

I have seen people argue that templates and virtual functions are justified in being completely unrelated, because every situation clearly calls for one or the other. But if it’s possible to do sorting with run-time polymorphism, as we see from Java, then clearly the distinction is not as clear-cut as all that. What’s to stop a former Java programmer from using C++’s run-time polymorphism to implement their own sorting function a la Java, even though that’s not idiomatic C++? There’s clearly some level of overlap in use cases, even if not in semantics!
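Nothing stops them, in fact. Here is a minimal sketch (not idiomatic C++, with invented names) of what that Java-style approach might look like: an abstract Comparable base class, plus a hand-written sort that only ever sees Comparable pointers:

#include <cstddef>
#include <iostream>
#include <vector>

// Hypothetical interface, analogous to Java's Comparable.
struct Comparable {
    virtual ~Comparable() = default;
    // qsort/strcmp convention: negative, zero, or positive.
    virtual int compare_to(const Comparable &other) const = 0;
};

// Insertion sort over run-time-polymorphic elements.
void sort_comparables(std::vector<Comparable *> &items) {
    for (std::size_t i = 1; i < items.size(); i++) {
        Comparable *key = items[i];
        std::size_t j = i;
        while (j > 0 && items[j - 1]->compare_to(*key) > 0) {
            items[j] = items[j - 1];
            j--;
        }
        items[j] = key;
    }
}

struct Number : Comparable {
    int value;
    explicit Number(int v) : value(v) {}
    int compare_to(const Comparable &other) const override {
        // As in Java, mixing unrelated types here would fail at run time.
        const Number &rhs = dynamic_cast<const Number &>(other);
        return (value > rhs.value) - (value < rhs.value);
    }
};

int main() {
    Number a(3), b(1), c(2);
    std::vector<Comparable *> items{&a, &b, &c};
    sort_comparables(items);
    for (Comparable *item : items) {
        std::cout << static_cast<Number *>(item)->value << "\n"; // 1 2 3
    }
}

It works, but the elements must be accessed through pointers and every comparison goes through a virtual call, which is the overhead idiomatic C++ avoids by using templates instead.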

So, how do templates actually work?

Caveat for modern C++ fans: I’m going to save concepts for the end. They don’t actually substantially affect my point (as I will explain). I think it’s simpler to talk about pre-concepts C++ at first, and then discuss how concepts impact (or rather, don’t really impact) the equation.

Templates are a form of macro system. A template (class template, function template, type alias template, etc.) is given parameters at compile-time. Once the template is given parameters, it is instantiated and stamps out a concrete component of the program (a class, function, type alias, etc.).

So, that’s quite abstract. This is a situation where an example can help a lot. In line with our theme, we’re going to write a template that involves comparisons: given two values of any type that you can compare (and we’ll have to decide what that means), which is bigger?

template <typename T>
T max_value(T a, T b) {
    if (a < b) {
        return b;
    } else {
        return a;
    }
}

When we actually invoke it, we provide a type for T, giving us a specialized function where T is replaced by that type.

std::cout << max_value<int>(3, 4) << std::endl;
std::cout << max_value<std::string>("hi"s, "hello"s) << std::endl;

The mere mention of max_value<int> causes the compiler to create a concrete function max_value<int>, and likewise for max_value<std::string>. Each of these functions is the template with the bracketed template argument substituted for T.

Of course, for function templates, specifying the T is optional, as C++ can infer it, so this code works equally well:

std::cout << max_value(3, 4) << std::endl;
std::cout << max_value("hi"s, "hello"s) << std::endl;

So, what are the resulting functions? It’s very similar to as if we had written:

int max_value(int a, int b) {
    if (a < b) {
        return b;
    } else {
        return a;
    }
}

std::string max_value(std::string a, std::string b) {
    if (a < b) {
        return b;
    } else {
        return a;
    }
}

These are separate functions. The compiler will simply generate as many separate versions of max_value as it needs to. It outputs separate assembly language for each of them, and treats them as function overloads, meaning that it uses the static (compile-time) type of the parameters to figure out which function to call.

So, from the perspective of someone reading the code, we call max_value twice, and it figures out how to do its thing on an int or a std::string. It’s polymorphic, as it does the same algorithm (finding max) with an operation that changes based on type (<). But from the perspective of someone reading the outputted assembly, it’s not polymorphic – we’ve simply got two different functions that do max_value in two different ways.

In other words, we’ve gone from polymorphic code (compile time) to monomorphic code (run time). This is why Rust calls its equivalent to template instantiation “monomorphization.” This is also why it’s called “compile time polymorphism” – it is no longer polymorphic at run-time.

The advantage: This is a zero-overhead abstraction. We’re having the compiler write, on our behalf, specialized code for each type. We do not need each element to have virtual function metadata to indicate how to do comparisons, nor do we even need a function pointer like with qsort. It’s as optimal as specialized hand-written code, but we didn’t have to do the specialization.

The disadvantage: We have to know the type at compile-time. This prevents heterogeneous containers from being possible with this style of polymorphism. This type of polymorphism can only be based off of the compile-time type, not based off of changing run-time types. It is the exact opposite of “late binding” – the binding is done at compile-time. So, this could not be used for polymorphism over different types of widgets in a list of widgets.
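To make that concrete, here is a minimal sketch (with invented names) of the widget case from the introduction, which is exactly where C++’s run-time polymorphism (virtual functions) is the natural fit:

#include <iostream>
#include <memory>
#include <vector>

struct Widget {
    virtual ~Widget() = default;
    virtual void draw() const = 0; // each widget type draws itself differently
};

struct Button : Widget {
    void draw() const override { std::cout << "[Button]\n"; }
};

struct Slider : Widget {
    void draw() const override { std::cout << "[Slider]\n"; }
};

int main() {
    // A heterogeneous container: there is no single compile-time element
    // type, so a template parameter cannot express it.
    std::vector<std::unique_ptr<Widget>> widgets;
    widgets.push_back(std::make_unique<Button>());
    widgets.push_back(std::make_unique<Slider>());

    for (const auto &w : widgets) {
        w->draw(); // resolved at run time through the vtable
    }
}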

The other disadvantage: Compile times take longer and the resultant binary is larger. (Eh, shrug.)

So what operations are needed to support this template? What definition are we using for “comparable type” for T? We’re not explicitly using any at all, but note that if the type T doesn’t support the < operator, this code will simply fail to compile:

class Foo {
};

max_value(Foo{}, Foo{});

Giving the error:

test.cpp: In instantiation of ‘T max_value(T, T) [with T = main()::Foo]’:
test.cpp:22:14:   required from here
test.cpp:8:11: error: no match for ‘operator<’ (operand types are ‘main()::Foo’ and ‘main()::Foo’)
    8 |     if (a < b) {
      |         ~~^~~

This goes away if we give it the < operator.

class Foo {
public:
    bool operator <(const Foo &other) const {
        return false; // All Foos are created equal!
    }
};

max_value(Foo{}, Foo{}); // Now compiles

If we’d written max_value using > instead, however, defining operator < would not have made the error message go away. It turns out that < is the conventional operator to use for comparisons: the C++ equivalent to Java’s Comparable, the defining function for “natural order” by convention.

Is that all that’s required to make max_value work? It turns out no, as many an astute C++ programmer has probably already noticed. There is another operation besides operator< required to make max_value work, and this is because I intentionally made a mistake (so I could reveal it later to show how subtle templates can be).

Let’s take a look at the instantiation for std::string again, just the signature:

std::string max_value(std::string a, std::string b);

Is that how we’d write max_value by hand for std::string? No, we wouldn’t. We’d write const std::string &a, and take it by reference, so that no new string objects are created in the call and the return. If you’re not a C++ programmer, this might seem shocking, but max_value as we wrote it requires the type to be passable by value, which is a capability that a type might not have:

class Foo {
public:
    Foo() = default;
    Foo(const Foo&) = delete;
    bool operator <(const Foo& other) const {
        return false; // All Foos are created equal
    }
};

max_value(Foo{}, Foo{}); // Error! Error!

So, we missed the mark! We had an extra requirement besides comparison, and we can fix that by taking the parameters by (const) reference (which is what std::max does anyway), which also implies returning by reference:

template <typename T>
const T &max_value(const T &a, const T &b) {
    if (a < b) {
        return b;
    } else {
        return a;
    }
}

So what was required from T for us to call max_value?

In one sense, nothing besides that it should be a type! We could pass any type in for T, and the compiler will plug in the type and chug away, running into errors only once it has attempted to do so! This might actually happen several template instantiations deep, and the resulting error shows up in the template where the operation is attempted, not where you use the template with an inappropriate type, which can be confusing.

In another sense, what is required is that we pass types that make max_value compile, so in this case, ones that support operator <. However, there is no guarantee or check that the type is making the semantic promises that correspond to that syntax. Sorting, for example, requires that the operator define a strict weak ordering. If the operator doesn’t in fact do that, std::sort will compile but won’t work properly.

It seems reasonable in this case to expect people to use operator < for less-than as it’s such a well-established and fundamental operator. But templates can also invoke named methods. What if somebody writes a template that calls some_t.foo() expecting it to do one thing, and someone calls that template with an unrelated class that has a type-compatible foo method, but with different semantics? There is no indication to the compiler, when you write the class, that you intend for foo to be appropriate for use in the template. We didn’t have to say, when we wrote Foo here, that our operator < was valid for std::sort.

Concepts do help with that. You can statically assert that a class supports a concept’s requirements, and that documents your intention to support it semantically as well. Concepts can also cover stricter requirements than a template incidentally imposes, and help document the semantics of templates.
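For example, here is a minimal sketch, assuming C++20, of how a concept (with an invented name) could state max_value’s requirement up front:

#include <concepts>

// Hypothetical concept naming the one operation max_value actually needs.
template <typename T>
concept LessThanComparable = requires(const T &a, const T &b) {
    { a < b } -> std::convertible_to<bool>;
};

template <LessThanComparable T>
const T &max_value(const T &a, const T &b) {
    if (a < b) {
        return b;
    } else {
        return a;
    }
}

// Calling max_value with a type lacking operator< now fails with an error
// naming the unsatisfied concept at the call site, rather than deep inside
// the template body.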

But everything about concepts is opt-in; you can always write a template that will sometimes fail on instantiation. And that makes them much less useful in my book. Don’t get me wrong: I’m glad they exist. I think C++ with concepts is better than C++ without concepts. But it only goes so far, especially when compared with Rust traits, which are mandatory for Rust’s form of compile-time polymorphism.

More relevant than all of this, to me, is that templates and OOP polymorphism work so differently from each other. Run-time polymorphism and compile-time polymorphism are just completely different beasts. Students are taught OOP-style run-time polymorphism, and that doesn’t really help them understand templates, or even get started doing so. Again, I feel C++ is too big.

But, at least it has this zero-overhead abstraction, without requiring a method look-up and an indirection for every item to be sorted.

std::sort, by the way, takes iterators. These iterators must satisfy ValueSwappable and LegacyRandomAccessIterator, and that’s just a subset of the requirements, as seen on std::sort’s cppreference page. The way to get from one element to another (and therefore, implicitly, the size), the way to swap elements, and the way to compare them are all implicitly derived from RandomIt, the type parameter specifying the type of the iterator (at least in the overloads of std::sort that do not take an explicit comparator).
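To round out the picture, here is a short sketch of what calling std::sort looks like, both with the natural operator< ordering and with an explicit comparator:

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

int main() {
    std::vector<std::string> words{"b", "c", "a"};

    // Uses operator< on std::string, the "natural" ordering.
    std::sort(words.begin(), words.end());

    // Or pass a comparator explicitly; this one reverses the order.
    std::sort(words.begin(), words.end(),
              [](const std::string &x, const std::string &y) {
                  return y < x;
              });

    for (const auto &w : words) {
        std::cout << w << "\n"; // c, b, a
    }
}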

Programming Language #3: Sorting in Haskell#

Now for Haskell!

We’re mostly talking about Haskell to move on to talking about Rust, as this is a Rust-focused blog. There’s a lot going on with Haskell typeclasses that I won’t have time to get into here.

Haskell is where Rust got traits from, although Haskell calls them typeclasses. Incidentally, Haskell uses run-time polymorphism where Rust uses compile-time polymorphism, but the semantics are more similar than you might expect from that statement.

In Haskell, like Java, all types that sort accepts are boxed, covering size and swapping among the traits that might need to be customized. Unlike Java, the operations we need to perform on values of this type are passed to sort once, rather than looked up on a per-element basis.

Here is the type for sort:

sort :: Ord a => [a] -> [a]

a here is like T in C++: a type variable that can be replaced with any type. As in Java, this is subject to type erasure: sort just operates on generic boxed values. Any comparison-specific operations it needs come from the Ord a =>, which constrains a to types that have instances of the Ord typeclass.

Here is the definition of Ord:

class  (Eq a) => Ord a  where
    compare              :: a -> a -> Ordering
    (<), (<=), (>), (>=) :: a -> a -> Bool
    max, min             :: a -> a -> a

    compare x y = if x == y then EQ
                  -- NB: must be '<=' not '<' to validate the
                  -- above claim about the minimal things that
                  -- can be defined for an instance of Ord:
                  else if x <= y then LT
                  else GT

    x <  y = case compare x y of { LT -> True;  _ -> False }
    x <= y = case compare x y of { GT -> False; _ -> True }
    x >  y = case compare x y of { GT -> True;  _ -> False }
    x >= y = case compare x y of { LT -> False; _ -> True }

        -- These two default methods use '<=' rather than 'compare'
        -- because the latter is often more expensive
    max x y = if x <= y then y else x
    min x y = if x <= y then x else y

It defines many methods that an instance of Ord can support. These methods are functions defined in terms of each other; you must specifically implement at least one of them for your type to prevent infinite regress. Minimally, either compare or <= is sufficient, with compare recommended for more complex types.

Unlike in C++, when you define these methods, it is not enough to simply define a function called <= or compare. Haskell won’t even let you define functions with the same fully qualified name as the methods, which exist in the same namespace as any other functions. Unlike C++, Haskell does not have function overloading, and any time the same fully qualified name has different semantics for different types, it is through this mechanism of typeclasses. Like in Java, you have to explicitly declare your intention to implement the methods as found in Ord, by writing an instance explicitly, like so:

import Data.Ord
import Data.List

data Foo = Foo Integer
    deriving Show

instance Eq Foo where
    (Foo a) == (Foo b) = a == b

instance Ord Foo where
    (Foo a) <= (Foo b) = b <= a

main = do
    let list = [Foo 3, Foo 4, Foo 2]
    print $ sort list                   -- outputs [Foo 4,Foo 3,Foo 2]

Note that the instance declarations are separate from the definition of the type! The module where the type is declared can define them, but so can the module where the typeclass is declared. Other modules are discouraged from doing so (such “orphan instances” typically draw a compiler warning), to help ensure there is only one canonical instance for a given type and typeclass.

How does this actually work then? Well, Ord a is a secret parameter to sort. Haskell will create a bundle of function pointers for us that represent the specific Ord instance for whatever type we pass to sort, either from knowing the type statically at that point, or passing along a bundle passed into whatever called sort. So this compiles to something quite similar to the C qsort (at least as far as polymorphism is concerned), taking in a comparison function. The big difference is, Haskell will choose the comparison function for us – but it is one comparison function, not one comparison function per item as in Java.

Programming Language #4: Sorting in Rust#

So, how does Rust do all of this?

As I said, a Rust trait is very much like a Haskell typeclass. Rust’s main sort method, like Haskell’s, requires the Ord trait. Like Haskell, it even has provided (but overridable) methods as well as required methods:

pub trait Ord: Eq + PartialOrd {
    // Required method
    fn cmp(&self, other: &Self) -> Ordering;

    // Provided methods
    fn max(self, other: Self) -> Self
       where Self: Sized { ... }
    fn min(self, other: Self) -> Self
       where Self: Sized { ... }
    fn clamp(self, min: Self, max: Self) -> Self
       where Self: Sized + PartialOrd { ... }
}

As with typeclasses, indicating that a type has a trait requires a specific block that says which trait we’re implementing and lists the implementations of the required methods. Like in Haskell, that block may reside in the crate where the trait is defined, or the crate where the type is defined. Like in Haskell, this allows us to add polymorphism to pre-existing types without having to create wrapper types.

Here is an example of implementing this trait (unfortunately, we have to implement both Ord and PartialOrd):

use std::cmp::Ordering;

#[derive(PartialEq, Eq, Clone, Copy, Debug)]
struct Foo(u32);

impl PartialOrd for Foo {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        other.0.partial_cmp(&self.0)
    }
}

impl Ord for Foo {
    fn cmp(&self, other: &Self) -> Ordering {
        other.0.cmp(&self.0)
    }
}

fn main() {
    let mut foos = vec![Foo(3), Foo(4), Foo(1), Foo(2)];
    foos.sort();
    println!("{:?}", foos); // Displays [Foo(4), Foo(3), Foo(2), Foo(1)]
}

It’s very similar to Haskell, but with “C-like” syntax and aesthetic. The syntax for the functions using the trait looks like C++ templates:

fn max<T: Ord>(a: T, b: T) -> T {
    if b > a { b } else { a }
}

What’s different from Haskell is how it’s implemented. The semantics are quite similar, and the Rust implementation can be thought of as an optimization of the Haskell semantics. Instead of passing in to sort() a secret run-time parameter with Foo’s implementation of Ord, the function is monomorphized. We can think of it as inlining just that one parameter at compile-time, and generating a specialized function.

Yes, this implementation is fundamentally very similar to C++’s implementation of templates. It’s basically the same in terms of machine code and resulting optimizations. But the semantics are more Haskell-like. Polymorphic functions are type-checked once. They may only use functionality incorporated in the traits at hand. Type-checking is not postponed until template instantiation.

What’s more, the same mechanism is also used for Rust’s run-time polymorphism, where we can have a type like dyn MyTrait for traits that are object-safe. These trait object types are like OOP polymorphic types in that each value is paired with a table of polymorphic functions, but that table lives outside the original object: it is a property of the pointer, not of the object, implemented with fat pointers.

Like with any other trait, the trait implementation is separate from the type definition or the trait definition (though it must live in the same crate as one of them). Unlike C++, there is one system for polymorphism that can be used in both run-time and compile-time ways, with overlap where possible.

Conclusion#

I hope this shows, if nothing else, that polymorphism itself can take many forms in many programming languages beyond the OOP variety of it. The OOP variety is in some senses self-propagating – if you optimize your language for it, as in Java, then it makes sense to use it for everything, even if it’s not what you would choose in a language that has other options.

In many forms of polymorphism (C++ templates, Haskell typeclasses, Rust traits), no inheritance is necessary. Polymorphism there is simply not built according to the OOP frame of mind. I personally think Haskell and Rust are doing it right here, as is perhaps obvious from how I’ve written about it.

I hope to write more about run-time polymorphism in Rust, and how it differs from the C++ variety, and how you can manually implement other types of run-time polymorphism if you want. This would be a future post. But, this is a hobby blog, so no promises on timeline!