Rust is a polarizing programming language because of how radical it is. It has gone the furthest in introducing features from functional programming languages into the mainstream world, while ignoring long-held programming language design principles from the realm of object-oriented programming. Its fans can be very enthusiastic, sometimes off-puttingly so, stereotypically demanding that all software be rewritten in Rust even when completely unfeasible -- a stereotype that is mostly untrue, but whose existence and occasional true examples show the intensity of the debate. But a lot of the criticism of Rust comes specifically from C++ programmers, and correspondingly a lot of Rustaceans' criticisms of other programming languages are directed specifically at C++ (including, of course, in this book). Even the creator of C++, while not mentioning Rust by name, entered the fray (and along with other Rustaceans, I responded; my response is included in the appendices).

There's a good reason for this particular rivalry. While usable in other domains, Rust is strongest where C++ has hitherto been unopposed: as a high-level systems programming language. Many of Rust's greatest strengths are directly based on ideas that originated in C++, such as RAII. And Rust has, in many ways, the same goals that C++ has.

Specifically, in this book I shall argue that Rust has the exact same overall goal that C++ does, albeit with a different interpretation of how that goal is best accomplished. I will further argue that Rust does a better job of accomplishing these goals. Thus, the thesis of this book is a slightly longer version of the title:

Rust is a better C++ than C++, as it is better at C++'s own goals.

Zero-Overhead Abstractions

C++ has an explicit goal of providing zero-cost abstractions.

This is a bit of a confusing term of art and has the potential to be misleading, but it comes with explanations that clarify it somewhat. It is also referred to as the "zero-overhead principle," which Dr. Bjarne Stroustrup, father of C++, describes (see pg. 4) as containing two components:

  • What you don’t use, you don’t pay for (and Dr. Stroustrup means "paying" in the sense of performance costs, e.g. in higher latency, slower throughput, or higher memory usage)
  • What you do use, you couldn’t hand code any better

There is also an executive summary of the concept at CppReference.com.

A clearer term that is occasionally used in the trenches in the C++ community is "zero-overhead abstraction" -- there is zero overhead, defined as cost in addition to what a reasonably-well hand-coded implementation would do. Using this term, a third principle becomes clearer, which was hidden all along, unstated among those other two principles, and against which those other two principles are balanced. The word "abstraction" is the key, and the third principle is:

  • You can still get the abstractive and expressive power you expect from a modern programming language.

This third principle is necessary to distinguish higher-level "zero cost" languages like C++ and Rust from lower-level languages like C.

To fully explain why I include this third principle, and to delve into the history of the concept in general, I want to talk more about C.

C: The Portable Assembly

C has often been described as a "portable assembly language." Unlike other high-level programming languages before it ("high level" at the time meaning anything higher level than raw assembly language), it exposed users directly to gnarly machine-level concepts like pointers, and to common assembly-language capabilities like shifting and bitwise operators.

The goal was to give the programmer something minimally distinct from assembly language, where the programmer had almost as much control over the computer as an assembly language programmer, without sacrificing portability. Few higher-level features have been added, even now: there is no built-in string type, and only a limited array type that exposes the underlying concept of pointers the instant you poke at it. Structures are little more than a way of calculating offsets, and memory management is done by explicitly invoking memory management routines.

C's preference, in general, was to only add onto assembly those features absolutely necessary for portability, and not to impose any other structure on the programmer -- or, said another way, not to provide any other structure to the programmer.

This was far from an iron-clad rule, and there are definitely exceptions: C, at the language level, prefers null-terminated strings (also known as "C strings") to arrangements that store an explicit length, a substantial constraint on the programmer beyond assembly language and probably a mistake overall.

More deeply, and probably less avoidably at the time, C assumes a traditional call structure. Many techniques that can be used to implement closures, co-routines, or other more radical alternatives to a call stack are difficult to impossible to do with standard C -- while generally being possible in any assembly language.

But, with these exceptions, C generally tends to provide only one overarching abstraction, portability, and where it does, it has the same zero-cost goals that C++ has: to make the user pay only for the abstractions they actually use, and to provide those abstractions as efficiently as the equivalent hand-coded assembly.

Put another way, C++'s zero-overhead principle, as Dr. Stroustrup defines it, is more or less inherited from C. Where C++ differs from C is in the "abstraction" part of providing "zero-cost abstractions." Everything you can do in C++ you can do in (potentially tedious, repetitive, and error-prone) C, but C++ provides more abstractions, beyond just what is necessary for portability.

And C does a great job of portability! But missing from that goal is anything about a generally usable set of abstractions. Some abstraction over assembly language is necessary to ensure portability, but C doesn't really go beyond that. It provides a standard library, but again, that's for portability purposes. The C standard library is not much more powerful than an assembly-language operating system API, just more standardized. C is portable assembly, not abstracted assembly.

C++: A More Abstracted C

C++ goes beyond that. C++ tries to be competitive. Before, we dissected "zero-overhead abstractions" into three goals:

  • What you don’t use, you don’t pay for
  • What you do use, you couldn’t hand code any better
  • We give you the power of abstraction expected for a programming language of the day

But really, they are one goal. The essential goal of C++ over C is this:

  • Create a competitive set of abstractions that cost no more than manual implementation in C or assembly, whether in terms of speed or memory or any other resource.

This includes the entire principle of zero-overhead abstraction. I include the word "competitive" to describe how C++ in fact approaches which abstractions it offers, and how C does not: in competition with other programming languages, which directly implies our third point above, about the abstractions expected of a modern programming language.

This general single goal also implies the second point, that what you don't use you shouldn't pay for. If a feature is not used, a C programmer would simply not implement it and pay no cost for it. So if the C++ programmer pays any cost for it at all when they don't use it, that cost is all overhead.

Now we have C++ down to one coherent, easily-stated goal. And once we understand this, everything else about C++ makes sense.

C++ was originally christened "C with Classes," and it tried to add Object-Oriented Programming to C. All the mechanisms of OOP could be portably added to C directly by an application or library developer with judicious use of function pointers and structure nesting (and glib is a famous example of a library that does exactly that), but C++ built this abstraction into the programming language itself.

Objective-C also did this (and according to Wikipedia it "first appeared" one year sooner in 1984), but Objective-C has always felt like two programming languages glued together. In Objective-C, the object-oriented features do not inherit the zero-overhead principle from C -- nor do they look like C at all. They look instead like a Smalltalk dialect, where switching between C and this odd Smalltalk dialect was permitted on an expression-by-expression basis using an odd mix of square brackets and @-signs.

In C++, the added abstractions, including OOP, take on more of a resemblance to C, and importantly, continue to try to retain C's advantages in systems programming by making the new features zero-overhead.

During much of the history of C++, OOP was considered to be the most important abstraction that a programming language could offer. But once it was added, it expanded the scope of C++'s abstractions. Nowadays, C++ is considered multi-paradigm, and provides not just OOP, but a wide array of abstractions.

The only features C++ rejects out of hand are those that do not jibe with zero-cost abstraction. This is why garbage collection is not offered in C++ (though it is still possible to implement manually) -- it cannot be offered in a zero-cost way. However, C++'s alternative to garbage collection, namely RAII, has continued to become more effective as new features like move semantics and std::unique_ptr have been added, to the extent that in modern C++ it would be unimaginable not to have those features, and they have become essential to C++'s memory management model.

This also explains why C++ keeps accruing new features -- to maintain a competitive set of abstractions -- whereas C keeps the features it has -- because the only thing it needs abstractions for is portability. It explains why C++ had to add templates -- as a zero-cost alternative to OOP, and a zero-cost way of implementing collections. It explains why C++ had to add move semantics -- because without them, RAII is a worse abstraction than GC.

These are great features that C++ has had to develop to achieve these goals.

RAII is a genius solution to the problem that garbage collection is not zero-overhead. With RAII, the source code looks simpler than in C, and is harder to get wrong. It's not quite as straightforward as in a garbage-collected language, but it has many of the benefits of abstraction: you can't just forget to call the destructor.

But! From the compiled binary, modulo such nit-picky accidents as symbol names, you wouldn't be able to tell it was written in C++ instead of C! The abstraction was resolved and disappeared at compile-time.

Similarly with templates, and therefore with the collections in the STL. Similarly with many other C++ features.

And Rust has greatly benefitted from all of this innovation in C++. C++ is one of the giants on whose shoulders Rust stands. Rust inherits RAII from C++. Rust inherits the core idea of templates as well, though it constrains them (as trait-bounded generics) and compiles them away through "monomorphization."
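
To make "monomorphization" concrete, here is a minimal sketch (the function name is invented for illustration): a generic function is compiled into a separate, fully concrete copy for each type it is used with, much like a C++ template instantiation, so the abstraction itself adds no run-time cost:

use std::fmt::Display;

// The compiler generates ("monomorphizes") a separate, concrete version of
// `print_twice` for each type it is called with, exactly as if we had written
// those versions by hand.
fn print_twice<T: Display>(value: T) {
    println!("{}", value);
    println!("{}", value);
}

fn main() {
    print_twice(42);      // instantiates print_twice::<i32>
    print_twice("hello"); // instantiates print_twice::<&str>
}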

And Rust needs these, because Rust is also striving to be the kind of programming language where the compiled binary looks like something someone could have written in C, but where the programmer actually had a much easier task. And I will argue that Rust does a better job.

Wrinkles: OOP and Safety

There are a few wrinkles in this, though -- a few features in each programming language that seem to detract from this goal, to undermine the idea that it is truly a focus for them at all. On the C++ side, we have OOP and virtual methods, which are often less performant than the equivalent hand-written C code would be. On the Rust side, we have safety: array indexing, by default, panics when the index is out of bounds. What is "safety," and can Rust really be said to be interested in minimizing the cost of abstractions if it's also trying to achieve safety?

I know these seem like wildly unrelated issues, but they're actually connected. Both C++ and Rust are trying to have their cake and eat it too. They're trying to provide all of the conveniences of a modern high-level programming language, while outputting binaries equivalent to those a C programmer would make.

But safety and OOP actually do have non-zero overhead. OOP has virtual functions, preventing inlining and other optimizations and requiring indirect function calls. Safety, for its part, requires bounds checking, an obvious non-zero overhead. And in Rust, it constrains heap usage to certain layouts that are often less efficient than the ideal layout would be.

So why do C++ and Rust make this decision? And how do they justify it?

For most of C++'s history, OOP was an essential convenience of a high-level programming language. Everyone's buzzwordy design patterns were conceptualized in OOP. Without OOP, a programming language could not at all be taken seriously at the time, as it was widely believed that scalable software architecture and intuitive reasoning about code required OOP.

And throughout much of this time as well, C programmers were building similar structures by hand, so OOP code in C++ really was like code a C programmer would write! It was very popular to write structs of function pointers, or other complex mechanisms to allow "object-oriented design" into C programs. This is the entire premise of GObject, now part of GLib and used by GTK.

Nevertheless, OOP has run-time costs. Run-time polymorphism (known in C++ as "virtual functions") is one of the pillars of OOP, the basis of software decision-making, and a powerful abstraction. In most OOP programming languages, this meant (and still means) there is a (non-zero cost) run-time decision made at every method call.

C++ maintains its "zero-overhead abstraction" principle here on a technicality: by making the feature optional. Rather than not including the feature, C++ makes you pay for it only if you use it. C++ kept its claim to being low-level by making virtual an opt-in keyword, rather than the default. As we have said, not having it at all would've utterly disqualified C++ as an application programming language. It was considered necessary for usable abstractions.

After all, in programs where C++ programmers get to just write virtual, C programmers were implementing all of OOP, including runtime polymorphism, by hand.

Rust, however, starts over trying to accomplish this goal from a much more recent time. Now, OOP is no longer a sine qua non of programming languages -- in fact, it's become old-fashioned. Rust uses features from the functional programming paradigm instead to manage abstractions and give users a chance at wrangling a large codebase. In doing so, Rust has largely moved beyond the need for OOP.
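
As a rough sketch of what that looks like in practice (the trait and types here are invented for illustration): traits provide the polymorphism, and each use site chooses between static dispatch (generics, monomorphized, no indirect calls) and dynamic dispatch (dyn, an explicit opt-in to a vtable, much like virtual in C++):

trait Shape {
    fn area(&self) -> f64;
}

struct Square { side: f64 }
struct Circle { radius: f64 }

impl Shape for Square {
    fn area(&self) -> f64 { self.side * self.side }
}

impl Shape for Circle {
    fn area(&self) -> f64 { std::f64::consts::PI * self.radius * self.radius }
}

// Static dispatch: monomorphized per concrete type, no indirect call.
fn print_area_static<S: Shape>(shape: &S) {
    println!("{}", shape.area());
}

// Dynamic dispatch: an explicit opt-in to a vtable, like `virtual` in C++.
fn print_area_dyn(shape: &dyn Shape) {
    println!("{}", shape.area());
}

fn main() {
    let square = Square { side: 2.0 };
    let circle = Circle { radius: 1.0 };
    print_area_static(&square);
    print_area_dyn(&circle);
}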

However, just like OOP was considered essential to be taken seriously when C++ was in its heyday, nowadays, new programming languages are expected to be memory safe. Memory safety is no longer a feature for "novel" programming languages to experiment with, as Dr. Stroustrup, father of C++, unfortunately still thinks of it.

Memory safety comes with performance costs. Until Rust came along, it was widely assumed that it required garbage collection -- an unacceptable cost, one that is impossible to opt out of while still using the heap, and one that is generally paid not as-used but on a per-program basis. Rust, however, has managed to figure out a way to extend C++'s RAII and move semantics with the borrow checker and lifetimes (from Cyclone's "regions"), and make manual memory management with RAII safe.

There are still performance costs, but they are paid on an as-needed basis, and opt-out is still available. Unlike a Java or a Haskell, the goal isn't so much being a memory-safe programming language as having a memory-safe subset and encouraging memory-safe abstractions around unsafe code. Similarly to how C++ is distinct from other OOP languages in making virtual functions optional, Rust is distinct in having unsafe and raw pointers at all. If there were no unsafe keyword to guard these features and protect programmers from using them by accident, it would've disqualified Rust, in 2015, from being an application programming language at all.
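
Here is a small sketch of that pattern (the function is invented for the example, loosely modeled on the standard library's split_at_mut): the unsafe code is buried inside a function whose safe signature upholds the necessary invariants for every caller:

// `unsafe` is used internally, but the public, safe signature guarantees the
// two returned borrows never overlap.
fn first_and_rest(slice: &mut [u32]) -> Option<(&mut u32, &mut [u32])> {
    if slice.is_empty() {
        return None;
    }
    let len = slice.len();
    let ptr = slice.as_mut_ptr();
    // SAFETY: `ptr` points to at least `len >= 1` valid elements, and the two
    // resulting borrows are disjoint, so handing both out is sound.
    unsafe {
        Some((&mut *ptr, std::slice::from_raw_parts_mut(ptr.add(1), len - 1)))
    }
}

fn main() {
    let mut data = vec![1, 2, 3];
    if let Some((first, rest)) = first_and_rest(&mut data) {
        *first += 10;
        rest[0] += 100;
    }
    println!("{:?}", data); // [11, 102, 3]
}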

Protections for safety are now considered necessary for usable abstractions. I would say that C++ gets away with it because of its venerable, established position, but in fact it does not get away with it. No one uses C++ for new application-level programming, but rather only for systems programming where the performance is absolutely necessary.

This is because of safety. Many C++ programmers, including Dr. Stroustrup himself, don't realize that the rest of the world has already moved on to safe programming languages and it's no longer considered a novelty. Often, they think systems programming is the world rather than just a niche. But make no mistake, now that Rust has brought safe programming to this niche, it is only a matter of time.

But it still satisfies the zero-overhead principle, even though there is overhead. The overhead only applies when the feature is being used. If there is a line of code where the overhead is unacceptable, the solution is simple: use the unsafe keyword, and use a non-bounds-checked method (or an alternative data structure with raw pointers). It is advised that you wrap this unsafe code in a safe abstraction, but that is just advice. In the end, Rust only makes you pay for abstractions you're actually using, just like C++.
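
A minimal sketch of that escape hatch (the function names are mine, not from any library): indexing is bounds-checked and panics by default, and an unsafe block opts a hot loop out of the check when the programmer can prove it unnecessary:

// Safe by default: this indexing panics if the slice has fewer than 3 elements.
fn third_element(values: &[u64]) -> u64 {
    values[2]
}

// Opting out for a hot loop: the bounds check is hoisted into one assert, and
// the per-iteration check is skipped inside an `unsafe` block.
fn sum_first_n(values: &[u64], n: usize) -> u64 {
    assert!(n <= values.len());
    let mut total = 0;
    for i in 0..n {
        // SAFETY: `i < n <= values.len()`, so the access is in bounds.
        total += unsafe { *values.get_unchecked(i) };
    }
    total
}

fn main() {
    let values = [1, 2, 3, 4];
    println!("{}", third_element(&values));
    println!("{}", sum_first_n(&values, 4));
}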

Should I use C++ or Rust?

So C++ and Rust both share the same essential goals:

  • The C goal: Be portable while exposing the full power of assembly language.
  • The C++ goal, which implies the C goal: Have modern, high-level programming language features while still outputting code as good as what an assembly language programmer would write.

So if these goals appeal to you as a software developer, which programming language should you use, Rust or C++?

In my mind, that depends either on the non-essential goals, or else on the accidents of history and ecosystem.

If you have a large codebase already in C++, for example, then that might militate against switching. We can attribute this to C++'s goal of being compatible with previous versions of C++, a goal C++ has paid much for. Similarly, if there's a C++ library that turns out to be the perfect match, that may make your decision for you.

But to be clear, it's similar in the other direction if you already have a large Rust codebase -- there are just fewer people in that position. This will probably change over time, though. I think Rust's ecosystem is already competitive with C++'s.

I think discussing the goals is more interesting, however, especially in the long term.

If you need object-oriented programming, which is another goal of C++, then C++ might be your thing. I generally think object-oriented programming is overrated and Rust's way of handling abstraction to be both more powerful and less prone to problems, but many people disagree with me. And I must admit: the big use case everyone always mentions for OOP is GUI programming, and Rust's ecosystem is particularly behind in the GUI space.

However, if you're worried about memory corruption and the related security vulnerabilities, it might be nice to have a guarantee that only certain lines of code can cause such problems. It might be nice to have all those lines marked with a special keyword and conventionally scrutinized and abstracted in such a way as to help prevent these conditions.

And Rust's safety advantages go beyond simply delineating which features are safe and which are unsafe. Rust is able to accomplish much more in its safe subset without performance degradation than the average C++ programmer might guess, because it has a more sophisticated type system.

For a systems programming language, I think memory safety is more important than object-oriented programming and better GUI frameworks (for now). GUI apps, a long time ago, used to be written locally in C or C++ to run directly on the user's computer. Nowadays, they are more likely to be deployed over the web and written in JavaScript, and even those apps that do run directly on the user's computer tend to be written in other, less systems-oriented programming languages. If you're writing a GUI app, the choice for the part of the app that the user interacts with isn't between Rust and C++; it's between Rust, C++, C#, Java, JavaScript, and many others. Neither Rust nor C++ stands much of a chance in the GUI space long-term.

And over time, I suspect we'll find out that OOP isn't as necessary to GUI frameworks as we had thought. My favorite GUI framework personally is in Haskell and doesn't use OOP at all. And once that happens, I think OOP will simply be another legacy feature, as Rust's trait mechanism is superior to OOP for non-GUI contexts.

Memory safety, on the other hand, is key for systems programs. Servers, OS kernels, real-time systems, automobile controllers, cryptocurrency wallets -- the domains where systems programming tends to be used are also domains in which security vulnerabilities are absolutely unacceptable. The fact that C++ doesn't have a safe subset, and makes it so difficult to reason about undefined behavior compared to idiomatic Rust, is a serious problem.

But even if Rust didn't have a specific memory safety advantage over C++, it would still have quite a few things going for it. It avoids header files and all the concomitant confusion. It gets rid of null pointers in most contexts, called "the billion-dollar mistake" by the inventor of null references himself. It tidies up compile-time polymorphism and brings it in line with run-time polymorphism. It has actual destructive moves.
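
As a small illustration of the null-pointer point (the function is invented for the example): in Rust, "might be absent" is spelled out in the type as an Option, and the compiler forces the caller to handle the empty case before using the value:

// No null: absence is part of the return type.
fn find_even(values: &[i32]) -> Option<i32> {
    for &v in values {
        if v % 2 == 0 {
            return Some(v);
        }
    }
    None
}

fn main() {
    let values = [1, 3, 4];
    // The compiler will not let us use the result without handling `None`.
    match find_even(&values) {
        Some(v) => println!("found {}", v),
        None => println!("no even number"),
    }
}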

In general, Rust takes advantage of the fact that it doesn't have to also be C and also be old versions of C++, and uses it to create a much cleaner experience.

Rust Deficits

Rust has a few other specific downsides compared to C++.

Interfacing with C is an important goal for reasons besides backwards-compatibility. On many platforms, C serves as a lowest-common-denominator programming language, and its ABI serves as an inter-language protocol. C++ does provide smoother interfacing with this protocol than Rust does.

Relatedly, C++ generally has a relatively stable ABI on a given platform for a given compiler vendor. This allows dynamic libraries to be used as plugins with minimal glue code, something that in Rust normally requires awkwardly working through a C ABI interface. Personally, I think machine-language plugins as dynamically loaded libraries are mostly a relic of past software distribution models, and haven't seen many situations where they make sense, but I could think of a few edge cases.

In both of these cases, Rust is clumsier, but not completely incapable. Rust still can speak the protocol that is the C ABI, just not as natively and smoothly-integrated as C++.

Other downsides of Rust have to do with network effects and Rust adoption. There is only one Rust compiler, while there are multiple C++ compilers that work together through a standards process. GCC is currently in the process of gaining Rust support, and we'll see how well that works out for Rust.

Similarly, there are a lot of libraries that exist for C++ that don't yet exist for Rust or have Rust bindings. Though that's true of any pair of programming languages, it is a specific reason some developers might still want to write new projects in C++ instead of Rust.

Finally, while I still think Rust would be a better programming language than C++ even if unsafe code were allowed everywhere, I think Rust could do more to make its rules clearer in the unsafe realm. The fact that the latest research on Rust's memory models seems so deeply difficult to square with how async code often works in practice, as in this bug report, makes me nervous.

I'm sure there are other ways in which Rust is behind C++, and the devil is as always in the details.

This Book

But enough about the exceptions! Every thesis worth writing a book about has caveats! Let's get back to the thesis of this book.

The groundwork for the thesis of this book is laid out above in this introduction. It is as follows:

Rust accomplishes the essential goals of C++ and keeps the good ideas, while eliminating the cruft, which has far-reaching benefits over remaining compatible with C++.

This book is all about the details and specific examples of this thesis. It covers a number of topics, from the details of how Rust expands RAII to make it safe, to how it makes move semantics less confusing, all the way to clarifications of what safety means in practice.

It originated as a collection of blog posts on my blog, The Coded Message, and it's an on-going project. New sections will continue to also be posted as blog posts, though revisions to the existing sections will take place sporadically with little fanfare. Please make issues and MRs on the git repo if you see any mistakes, or would otherwise like to contribute thoughts, criticisms, or additional content.

This book is more a persuasive document than an instructional document. It's a work of apologetics, explaining in detail, topic by topic, why Rust is good at these goals it shares with C++, and why a new programming language was necessary to achieve them more effectively.

Like most books of apologetics, it's nominally aimed at the skeptics, in this case the C++ developers who don't like Rust. But only nominally. It will be far more interesting for the seekers and the proselytes: Those who are interested in looking into Rust, or who have started using Rust, but aren't fully sure of its benefits over C++, or whether it can be truly used in as many ways as C++ can, or whether it can truly be as high-performance.

We've known for some time that C++ was hampered by its C legacy. Today, modern C++ is also hampered by the legacy of pre-modern C++.

Bjarne Stroustrup once famously said:

Within C++, there is a much smaller and cleaner language struggling to get out.

I'm sure Bjarne Stroustrup has already said that this quote was not about Rust, just as a long time ago he said that it wasn't about C# or Java. But I think it resonated with so many people because we all know how much cruft and complexity has accrued in C++. And the fact that the quote resonated so much I think says more about C++ than whatever Bjarne's original intentions were with these specific words.

And so, even though Bjarne explicitly said otherwise, I think Java and C# were efforts to extract this smaller and cleaner language -- one that had C++-style object-oriented programming without the other bits that they considered the "cruft." And for a substantial slice of C++ programmers, who didn't need control of memory and could afford garbage collection, this is exactly what the doctor ordered: many use cases of C# and Java today would've previously been filled by C++.

Remember, C++ used to be a general-purpose programming language. Now, it's a niche systems programming language. It has been edged out of other niches in large part by "C-like programming languages" that take what they like from it and leave the rest, like Java and C#.

And I think Rust is finishing the job. It is a similar effort, but with a different opinion about which bits constitute the "cruft." Zero-cost abstraction remains a goal, but C compatibility does not. One of the goals of this book is to convince you that this is the right decision for a systems programming language.

Which brings me to this book's secondary thesis:

With a few notable exceptions, between Rust and C++, Rust is the better language to start a new project in. If you were going to write something new in C++, I think you should almost always use Rust instead, or else another programming language if it's outside of Rust/C++'s core niche.

This is a corollary of this book's primary thesis: if Rust has the same goals as C++, and accomplishes them just as well with fewer downsides and fewer costs, why would you use C++? In my opinion, the exceptions are already very limited, and will only decrease with time, until C++ is only appropriate for -- and ultimately only used for -- legacy projects.

Again, C++ is one of the giants on whose shoulders Rust stands. Without C++, Rust would've been impossible, just like Java and C# would've been impossible. But sometimes a clean, breaking re-design is required, and Rust provides exactly that.

RAII: GC without GC

I don't want you to think of me as a hater of C++. In spite of the fact that this book itself is a comparison between Rust and C++ in Rust's favor, I am very aware that Rust as it exists would never have been possible without C++. Like all new technology and science, Rust stands on the shoulders of giants, and many of those giants contributed to C++.

And this makes sense if you think about it. Rust and C++ have very similar goals. The C++ community has done a lot over all these years to pioneer new programming language features in line with those goals. C++ has then given these features years to mature in its humongous ecosystem. And because Rust also doesn't have to be compatible with C++, it can then steal those features without some of the caveats they come with in C++.

One of the biggest such features -- perhaps the biggest one -- is RAII, C++'s and now Rust's (somewhat oddly-named) scope-based feature for resource management. And while RAII is for managing all kinds of resources, its biggest use case is as part of a compile-time alternative to run-time garbage collection and reference counting.

As an alternative to garbage collection, RAII has deficits. While many allocations are created and freed neatly in line with variables coming in and out of scope, sometimes that's not possible. To fully compete with garbage collection and capture the diverse ways programs use the heap, RAII needs to be combined with other features.

And C++ has done a lot of this. C++ added move semantics in C++11, which Rust also has -- though cleaner in Rust because Rust was designed with them from the start and so it can pull off destructive moves. C++ also has opt-in reference counting, which, again, Rust also has.

But C++ still doesn't have lifetimes (Rust got that from Cyclone, which called them "regions"), nor the infamous borrow checker that goes along with them in Rust. And even though the borrow checker is perhaps the most hated part of Rust, in this post, I will argue that it brings Rust's RAII-centric compile-time memory management system much closer to feature-parity with run-time reference counting and other run-time garbage-collection technologies.

I will start by talking about the problem that RAII was originally designed to solve. Then, I will re-hash the basics of how RAII works, and work through memory usage patterns where RAII needs to be combined with these other features, especially the borrow checker. Finally, I will discuss the downsides of these memory management techniques, especially performance implications and handling of cyclic data structures.

But before I get into the weeds, I have some important caveats:

Caveat: No Turing-complete programming language can completely prevent memory leaks. Even in fully-GC'd languages, you can still leak memory by filling up a data structure with increasing amounts of unnecessary data. This can be done by accident, especially when sophisticated callback systems are combined with closures. This is out of the scope of this post, which only concerns memory management issues that automated GC can actually help with.

Caveat #2: Rust allows you to leak memory on purpose, even when a garbage collector would have reclaimed it. In extreme circumstances, the reference counting system can be abused to leak memory as well. This fact has been used in anti-Rust rhetoric to imply its memory safety system is somehow worthless.

For the purposes of this post, we assume a programmer who is trying to get actual work done and needs help not leaking memory or causing memory corruption, not an adversarial programmer trying to make the system leak on purpose.
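
For concreteness, here is a minimal sketch of the kind of deliberate leak meant in Caveat #2 -- entirely safe Rust, no memory corruption, just allocations that are intentionally never freed:

fn main() {
    // `Box::leak` gives up ownership and hands back a `'static` reference, so
    // the destructor never runs and the allocation is never freed.
    let leaked: &'static mut String = Box::leak(Box::new(String::from("leaked on purpose")));
    println!("{}", leaked);

    // `std::mem::forget` similarly skips a value's destructor entirely.
    let buffer = vec![1, 2, 3];
    std::mem::forget(buffer);
}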

Caveat #3: RAII is a terrible name. OBRM (Ownership-Based Resource Management) is used in Rust sometimes, and is a much better name. I call it RAII in this article though, because that's what most people call it, even in Rust.

The Problem: Manual Memory Management is Hard, GC is "Slow"

So. C-style manual memory management -- "just call free when you're done with the allocation" -- is error prone.

It is error prone when it is easy and tedious, because programmers can make stupid mistakes and just forget to write free and it isn’t immediately broken. It is error prone when multiple programmers work together, because they might make different assumptions about who is supposed to free something. It is error prone when multiple parts of the code need to use the same data, especially when that usage changes with new requirements and new features.

And the consequences of doing it wrong are not just memory leaks. Use-after-free can lead to memory corruption, and bugs in one part of the program can abruptly show up when allocation patterns change somewhere else entirely.

This is a problem that can be solved with discipline, but like many tedious clerical disciplines, it can also be solved by computer.

It can be solved at run-time, which is what garbage collection and reference counting do. These systems do two things:

  • They keep allocations from lasting too long. When memory becomes unreachable, it can be reclaimed. This prevents memory leaks.
  • They keep allocations from being freed early. If memory is still reachable, it will still be valid. This prevents memory corruption.

And for most programmers and applications, this is good enough. And so for almost all modern programming languages, this run-time cost is well worth not troubling the programmer with the error-prone tedious tasks of C-style manual memory management, enabling memory safety and resource efficiency at the same time.

Caveat: To be clear, "slow" here is an oversimplification, and I address that more later. I mean it as a tongue-in-cheek way of saying that it has performance costs, whereas Rust and C++ try to adhere to a zero-cost principle.

GC (including RC) Has Costs

But there are costs to having the computer do memory management at run-time.

I lump mark-sweep garbage collection and reference counting together here. Both mark-sweep garbage collection and reference counting have costs above C-style manual memory management that make them unacceptable according to the zero-cost principle. GC comes with pauses, and additional threads, in the best case. RC comes with myriad increments and decrements to a reference count. These costs might be small enough to be okay for your application -- and that's well and good -- but they are costs, and therefore they can't be the main memory management model in C++ or Rust.

This is a complicated issue, and so before continuing, here comes another caveat:

Caveat: GC is not necessarily slower, but it does have performance implications that are often unacceptable for situations where C++ (or Rust) is used. To achieve its full performance, it needs to be enabled for the entire heap, and that has costs associated with it. For these reasons, C++ and Rust do not use GC. The details of these performance trade-offs are beyond the scope of this blog post.

A Dilemma

But C++ and Rust are not most programming languages. They face a dilemma:

  • On the one hand, manual memory management is unacceptably error prone for a high level language, a detail the computer should be able to handle for you.
  • On the other hand, run-time garbage collection violates a fundamental goal that C++ and Rust share: the zero-cost principle. Code written in these languages is supposed to be as performant as the equivalent manually-written C. To conform to that principle, reference counting (or GC) has to be opt-in (because, after all, sometimes manually written C code does use these technologies).

So, for the vast majority of situations, where a C programmer wouldn't use reference counting (or mark-sweep), Rust and C++ need something more sophisticated. They need tools to prevent memory management mistakes -- that is, to at least partially automate this tedious and error-prone task -- without sacrificing any run-time performance.

And this is the reason C++ invented (and Rust appropriated) RAII. Instead of addressing the problem at run-time, RAII automates memory management at compile-time. Analogous to how templates and trait monomorphization can bring some but not all of the power of polymorphism without many of the run-time costs, RAII brings some but not all of the power of garbage collection without constant reference count updates or GC pauses.

But as we will see, RAII as C++ implements it only solves one of the two problems addressed by garbage collection: leaks. It cannot address memory corruption; it cannot keep allocations alive long enough for all the code that could possibly need to use them.

Raw RAII: How RAII Works on its Own

The simplest use case for RAII is underwhelming: it automatically inserts calls to free up heap allocations at the end of the block where we made the allocation. It replaces a malloc/free sandwich from C with simply the allocation side, by inserting an implicit (and unwritten) call to a destructor, which in its simplest version is an equivalent of free. And if that was all RAII did, it wouldn't be that interesting.

For example, take this C-style (no RAII) code:

void print_int_little_endian_decimal(int foo) {
    // Little endian decimal print of `foo`
    // i.e. backwards from how we normally write decimal numbers
    // e.g. 831 prints out as "138"

    // Big endian would be too hard
    // Little endian is as always actually simpler platonically,
    // if somehow not for humans.

    // Yes, this only works for positive ints. It's an example.

    char *buffer = malloc(11);
    for(char *it = buffer; it < buffer + 10; ++it) {
        *it = '0' + foo % 10;
        foo /= 10;
        if (foo == 0) {
            it[1] = '\0';
            break;
        }
    }
    puts(buffer); // put-string, not the 3sg verb form "puts"
    free(buffer); // Don't forget to do this!
}

Just using RAII (and unique_ptrs, which are an essential part of the RAII model), but using no other features of C++, we get this very unidiomatic and unimpressive version:

void print_int_little_endian_decimal(int foo) {
    std::unique_ptr<char[]> buffer{new char[11]};
    for(char *it = &buffer[0]; it < &buffer[10]; ++it) {
        *it = '0' + foo % 10;
        foo /= 10;
        if (foo == 0) {
            it[1] = '\0';
            break;
        }
    }
    puts(&buffer[0]);
}

It doesn't help us with our random guess of an appropriate buffer size, our awkward redundant attempts to avoid a buffer-overflow, or with any abstraction over the fact that we're trying to implement a collection.

In fact, it makes the code more awkward, for a benefit that seems hardly worth it: automatically calling free at the end of the block -- which might not even be where we want to call free! We might instead have wanted to return the data to the caller, or insert it into a bigger, greater data structure, or similar.

It's a bit less ugly when you use C++'s abstractions. Destructors don't have to just call free (or rather its C++ analogue delete) as unique_ptr's does. Any C programmer can tell you that idiomatic C code is rife with custom free functions that free all of the allocations of a data structure, and C++ (and Rust) will choose which destructor to call for you based on the type of the data. Calling plain free when a custom free function must be called is a common careless mistake in C. This is true especially among beginners, and (hot take!) making programming languages less needlessly tricky for beginners is a good thing for everybody.
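
A small sketch of the same idea in Rust (the type is invented for illustration): the compiler picks the destructor -- the Drop implementation -- based on the type, and calls it for you at the end of the owning scope:

struct TempFile {
    path: String,
}

// A custom destructor: chosen by type, called automatically, impossible to
// forget and impossible to confuse with a plain `free`.
impl Drop for TempFile {
    fn drop(&mut self) {
        println!("cleaning up {}", self.path);
        // e.g. std::fs::remove_file(&self.path).ok();
    }
}

fn main() {
    let file = TempFile { path: String::from("/tmp/example.txt") };
    println!("using {}", file.path);
} // `file` goes out of scope here; its Drop impl (and then String's) run automatically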

We can combine RAII with other features of C++ to get this more idiomatic code, with the first do-while loop I've written in years:

void print_int_little_endian_decimal(int foo) {
    std::string res;
    do {
        res += '0' + foo % 10;
        foo /= 10;
    } while (foo != 0);
    std::cout << res << std::endl;
}

Does std::string allocate memory on the heap? Maybe it only does if the string goes above a certain size. But the custom destructor, ~std::string, will call delete[] only when the allocation was actually made, abstracting that question away, along with handling terminating nuls and avoiding overruns in a cleaner way.

This ability of RAII -- to call custom destructors that abstract away allocation decisions -- gets more impressive when we consider that many data structures don't make just 0 or 1 heap allocations, but whole complicated trees of complicated heap allocations. In many cases, C++ (and Rust) will write your destructors for you, even for complicated types like this:

struct PersonRecord {
    std::string name;
    uint64_t salary;
};

std::unordered_map<std::string, std::vector<PersonRecord>> thing;

To destroy thing in C, you'd have to loop through the hash map, free all the keys, and then free all the values, which then requires freeing all the strings in each PersonRecord before freeing the backing for each vector. Only then could you free the actual allocations backing the hash map.

And perhaps a C-based hash map library could do this for you, but only by assuming that the keys are strings, and then taking a function pointer to know how to free the values, which would ironically be a form of dynamic polymorphism and therefore a performance hit. And the function to free the values would then still have to manually free the string, knowing which field of the PersonRecord was a pointer and duplicating that information between the structure and the manually-written "free" function, and still likely not supporting the small-string optimization that C++ enables.

In C++, this freeing code is all automatically generated. PersonRecord gets an automatic destructor that calls the destructor of each field (uint64_t's destructor is trivial), and the destructors of std::unordered_map and std::vector are templated so that, at compile time, a fresh destructor is built from those templates that handles all of this, all without any indirect function calls or run-time cost beyond what would be manually written for exactly this data structure in C.

See, with RAII, a destructor isn't just automatically and implicitly called at the end of a scope in a function, but also in the destructors of values ("objects" in C++) that own other values. Even if you do write a custom destructor for aggregate types, that just specifies what the computer should do on destruction beyond the automatic calls to the destructors of the fields, which are still implicit.
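
For comparison, here is a sketch of the equivalent structure in Rust: the destructor for the whole nested structure is likewise generated at compile time, field by field and element by element, with no indirect calls:

use std::collections::HashMap;

struct PersonRecord {
    name: String,
    salary: u64,
}

fn main() {
    let mut thing: HashMap<String, Vec<PersonRecord>> = HashMap::new();
    thing
        .entry(String::from("engineering"))
        .or_default()
        .push(PersonRecord { name: String::from("Ada"), salary: 100_000 });
    println!("{} earns {}", thing["engineering"][0].name, thing["engineering"][0].salary);
    // `thing` is dropped here: every Vec, PersonRecord, and String it owns is
    // freed by compiler-generated destructor code, just as in the C++ version.
}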

Ownership and its limitations

This is all possible based on the concept of "ownership," one of the key principles of RAII. The key assumption is that every allocation has one owner at any given time. Allocations can own each other (forming a tree of allocations), or a scope can own an allocation (forming the root of such a tree). RAII then can make sure the allocation ends when its owner does -- by the scope exiting, or when the owning object is destroyed.

But what if the allocation needs to outlive its parent, or its scope? It's not always the case that a function has primitive types as its arguments and return value, and then only constructs trees of allocations privately. We need to take these sophisticated collections and pass them as arguments to functions. We need to have them be returned from functions.

This becomes apparent if we try to refactor our little-endian integer decimalizer to allow us to do other things with the resultant string besides print it:

std::string render_int_little_endian_decimal(int foo) {
    std::string res;
    do {
        res += '0' + foo % 10;
        foo /= 10;
    } while (foo != 0);
    return res;
}

int main() {
    std::cout << render_int_little_endian_decimal(3781) << std::endl;
    return 0;
}

Based on our previous discussion of RAII, you might assume that the ~std::string destructor is called at the end of its scope, rendering the allocation unusable for later printing, but instead this code "Just Works."

We've hit one of many mitigations against the limitations of raw RAII that are necessary for it to work. This mitigation is the "Named Return Value Optimization (NRVO)," which stipulates that if a named variable is used in all of the return statements in a function, it is actually constructed (and destructed) in the context of the caller. It is misnamed an "optimization" because it's actually part of the semantics: It eliminates entirely the call to the destructor at the end of the scope, even if that destructor call would have side effects.

This is just one of many ways RAII is made competitive with run-time garbage collection, and we can have values that live outside of a certain scope of a function. This one is narrow and peculiar to C++, but many of the others lead to interesting comparisons. In the next section, we discuss the others.

Filling the Gaps in RAII

Copying/Cloning

We're going to start with one of the oldest of these: copying. When C++ was designed, the intention was that the programmer would not see a difference between types that don't involve allocation (like int or double) and types that do (like std::string or std::unordered_map<std::string, std::vector<std::string>>).

When a function takes an int argument, as in print_int_little_endian_decimal, that integer is copied. Similarly, if we take a std::string argument without additional annotation, C++ will also make a copy:

int parse_int_le(std::string foo) {
    int res = 0;
    int pos = 1;
    for (char c: foo) {
        res += (c - '0') * pos; // No input validation -- example!
        pos *= 10;
    }
    return res;
}

int main(int argc, char **argv) {
    std::string s = argv[1];
    std::cout << parse_int_le(s) << std::endl;
    return 0;
}

This is indeed consistent. Treating ints and std::string objects in parallel ways is also in line with how higher-level programming languages sometimes work: a string is a value, an int is a value, why not give them the same semantics? Aliasing is confusing, why not avoid it with copying?

It's made to work by an implicit function call. Just like destructor calls are implicit in C++, copying also calls a function in the type's implementation. Here, it calls std::string's "copy constructor."

The problem here is that this is slow. Not only is an unnecessary copy made, but an unnecessary allocation and deallocation creep in. There is no reason not to use the same allocation the caller already has, here in s from the main function. A C programmer would never write this copying version.

The only reason this feature is allowed under C++'s zero-cost principle is because it is optional. It may be the default -- and making it the default is one of the most questionable decisions C++ ever made -- but we can still alias if we want to. It just takes more work.

Rust, as you can guess by my tone, requires explicit annotation to copy types that have an allocation. In fact, Rust doesn't even use the term "copy," which is reserved for types that can be copied without allocations. It calls this cloning, and requires use of the clone() method to accomplish it.

Some types don't use an allocation, and "copying" them is just a simple memory copy. Some types do use an allocation, and "cloning" them requires allocating. This distinction is important and fundamental to how computers work. It's relevant and visible in Java and even Python, and pretending it doesn't exist is unbecoming for a systems programming language like C++.
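
A brief sketch of that distinction (the struct is invented for the example): plain-memory types opt into Copy and are duplicated implicitly, while allocating types like String must be cloned explicitly:

#[derive(Clone, Copy)]
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    // `Copy` types are duplicated by a plain memory copy; both remain usable.
    let a = Point { x: 1, y: 2 };
    let b = a;
    println!("{} {}", a.x, b.y);

    // `String` owns a heap allocation, so duplicating it takes an explicit
    // `.clone()`; passing it by value would otherwise move it, not copy it.
    let s = String::from("hello");
    let t = s.clone();
    println!("{} {}", s, t);
}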

Moves

Returning an allocation from a function can't always use NRVO. So if you want your value to outlast your function, but it's created inside the function (and therefore "owned" by the function scope), what you really need is a way for the value to change owners. You need to be able to move the value from the scope into the caller's scope. Similarly, if you have a value in a vector, and need to remove the last value, you can move it.

This is distinct from copying, because, well, no copy is made -- the allocation just stays the same. The allocation is "moved" because the previous scope no longer has responsibility for destroying the allocation, and the new scope gains the responsibility.

Move semantics fix the most serious issue with RAII: your allocation might not live exactly as long as its owner. The root of an allocation tree might outlive the stack-based scope it's in, such as when you want to return a collection from a function. The other nodes of an allocation tree might leave that tree and be owned by another stack frame, or by another part of the same allocation tree, or by a different allocation tree. In general, "each allocation has a unique owner" becomes "each allocation has a unique owner at any given time," which is much more flexible.

In Rust, this is done via "destructive moves," which oddly enough means not calling the destructor on the moved-from value. In fact, the moved-from value ceases to be a value when it's moved from, and accessing that variable is no longer permitted. The destructor is then called as normal in the place where the value is moved to. This is tracked statically at compile-time in the vast majority of situations, and when it cannot be, an extra boolean is inserted as a "drop flag" ("drop" is how Rust refers to its destructors).
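
A minimal sketch of what "destructive" means in practice (the function is invented for the example): after a move, the original variable simply cannot be used anymore, and the compiler enforces that:

fn consume(s: String) {
    println!("consumed: {}", s);
} // `s` is dropped here, exactly once, in its new owner's scope

fn main() {
    let greeting = String::from("hello");
    consume(greeting); // ownership moves into `consume`; no copy, no new allocation

    // The moved-from variable is no longer a live value; there is no hidden
    // "empty" state to observe. Uncommenting the next line is a compile error:
    // println!("{}", greeting); // error[E0382]: borrow of moved value: `greeting`
}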

C++ didn't add move semantics until C++11; they were not part of the original RAII scheme. This is surprising given how essential moves are to RAII. Returning collections from functions is super important, and you can't copy every time. But before C++11, there were only poor man's special cases for moves, like NRVO and the related RVO for objects constructed in the return statement itself. These have completely different semantics from C++ move semantics -- and they're still more efficient than C++ moves in many cases.

When C++ did eventually add moves, the other established semantics of C++ forced it to add moves in a weird and deeply confusing way: it added "non-destructive" moves. In C++, rather than the drop flag being a flag inserted by the compiler, it is internal to the value. Every type that supports moves must have a special "empty state," because the destructor is called on the moved-from value. If the allocation had moved to another value, there would be no allocation to free, and this had to be handled by the destructor at run-time, which can amount to a violation of the zero-cost principle in some situations.

C++ justifies this by making moves a special case of copying. Moves are said to be like copies, but with no promise of preserving the initial value. In exchange, you might get the optimization of being able to reuse the original allocation, but then the initial value will not have an allocation, and will be forced to be different. This definition is very different from what moves are actually used for (cf. the name of the operation), and therefore, even though it is technically simple, claiming that focusing on that definition (as Herb Sutter does) will simplify things for the programmer is disingenuous, as I discuss in more detail in the next chapter on moves specifically.

In practice, this means that all types support the operation of moving -- even ints -- but even some types that manage an allocation might fall back on copying if moves haven't been implemented for them. This inconsistency, like all inconsistencies, is bad for programmers.

In practice, this also means that moved-from objects are a problem. A moved-from object might stay the same, if no actual moving was done. It might also change in value, if the move caused an allocation (or other resource) to move into the new object. This forces C++ smart pointers to choose between movability and non-nullability -- no movable, non-nullable pointer is possible in C++. Nulls -- and the other "moved-from" empty collections that you get from C++ move semantics -- can then be referenced later on in the function, and though they must be "valid" values of the type, they are probably not the values you expect, and in the case of null pointers, they are famously difficult values to reason about.

This is a consequence of the fact that C++ was a pioneer of RAII semantics, and didn't design RAII and moves together from the start. Rust has the advantage of having included moves from the beginning, and so Rust move semantics are much cleaner.

In Rust, too, all types can be moved, but no resources or allocations are ever duplicated by a move. Moves always have the same implementation: copy the memory that is stored in-line in the value itself, and then never treat the moved-from variable as a live value again, so its destructor is not called. For types that manage an allocation or other resource, the pointer or handle is simply carried along bit-by-bit just like the rest of the data, and the old value is never touched again, making this a safe operation. (Simple types like i32 that manage no resource can additionally implement Copy, in which case assignment really is a copy and the original stays usable -- but even then, nothing is ever duplicated except plain bits.)

All types must then be written in such a way to assume that values might not stay in the same place in memory. If some operations on a type can't be written that way, they can be defined on "pinned" versions of that type. A pin is a type of reference or box that promises that the pointed-to value will never move again. The underlying type is still movable, but these particular values are not.

This is a gnarly exception to Rust's "all types can be moved" rule that makes it false in practice, though still true in pedantic, language-lawyery theory. But that's not important. What is important is that Rust's move semantics are consistent, and do not rely on move constructors or manual implementations of Rust's drop flags within the object. The dangerous possibility of interacting with a moved-from object, whose value is unpredictable and quite possibly a special "empty" state like null, is not present in Rust.

Borrows in Rust

While moves cover returning a collection (or other resource-managing value) from a function, they don't cover passing such a value into a function, or at least not in the general case. Sometimes, when we pass a value into a function, we want to move the value in, so that the function can consume it or add it to an allocation tree (like inserting into a collection). But most times, we want the function to be able to see and perhaps mutate it, but then we want to give it back to the owner.

Enter the borrow.

In Rust, borrows are commonly introduced as a sort of an improvement on moves. Consider our example function that parses a string to an int, here implemented in C++ with copies:

int parse_int_le(std::string foo) {
    int res = 0;
    int pos = 1;
    for (char c: foo) {
        res += (c - '0') * pos; // No input validation -- example!
        pos *= 10;
    }
    return res;
}

Here is a Rust version, with moves, so that the function consumes the string:

use std::env::args;

fn parse_int_le(foo: String) -> u32 {
    let mut res = 0;
    let mut pos = 1;
    for c in foo.chars() {
        res += (c as u32 - '0' as u32) * pos;
        pos *= 10;
    }
    res
}

fn main() {
    let mut args: Vec<String> = args().collect();
    println!("{}", parse_int_le(args.remove(1)));
}

As we can see with this "move" version, we are in the awkward position of removing the string from the vector so that parse_int_le can consume it, so it doesn't have multiple owners.

But parse_int_le doesn't need to own the string. In fact, it could be written so that it can give the string back when it's done:

fn parse_int_le(foo: String) -> (u32, String) {
    let mut res = 0;
    let mut pos = 1;
    for c in foo.chars() {
        res += (c as u32 - '0' as u32) * pos;
        pos *= 10;
    }
    (res, foo)
}

"Taking temporary ownership" in real life is also known as borrowing, and Rust has such a feature built-in. It is more powerful than the above code that literally takes temporary ownership, though. That code would have to remove the string from the vector and then put it back -- which is even more inefficient than just removing it. Rust borrowing allows you to borrow it even while it's inside the vector, and stays inside the vector. This is implemented by a Rust reference, which has this borrowing semantics, and is, like most "references," implemented as a pointer at the machine level.

In order to accomplish these semantics, Rust has its infamous borrow checker. While we are borrowing something inside the vector, we can't simultaneously be mutating the vector, which could cause the thing we're borrowing to move. Rust statically ensures that this is impossible, rejecting code that uses a reference after a mutation, destruction, or move elsewhere would have invalidated it.

This enables us to extend the RAII-based system and both prevent leaks and maintain safety, just like a GC or RC-based system. The borrow checker is essential to doing so.

For completeness, here is the idiomatic way to handle the parameter in parse_int_le, with an actual borrow, using &str, the special borrowed form of String that also allows slices:

use std::env::args;

fn parse_int_le(foo: &str) -> u32 {
    let mut res = 0;
    let mut pos = 1;
    for c in foo.chars() {
        res += (c as u32 - '0' as u32) * pos;
        pos *= 10;
    }
    res
}

fn main() {
    let args: Vec<String> = args().collect();
    println!("{}", parse_int_le(&args[1]));
}

Dodging memory safety in C++

In C++, of course, there is no borrow checker. In the parse_int_le example, it's still possible to use a pointer, or a reference, but then you're on your own. When RAII-based code frees your allocation, your reference is invalidated, which means it's undefined behavior to use it. No coordination is performed by the compiler between the RAII/move system and your references, which point into the ownership tree with no guarantee that said tree won't shift underneath them. This can lead to memory corruption bugs, with security implications.

It's not just pointers and references. Other types that contain references, such as iterators, can also be invalidated. Sometimes those are more insidious because intermediate C++ programmers might know about pointer invalidation, but let their guard down with iterators. If you add to a vector while looping through it, you've just done undefined behavior, and that's surprising because no pointers or references even have to show up. Rust's borrow checker handles these as well.
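
To make this concrete, here is a small example of my own (it is not drawn from any particular codebase): the direct Rust translation of "add to a vector while looping through it" simply does not compile, because the loop's iterator holds a borrow of the vector.

fn main() {
    let mut v = vec![1, 2, 3];
    for x in &v {       // iterating immutably borrows `v` for the whole loop
        v.push(*x);     // ERROR: cannot borrow `v` as mutable because it is
                        // also borrowed as immutable by the iterator
    }
}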

Even though the Rust borrow checker gets a bad reputation, its safety guarantees often make it worth it. It's hard to write correct C++ when references and non-owning pointers are involved. Maybe some of you have that skill, and are unsympathetic to those who don't yet have it, but it is a specialized skill, and the compiler can do a lot of the work for you, by checking your work. Automation is a good thing, and so is making systems programming more accessible to beginners.

And of course, many C++ programmers do make mistakes. Even if it's not you, it might be one of your colleagues, and then you'll have to clean up the mess. Rust addresses this, and limits this more difficult mode of thinking to writing unsafe code, which can be contained in modules.

Multiple Ownership

In RAII, an allocation has one owner at a time, and if that owner is destroyed before the allocation is moved to another owner, the allocation must be destroyed along with it.

Of course, sometimes this isn't how your allocations work. Sometimes they need to live until both of two parent allocations are destroyed, and sometimes there is no way to predict which parent is destroyed first. Sometimes, the only way to solve that situation -- even in C -- is to use runtime information, and so you model multiple ownership through reference counting: std::shared_ptr in C++, or Rc and Arc in Rust (depending on whether the value is shared between multiple threads).

This is something that C programmers will sometimes do in the face of complicated allocation DAGs, and end up implementing it bespoke on a framework-by-framework basis (cf. GTK+ and other C GUI frameworks). C++ and Rust simply standardize the implementation, while, in line with the zero-cost rule, keeping it optional.

Interestingly enough, reference counting is implemented in terms of RAII and moves. The destructor for a reference-counted pointer decreases the reference count, and cloning/copying such a pointer increases it. Moves, of course, don't change it at all.
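
Here is a minimal sketch of what that looks like in Rust; the variable names are mine and purely illustrative:

use std::rc::Rc;

fn main() {
    let first = Rc::new(String::from("shared"));
    println!("{}", Rc::strong_count(&first)); // 1

    let second = Rc::clone(&first); // cloning the handle increments the count
    println!("{}", Rc::strong_count(&first)); // 2

    let third = second; // a move: the count does not change at all
    drop(third);        // RAII: the destructor decrements the count
    println!("{}", Rc::strong_count(&first)); // 1
} // the last handle is dropped here, and the String is freed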

RAII+: What this all adds up to

Between RAII, moves, reference counting, and the borrow checker, we now have the memory management system of safe Rust. Safe Rust is a powerful programming language, and in it, you can write programs almost as easily as in a traditionally GC'd programming language like Java, but get the performance of manually written, manually memory managed C.

The cost is annotation. In Java, there is no distinction between "borrowing" and "owning", even though sometimes the code follows similar structures as if there were. In Rust, the compiler must be informed about the chain of owners, and about borrowers. Every time an allocation crosses scope boundaries or is referred to inside another allocation, you must write different syntax to tell Rust whether it's a move or a borrow, and it must comply with the rules of the borrow checker.

But it turns out most code has a natural progression of owners, and most borrows pass the borrow checker. When they don't, it's usually straightforward to rethink the code so that it can work that way, and the resulting code is usually cleaner anyway. And in situations where neither works, reference counting is still an option.

At the cost of this annotation, Rust gives you everything a GC does: allocations are freed when their handles go out of scope, and memory safety is still guaranteed, because the annotations are checked -- which is most of the benefit of automating them. Memory leaks are as difficult to write as in a reference-counting language. It's an excellent happy medium between manual memory management and full run-time GC, with no run-time cost over a certain discipline of C memory management.

Of course, other disciplines of C memory management are possible. And using this Rust system takes away flexibility that might be relevant to performance. Rust, like C++, allows you to sidestep the "compile-time GC" and use raw pointers, and that can often be better for performance. A recent blog post I read explores some of that in more detail; encouragingly, that blog post also considers RAII to be in-between manual memory management and run-time GC -- serendipitously, because I had already drafted much of this post when it came out.

But the standard memory management tools of Rust cover the common cases well, and unsafe is available for when it's inappropriate -- and can be wrapped in abstractions for interfacing with code that uses the RAII-based system.

In C++, the equivalent distinction between "borrows" and "moves" exists only by convention, and getting it wrong can easily result in undefined behavior. Leaks are prevented, but memory corruption is not. So the C++ system is a much worse replacement for garbage collection -- RAII is only doing some of its job, as it is not paired with a borrow checker.

Cycles

I leave the most awkward topic for the end. We've talked about allocation trees and DAGs, but not general graphs. These require unsafe in Rust, even something as supposedly basic as doubly linked lists. It's against the borrow checker's rules, and the compiler will statically prevent you from making them using safe, borrowing references. They simply aren't borrows in the Rust sense, but are rather something else, something about which Rust doesn't know how to guarantee safety.

This is not as bad as you might think, because cycles are also a hole in reference counting, which is a popular run-time GC strategy. This is why you can't use Rc or Arc to implement a doubly-linked list correctly in Rust either: you'll get past the borrow checker, but you'll guarantee a memory leak. Reference-counting systems generally can't detect cycles at all, and leak them, which is arguably worse than forbidding them to be created.
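
As a small illustration of my own -- a two-node cycle rather than a full doubly-linked list -- two Rc nodes that point at each other keep each other's counts above zero forever, so neither destructor ever runs:

use std::cell::RefCell;
use std::rc::Rc;

struct Node {
    other: RefCell<Option<Rc<Node>>>,
}

impl Drop for Node {
    fn drop(&mut self) {
        println!("node dropped"); // never printed in this program
    }
}

fn main() {
    let a = Rc::new(Node { other: RefCell::new(None) });
    let b = Rc::new(Node { other: RefCell::new(None) });
    *a.other.borrow_mut() = Some(Rc::clone(&b));
    *b.other.borrow_mut() = Some(Rc::clone(&a));
    // When `a` and `b` go out of scope, each node's count only drops from
    // 2 to 1, never to 0: the nodes leak, safely but permanently.
}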

In any case, the unsafe keyword is not poison. For things that Rust doesn't know how to keep safe, you need to exercise extra responsibility, but at least the programming language is making you aware of it -- unlike C++, which is unsafe all the time.

Moves

As we discussed in the previous chapter, moves are an essential part of an RAII-based system of memory management, allowing RAII-controlled types to have multiple owners in the course of their lifetime. In this chapter, we discuss moves outside of that context and provide an alternative justification for why they're important. We then go into a little more depth about why C++ moves can be confusing, and explain how the Rust implementation has fewer footguns and in general is more in line with the goals of the feature.

History

In 2011, C++ finally fixed a set of long-standing deficits in the programming language with the shiny new C++11 standard, bringing it into the modern era. Programmers enthusiastically pushed their companies to allow them to migrate their codebases, champing at the bit to be able to use these new features. Writers to this day talk about "modern C++," with the cut-off being 2011. Programmers who only used C++ pre-C++11 are told that it is a new programming language, the best version of its old self, worth a complete fresh try.

There were a lot of new features to be excited about. C++ standard threads were added then -- and thread standardization was indeed good, though anyone who wanted to use threads before likely had their choice of good libraries for their platform. Closures were also very exciting, especially for people like me who came from functional programming, but to be honest, closures were just syntactic sugar for existing patterns of boilerplate that could be readily used to write function objects.

Indeed, the real excitement at the time, certainly the one my colleagues and I were most excited about, was move semantics. To explain why this feature was so important, I'll need to talk a little about the C++ object model, and the problem that move semantics exist to solve.

Value Semantics

Let's start by talking about a primitive type in C++: int. Objects -- in C++ standard parlance, int values are indeed considered objects -- of type int only take up a few bytes of storage, and so copying them has always been very cheap. When you assign an int from one variable to another, it is copied. When you pass it to a function, it is copied:

void print_i(int arg) {
    arg += 3;
    std::cout << arg << std::endl;
}

int foo = 3;
int bar = foo; // copy
foo += 1; // foo gets 4
std::cout << bar << std::endl; // bar is still 3
print_i(foo); // prints 4+3 ==> 7
std::cout << foo << std::endl; // foo is still 4

As you can see, every variable of type int acts independently of each other when mutated, which is how primitive types like int work in many programming languages.

In the C++ version of object-oriented programming, it was decided that values of custom, user-defined types would have the same semantics, that they would work the same way as the primitive types. So for C++ strings:

std::string foo = "foo";
std::string bar = foo; // copy (!)
foo += "__";
bar += "!!";
std::cout << foo << std::endl; // foo is "foo__"
std::cout << bar << std::endl; // bar is "foo!!"

This means that whenever we assign a string to a new variable, or pass it to a function, a copy is made. This is important, because the std::string object proper is just a handle, a small structure that manages a larger memory allocation on the heap, where the actual string data is stored. Each new std::string that is made via copy requires allocating a new heap allocation, a relatively expensive operation in performance.

This would cause a problem when we want to pass a std::string to a function, just like an int, but don't want to actually make a copy of it. But C++ has a feature that helps with that: const references. Details of the C++ reference system are a topic for another post, but const references allow a function to operate on the std::string without the need for a copy, while still promising not to change the original value.

The feature is available for both int and std::string; the principle that they're treated the same is preserved. But for the sake of performance, ints are passed by value while std::strings are passed by const reference in the same situation. This dilutes the benefit of treating them the same, as in practice the function signatures differ if we don't want to trigger spurious, expensive deep copies:

void foo(int bar);
void foo(const std::string &bar);

If you instead declare the function foo like you would with an int, you get a poorly performing deep copy. The default is something you probably don't want:

void foo(std::string bar);
void foo2(const std::string &bar);

std::string bar("Hi"); // Make one heap allocation
foo(bar); // Make another heap allocation
foo2(bar); // No copy is made

This is all part of "pre-modern" C++, but already we're seeing negative consequences of the decision to treat int and std::string as identical when they are not, a decision that will get more gnarly when applied to moves. This is why Rust has the Copy trait to mark types like i32 (the Rust equivalent of int) as being copyable, so that they can be passed around freely, while requiring an explicit call to clone() for types like String so we know we're paying the cost of a deep copy, or else an explicit indication that we're passing by reference:

#![allow(unused)]
fn main() {
fn foo(bar: String) {
    // Implementation
}

fn foo2(bar: &str) {
    // Implementation
}

let bar = "hi".to_string();
foo(bar.clone());
foo2(&bar);
}

The third option in Rust is to move, but we'll discuss that after we discuss moves in C++.

Copy-Deletes and Moves

C++ value semantics break down even more when we do need the function to hold onto the value. References are only valid as long as the original value is valid, and sometimes a function needs it to stay alive longer. Taking by reference is not an option when the object (whether int or std::string) is being added to a vector that will outlive the original object:

std::vector<int> vi;
std::vector<std::string> vs;
{
    int foo = 3;
    foo += 4;
    vi.push_back(foo);
} // foo goes out of scope, vi lives on
{
    std::string bar = "Hi!";
    bar += " Joe!";
    vs.push_back(bar);
} // bar goes out of scope, vs lives on

So, to add this string to the vector, we must first make an allocation corresponding to the object contained in the variable bar, and then must make a new allocation for the object that lives in vs, and then copy all the data.

Then, when bar goes out of scope, its destructor is called, as is done automatically whenever an object with a destructor goes out of scope. This allows std::string to free its heap allocation.

Which means we copied an allocation into a new heap allocation, just to free the original allocation. Copying an allocation and freeing the old one is equivalent to just re-using the old allocation, just slower. Wouldn't it make more sense to make the string in the vector just refer to the same heap allocation that bar formerly did?

Such an operation is referred to as a "move," and the original C++ -- pre C++11 -- didn't support them. This was possibly because they didn't make sense for ints, and so they were not added for objects that were trying to act like ints -- but on the other hand, destructors were supported and ints don't need to be destructed.

In any case, moves were not supported. And so, objects that managed resources -- in this case, a heap allocation, but other resources could apply as well -- could not be put onto vectors or stored in collections directly without a copy and delete of whatever resource was being managed.

Now, there were ways to handle this in pre-C++11 days. You could add an indirection: make a heap allocation to contain the std::string object itself -- which is only a small object with a pointer to another allocation -- and pass around a std::string *, a raw pointer that would not trigger all these copies while the pointed-to string still automatically managed its heap allocation behind its façade of value semantics. Or you could manually manage a C-style string with char *.

But the most ergonomic, clear std::vector<std::string> could not be used without performance degradation. Worse, if the vector ever needed to be resized, and had to itself switch to a different allocation, it would have to copy all those std::string objects internally and delete the originals -- N more useless allocations.

As a demonstration of this, I wrote a sample program with a vastly simplified version of std::string, that tracks how many allocations it makes. It allows C++11-style moves to be enabled or disabled, and then it takes all the command line arguments, creates string objects out of them, and puts them in a vector. For 8 command line arguments, the version with move made, as you might expect, 8 allocations, whereas the version without the move, that just put these strings into a vector, made 23. Each time a string was added to a vector, a spurious allocation was made, and then N spurious allocations had to be made each time the vector doubled.

This problem is purely an artifact of the limitations of the tools provided by C++ to encapsulate and automatically manage memory, RAII and "value semantics."

Consider this snippet of code:

std::vector<std::string> vec;
{ // This might take place inside another function
  // Using local block scope for simplicity
    std::string foo = "Hi!";
    vec.push_back(foo);
}
{
    std::string bar = "Hello!";
    vec.push_back(bar);
}
// Use the vector

If we didn't use this string class, we wouldn't have made a copy just to free the original allocation. We would have simply put the pointer into the vector. We would then have been responsible for freeing all the allocations -- once -- when we're done:

std::vector<char *> vec;
{
    // strdup, a POSIX call, makes a new allocation and copies a
    // string into it, here used to turn a static string into one
    // on the heap. We will assume we have a reason to store it
    // on the heap -- perhaps we did more manipulation in the
    // real application to generate the string.

    // The allocation is necessary to be the direct equivalent of
    // `vec.push_back("Hi")` or even `vec.emplace_back("Hi")` for
    // a `std::vector<std::string>`, because that data structure has
    // the invariant that all strings in the vector must have their
    // own heap allocation (assuming no small string optimization,
    // which many strings are ineligible for).

    char *foo = strdup("Hi!");
    vec.push_back(foo);
}
{
    char *bar = strdup("Hello!");
    vec.push_back(bar);
}

// Use the vector

// Then, later, when we are done with the vector, free all the elements once
for (char *c: vec) {
    free(c);
}

The copy version of the C++ code instead does -- after de-sugaring the RAII and value semantics and inlining -- something that no programmer would ever write manually, something equivalent to this (the vector is left in OOP notation for readability):

std::vector<char *> vec;
{
    char *foo = strdup("Hi!");
    vec.push_back(strdup(foo)); // Why the additional allocate-and-copy?
    free(foo); // Because the destructor of foo will free the original
}
{
    char *bar = strdup("Hello!");
    vec.push_back(strdup(bar));
    free(bar);
}

// Use the vec
for (char *c: vec) {
    free(c);
}

C++ without move semantics fails to reach its goal of zero-cost abstraction. The version with the abstraction, with the value semantics, compiles to code less efficient than any code someone would write manually, because what we really want is to allocate the allocation while it's a local variable foo, use the same allocation on the vector, and then only free it on the vector.

The abstractions of only supporting "copy" and "destruct" mean that the destructor of the variable foo must be called when foo goes out of scope. This means that the "copy" operation must make an independent allocation, as it cannot control when the original goes out of scope, or will be replaced with another value. If we had instead re-used the same allocation, it would be freed by foo's destructor.

But copying just to destroy the original is silly -- silly and ill-performant. What any programmer would naturally write in that situation results in a "move". So this gap -- and it was a huge gap -- in C++ value semantics was filled in C++11 when they added a "move" operation.

Because of this addition, using objects with value semantics that managed resources became possible. It also became possible to use objects with value semantics for resources that could not meaningfully be copied, like unique ownership of an object or a thread handle, while still being able to get the advantages of putting such objects in collections and, well, moving them. Shops that previously had to work around value semantics for performance reasons could now use them directly.

It is not, therefore, surprising that this was for many the most exciting change in C++11.

How Move Is Implemented in C++

But for now, let's put ourselves in the place of the language designers who designed this new move operation. What should this move operation look like? How could we integrate it into the rest of C++?

Ideally, we would want it to output -- after inlining -- exactly the code that we would expect to write manually. When foo is moved into the vector, the original allocation must not be freed. Instead, it is only freed when the vector itself is freed. This is an absolute necessity: to solve the problem, we must remove a free in order to remove the allocation, but we also cannot leak memory. If there is to be exactly one allocation, there must be exactly one deallocation.

Calls to free (or delete[] in my example program) are made in the destructor, so the most straightforward way forward is to say that the destructor should only be called when the vector is destroyed, but not when foo goes out of scope. If foo is moved onto the vector, then the compiler should take note that it has been moved from, and simply not call the destructor. The move should be treated as having already destroyed the object -- as an operation that accomplishes both the initialization of the new object (the string on the vector) from the original object and the destruction of the original object.

This notion is called "destructive move," and it is how moves are done in Rust, but it is not what C++ opted for. In Rust, the compiler would simply not output a destructor call (a "drop" in Rust) for foo because it has been moved from. But, in fact, the C++ compiler still does. In destructive move semantics, the compiler would not allow foo to be read from after the move, but in fact, the C++ compiler still does, not just for the destructor, but for any operation.

So how is the deallocation avoided, if the compiler doesn't remove it in this situation? Well, there is a decision to make here. If an object has been moved from, no deallocation should be performed. If it has not, a deallocation should be performed. Rust makes this decision at compile-time (with rare exceptions where it has to add a "drop flag"), but C++ makes it at run-time.
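
Those rare exceptions look something like the following sketch of mine: when a value is moved on only some control-flow paths, the compiler has to track at run time whether the value still needs to be dropped.

struct Loud(&'static str);

impl Drop for Loud {
    fn drop(&mut self) {
        println!("dropping {}", self.0);
    }
}

fn main() {
    let sometimes_moved = Loud("flagged");
    let mut keeper = Vec::new();
    if std::env::args().len() > 1 {
        keeper.push(sometimes_moved); // moved only on this branch
    }
    // Whether `sometimes_moved` still owns its value depends on the branch
    // taken, so the compiler inserts a hidden drop flag and checks it here --
    // the same kind of run-time check C++ performs on every moved-from object.
}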

When you write the code that defines what it means to move from an object in C++, you must make sure the original object is in a run-time state where the destructor will still be called on it, and will still succeed. And, since we established already that we must save a deallocation by moving, that means that the destructor must make a run-time decision as to whether to deallocate or not.

The more C-style post-inlining code for our example would then look something like this:

std::vector<char *> vec;
{
    char *foo = strdup("Hi!");
    vec.push_back(foo);
    foo = nullptr;
    if (foo != nullptr) {
        free(foo);
    }
}
{
    char *bar = strdup("Hello!");
    vec.push_back(bar);
    bar = nullptr;
    if (bar != nullptr) {
        free(bar);
    }
}

This null check is hidden by the fact that in C++, free and delete and friends are defined to be no-ops on null, but it still exists. And while the check might be very cheap compared to the cost of calling free, it might not be cheap when things are moved in a tight loop, where free is never actually called. That is to say, this run-time check is not cheap compared to the cost of not calling free.

So, given the semantics of move in C++, it results in code that is not the same as -- and not as performant as -- the equivalent hand-written C-style code, and therefore it is not a zero-cost abstraction, and doesn't live up to the goals of C++.

Now, it looks like the optimizer should be able to clean up an adjacent set to null and check for null, but not all examples are as simple as this one, and, like in many situations where the abstraction relies on the optimizer, the optimizer doesn't always get it.

Arguing Semantics

But that performance hit is small, and it is usually possible to optimize out. If that were the only problem with C++ move semantics, I might find it annoying, but ultimately I'd say, as about many things in both C++ and Rust, something like: Well, this decision was made, remember to profile, and if you absolutely have to make sure the optimizer got it in a particular instance, check the assembly by hand.

But there are a few further consequences of that decision.

First off, the resource might not be a memory allocation, and null pointers might not be an appropriate way to indicate that that resource doesn't exist. This responsibility of having some run-time indication of what resources need to be freed -- rather than a one-to-one correspondence between objects and resources -- is left up to the implementors of classes. For heap allocations, it is made relatively easy, but the implementor of the class is still responsible for re-setting the original object. In my example, the move constructor reads:

string(string &&other) noexcept {
    m_len = other.m_len;
    m_str = other.m_str;
    other.m_str = nullptr; // Don't forget to do this
}

The move constructor has two responsibilities, where a destructive version would only have one: It must set up state for the new object, and it must set up a valid "moved from" state for the old object. That second obligation is a direct consequence of non-destructive moves, and provides the programmer with another chance to mess something up.

In fact, since destructive moves can almost always be implemented by just copying the memory (and leaving the original memory as garbage data as the destructor will not be called on it), a default move constructor would correctly cover the vast majority of implementations, creating even fewer opportunities to introduce bugs.

But in C++, the moved-from state also has obligations. The destructor has to know at run-time not to reclaim any resources if the object no longer has any, but in general, there is no rule that moved-from objects must immediately be destroyed. The programming language has explicitly decided not to enforce such a rule, and so, to be properly safe, moved-from objects must be considered -- and must be -- valid values for those objects.

This means that any object that manages a resource now must manage either 1 or 0 copies of that resource. Collections are easy -- moved-from collections can be made equivalent to the "empty" collection that has no elements. But for things like thread handles or file handles, this means that you can have a file handle with no corresponding file. Optionality is imported into all "value types."

So, smart pointer types that manage single-ownership heap allocations, or any sort of transferrable ownership of heap allocations, now of necessity must be nullable. Nullable pointers are a serious cause of errors, as often they are used with the implicit contract that they will not be null, but that contract is not actually represented in the type. Every time a nullable pointer is passed around, you have a potential miscommunication of whether nullptr is a valid value, one that will cause some sort of error condition, or one that may lead to undefined behavior.

C++ move semantics of necessity perpetuate this confusion. Non-nullable smart pointers are unimplementable in C++, not if you want them to be moveable as well.

Move, Complicatedly

This leads me to Herb Sutter's explanation of C++ move semantics from his blog. I respect Herb Sutter greatly as someone explaining C++, and his materials helped me learn C++ and teach it. An explanation like this is really useful if programming in C++ is what you have to do.

However, I am instead investigating whether C++'s move semantics are reasonable, especially in comparison to programming languages like Rust which do have a destructive move. And from that point of view, I think this blog post, and its necessity, serve as a good illustration of the problems with C++'s move semantics.

I shall respond to specific excerpts from the post.

C++ “move” semantics are simple, and unchanged since C++11. But they are still widely misunderstood, sometimes because of unclear teaching and sometimes because of a desire to view move as something else instead of what it is.

Given the definition he's about to give of C++ move semantics, I think this is unfair. The goal of move is clear: to allow resources to be transferred when copying would force them to be duplicated. It is obvious from the name. However, the semantics as the language defines them, while enabling that goal, are defined without reference to that goal.

This is doomed to lead to confusion, no matter how good the teaching is. And it is desirable to try to understand the semantics as they connect to the goal of the feature.

To explain what I mean, see the definition he then gives for moving:

In C++, copying or moving from an object a to an object b sets b to a’s original value. The only difference is that copying from a won’t change a, but moving from a might.

This is a fair statement of C++'s move semantics as defined. But it has a disconnect with the goals.

In this definition, we are discussing the assignment written as b = a or as b = std::move(a). The reason why moving might change a, as we've discussed, is that a might contain a resource. Moving indicates that we do not wish to copy resources that are expensive or impossible to copy, and that in exchange for this ability, we give up the right to expect that a retain its value.

This definition is the correct one to use for reasoning about C++ programs, but it is not directly connected to why you might want to use the feature at all. It is natural that programmers would want to be able to reason about a feature in a way that aligns with its goals.

The framing of the post obscures that goal, treating move as if it were a pure optimization of copy, which will not help a programmer understand why a's value might change, or why move-only types like std::unique_ptr exist.

The explanation of the goal of this operation is reserved in this post for the section entitled "advanced notes for type implementors".

Of course, almost all C++ programmers in a sufficiently large project have to become "type implementors" to understand and maintain custom types, if not to write fresh implementations of them, so I think most professional programmers should be reading these notes, and so I think it's unfair to call them advanced. But beyond that, this explanation is core to why the operation exists, and the only explanation for why move-only types exist, which all C++ programmers will have to use:

For types that are move-only (not copyable), move is C++’s closest current approximation to expressing an object that can be cheaply moved around to different memory addresses, by making at least its value cheap to move around.

He follows up with an acknowledgement that destructive moves are a theoretical possibility:

(Other not-yet-standard proposals to go further in this direction include ones with names like “relocatable” and “destructive move,” but those aren’t standard yet so it’s premature to talk about them.)

For his purposes, this is extremely fair, but since my purposes are to compare C++ to Rust and other programming languages which have destructive moves, it is not premature for me to talk about them.

This gets more interesting in the Q&A.

How can moving from an object not change its state?

For example, moving an int doesn’t change the source’s value because an int is cheap to copy, so move just does the same thing as copy. Copy is always a valid implementation of move if the type didn’t provide anything more efficient.

Indeed, for reasons of consistency and generic programming, move is defined on all types that can be moved or copied, even types that don't implement move differently than copy.

What makes this confusing in C++, however, is that types that manage resources might be written without an implementation of move. They might pre-date the move feature, or their implementor might not have understood move well enough to implement them, or there might be a technical reason why moving couldn't be implemented in a way that elides the resource duplication. For these types, a move falls back on a copy, even if the copy does significant work. This can be surprising to the programmer, and surprises in programming are never good. More direly, there is no warning when this happens, because the notion of resource management is not referenced in the semantics.

In Rust, a move is always implemented by copying the data in the object itself and then not destructing the original object, and never by copying resources managed by the object, or running any custom code.

But what about the “moved-from” state, isn’t it special somehow?

No. The state of a after it has been moved from is the same as the state of a after any other non-const operation. Move is just another non-const function that might (or might not) change the value of the source object.

I disagree in practice. For objects that use move as intended, to avoid copying resources, a move will (at least usually) drain the source's resource. This means that an object that normally manages a resource will enter a state in which it is not managing a resource. That state is special, because it is the state in which a resource-managing object is doing something other than its normal job, and is not managing a resource. This is not a "special state" by any rigorous definition, but it is guaranteed to be intuitively special by virtue of being resource-free. (It is also a special state in that the value is unspecified in general, whereas most of the time, the value is specified.)

Collections can, as I said before, get away with becoming the empty collection in this scenario, but even for those, the empty state is special: It is the only state that can be represented without holding a resource. And many other types of objects cannot even do this. std::unique_ptr's moved-from state is the null pointer, and without these move semantics, it would be possible to design a std::unique_ptr that did not have a null state.

Once std::unique_ptr is forced to be allowed to have null values, it makes sense that there be other ways to create a null std::unique_ptr, e.g. by default-constructing it. But it is the design of move semantics that force it to have a null value in the first place.

Put another way: std::unique_ptr and thread handles are therefore collections of 0 or 1 heap allocation handles or thread handles, and once defined that way, the "empty" state is not special, but it is move semantics that force them to be defined that way.

Does “but unspecified” mean the object’s invariants might not hold?

No. In C++, an object is valid (meets its invariants) for its entire lifetime, which is from the end of its construction to the start of its destruction.... Moving from an object does not end its lifetime, only destruction does, so moving from an object does not make it invalid or not obey its invariants.

This is true, as discussed above. The moved-from object must be able to be destructed, and there is nothing stopping a programmer from instead doing something else with it. Given that, it must be in some state that its operations can reckon with. But that state is not necessarily one that would be valid if move semantics didn't force its inclusion, and so again, we circle back to the same problem.

Does “but unspecified” mean the only safe operation on a moved-from object is to call its destructor?

No.

Does “but unspecified” mean the only safe operation on a moved-from object is to call its destructor or to assign it a new value?

No.

Does “but unspecified” sound scary or confusing to average programmers?

It shouldn’t, it’s just a reminder that the value might have changed, that’s all. It isn’t intended to make “moved-from” seem mysterious (it’s not).

I disagree firmly with the answer to the last question. "Unspecified" values are extremely scary, especially to programmers on team projects, because it means that the behavior of the program is subject to arbitrary change, but that change will not be considered breaking.

For example, std::string does not make any promises about the contents of a moved-from string. However, a programmer -- even a senior programmer -- may, instead of consulting the documentation, write a test program to find out what the value is of a moved-from string. Seeing an empty string, the programmer might write a program that relies on the string being empty:

std::vector<std::string>
split_into_chunks(const std::string &in) {
    int count = 0;
    std::vector<std::string> res;
    std::string acc;
    for (char c: in) {
        if (count == 4) {
            res.push_back(std::move(acc));
            // Don't need to clear string.
            // I checked and it's empty.
            count = 0;
        }
        acc += c;
        ++count;
    }
    res.push_back(std::move(acc));
    return res;
}

Of course, you should not do that. A later version of std::string might implement the small string optimization, where strings below a certain size are not stored in an expensive-to-copy heap resource, but in the actual object itself. In that situation, it would be reasonable to implement move as a copy, which is allowed, and then this program would no longer do the same thing.

But this is a surprise. This is a result of the "unspecified value." And so while it may, strictly speaking, be "safe" to do things with a moved-from object other than destruct them or assign to them, in practice, without documentation to the contrary making stronger guarantees, the only way to get "not surprising" behavior is to greatly limit what you do with moved-from objects.

What about objects that aren’t safe to be used normally after being moved from?

They are buggy....

By this definition, std::unique_ptr should likely be considered buggy, as null pointers cannot be used "normally." The same goes for a std::thread object that does not represent an actual thread. It is only by stretching the definition of "used normally" to include these special "empty values" that std::unique_ptr gets to claim to not be buggy under that definition, although a null pointer simply cannot be used the way a normal pointer can.

Again, this attitude, that a null pointer is a normal pointer, that an empty thread handle is a normal type of thread handle, is adaptive to programming C++. But it will inevitably exist in a programmer's blind spot, as null pointers always have. The "not null" invariant is often expressed implicitly. Many uses of std::unique_ptr are relying on them never being null, and simply leave this up to the programmer to ensure.

Herb Sutter himself discusses this:

Since the problem is that we are not expressing the “not null” invariant, we should express that by construction — one way is to make the pointer member a gsl::not_null<> (see for example the Microsoft GSL implementation) which is copyable but not movable or default-constructible.

In a programming language with destructive moves, it would be possible to have a smart pointer that was both "non-null" and movable. If we need both movability and the ability to express this invariant in the type system, well, C++ cannot help us.
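
Rust illustrates the possibility: Box<T> has no null state, yet it moves freely, because a destructive move never needs to leave an "empty" value behind in the source. A minimal sketch of my own:

fn consume(boxed: Box<i32>) {
    println!("{}", boxed);
}

fn main() {
    let boxed = Box::new(42); // there is no null Box
    consume(boxed);           // moved; `boxed` can no longer be used,
                              // but no empty state was ever created
}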

But what about a third option, that the class intends (and documents) that you just shouldn’t call operator< on a moved-from object… that’s a hard-to-use class, but that doesn’t necessarily make it a buggy class, does it?

Yes, in my view it does make it a buggy class that shouldn’t pass code review.

But in a sense, this is exactly what std::unique_ptr is. It has a special state where you cannot call its most important operator, the dereference operator. It only avoids being called buggy because it expands this state so it can be arrived at by other means.

Again, everything Herb Sutter says is true in a strict sense. It is memory-safe to use moved-from objects other than to destroy or assign to them, even if the move operation makes no further guarantees. It simply isn't safe in a broader sense, in that it will have surprising, changeable behavior. It is true that the null pointer is a valid value of std::unique_ptr, but smart pointers that implement move are forced to have such a value.

And therefore, it should not be surprising that these questions come up. The misconceptions that Herb Sutter is addressing are an unfortunate consequence of the dissonance between the strict semantics of the programming language, where his statements are true, and the practical implications of how these features are used and are intended to be used, where the situation is more complicated.

Moves in Rust

So the natural follow-up question is, how does Rust handle move semantics?

First off, as mentioned before, Rust makes a special case for types that do not need move semantics, where the value itself contains all the information necessary to represent it, where no heap allocations or resources are managed by the value, types like i32. These types implement the special Copy trait, because for these types, copying is cheap, and is the default way to pass to functions or to handle assignments:

#![allow(unused)]
fn main() {
fn foo(bar: i32) {
    // Implementation
}

let var: i32 = 3;
foo(var); // copy
foo(var); // copy
foo(var); // copy
}

For types that are not Copy, such as String, the default function call uses move semantics. In Rust, when a variable is moved from, that variable's lifetime ends early. The move replaces the destructor call at the end of the block, at compile time, which means it's a compile time error to write the equivalent code for String:

#![allow(unused)]
fn main() {
fn foo(bar: String) {
    // Implementation
}

let var: String = "Hi".to_string();
foo(var); // Move
foo(var); // Compile-Time Error
foo(var); // Compile-Time Error
}

Copy is a trait, but more entwined with the compiler than most traits. Unlike most traits, you can't give it a custom implementation: it is a marker trait, and you may only opt into it (usually by deriving it) when every field of your type is itself Copy and the type has no destructor. Types like Box, that manage a heap allocation, do not implement Copy, and therefore structs that contain a Box cannot either.
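
As a small sketch of my own: deriving Copy works for a struct of plain values, but not for anything that holds a Box.

#[derive(Clone, Copy)]
struct Point {
    x: i32,
    y: i32,
}

// This would not compile: Box<i32> is not Copy, so the struct can't be either.
// #[derive(Clone, Copy)]
// struct Holder {
//     contents: Box<i32>,
// }

fn main() {
    let p = Point { x: 1, y: 2 };
    let q = p;                   // a copy, not a move
    println!("{} {}", p.x, q.y); // `p` is still usable
}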

This is already an advantage to Rust. C++ pretends that all types are the same, even though they require different usage patterns in practice. You can pass a std::string by copy just like an int. Even if you have a vector of vectors of strings, you can pass by copy and that's usually the default way to pass it -- moves in many cases require explicit opt-in. For int it's a reasonable default, but for collections types it isn't, and in Rust the programming language is designed accordingly.

If you want a deep copy, you can always explicitly ask for it with .clone():

#![allow(unused)]
fn main() {
fn foo(bar: String) {
    // Implementation
}

let var: String = "Hi".to_string();
foo(var.clone()); // Copy
foo(var.clone()); // Copy
foo(var);         // Move
}

What this actually does is create a clone, or a deep copy, and then move the clone, as foo takes its parameter by move, the default for non-Copy types.

What does a move in Rust actually entail? C++ implements moves with custom-written move constructors, which collections and other resource-managing types have to implement in addition to implementing copying (though automatic implementation is available if building out of other movable types). Rust requires implementations for clone, but for all moves, the implementation is the same: copy the memory in the value itself, and don't call the destructor on the original value. And in Rust, all types are movable with this exact implementation -- non-movable types don't exist (though non-movable values do). The bytes encode information -- such as a pointer -- about the resource that the value is managing, and they must accomplish that in the new location just as well as they did in the old location.
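
A small example of my own makes this concrete: moving a Vec copies its own bytes -- the (pointer, length, capacity) handle -- to the new location, while the heap allocation it manages stays exactly where it was.

fn main() {
    let original = vec![1, 2, 3];
    let heap_data = original.as_ptr();

    let moved = original; // the Vec's own bytes are copied; no destructor runs
                          // on `original`, and no custom move code runs at all

    // The heap allocation did not move; only the handle did.
    assert_eq!(heap_data, moved.as_ptr());
}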

C++ can't do that, because in C++, the implementation of move has to mark the moved-from value as no longer containing the resource. How this marking works depends on the details of the type.

But even if C++ implemented destructive moves, some sort of "move constructor" or custom move implementation would still be required. C++, unlike Rust, does not require that the bytes contained in an object mean the same thing in any arbitrary location. The object could contain a reference to itself, or to part of itself, that would be invalidated by moving it. Or, there could be a data structure somewhere with a reference to it, that would need to be updated. C++ would have to give types an opportunity to address such things.

Safe Rust forbids these things. The lifetime of a value takes moves into account; you can't move from a value unless there are no references to it. And in safe Rust, there is no way for the user to create a self-referential value (though the compiler can in its implementation of async -- but only if the value is already "pinned," which we will discuss in a moment).
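
For example (again, an illustration of mine), this fails to compile, because the move would invalidate the outstanding reference:

fn main() {
    let name = String::from("held in place by a borrow");
    let reference = &name;
    let moved = name;          // ERROR: cannot move out of `name`
                               // because it is borrowed
    println!("{}", reference); // the borrow is still live here
}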

But even in unsafe Rust, such things violate the principle of move. Moving is always safe, and unsafe Rust is always responsible for keeping safe code safe. As a result, Rust has a mechanism called "pinning" that indicates, in the type system, that a particular value will never move again, which can be used to implement self-referential values and which is used in async. The details are beyond the scope of this blog post, but it does mean that Rust can avoid the issue of move semantics for non-movable values without ruining the simplicity of its move semantics.

For these rare circumstances, the effect of moving can be accomplished by indirection, using a Box that points to a pinned value on the heap. And there is nothing stopping such types from implementing a custom function that effectively performs a custom move: it consumes the pinned value and outputs a new value, which can then be pinned in a different location. There is no need to muddy the built-in move operation with such semantics.

Practical Implications for C++ Programmers

So, obviously, in light of my blog series, I recommend using Rust over C++. For Rust users, I hope this clarifies why the move semantics are the way they are, and why the Copy trait exists and is so important.

But of course, not everyone has the choice of using Rust. There are a lot of large, mature C++ codebases that are well-tested and not going away anytime soon, and many programmers working on those codebases. For these programmers, here is some advice for the footgun that is C++ move semantics, both based on what we've discussed, and a few gotchas that were out of the scope of this post:

  • Learn the difference between rvalue, lvalue, and forwarding references. Learn the rules for how passing by value works in modern C++. These topics are out of the scope of this blog post, but they are core parts of C++ move semantics and especially how overloading is handled in situations where moves are possible. Scott Meyers's Effective Modern C++ is an excellent resource.
  • Move constructors and assignment operators should always be noexcept. Otherwise, std::vector and many other library utilities will simply ignore them. There is no warning for this.
  • The only sane things to do with most moved-from objects are to immediately destroy them or reset their values. Comment about this in your code! If the class specifically defines that moved-from values are empty or null, note that in a comment too, so that programmers don't get the impression that there are any guarantees about moved-from values in general.

Conclusion

Move semantics are essential to the performance of modern C++. Without them, much of its standard library would become much more difficult to use. However, the specific design of moves in C++:

  • is misaligned with the purpose of moving
  • fails to eliminate all run-time cost
  • surprises programmers, and
  • forces designers of types to implement an "empty-yet-valid" state

Why, then, does C++ use such a definition? Well, C++ was not originally designed with move semantics in mind. Proposals to add destructive move do not interact well with the existing language semantics. One interesting blog post that I found even says, when following through on the consequences of adding destructive move semantics:

... if you try to statically detect such situations, you end up with Rust.

C++ has so many unsafe features and so many existing mechanisms, that this was deemed the most reasonable way to add move semantics to C++, harmful as it is.

And perhaps this decision was unnecessary. Perhaps there was a way -- perhaps there still is a way -- to add destructive moves to C++. But for right now, non-destructive moves are the ones the maintainers of C++ have decided on. And even if destructive moves were added, it's unlikely that they'd be as clean as the Rust version, and the existing non-destructive moves would still have to be supported for backwards compatibility's sake.

In any case, Rust has taken this opportunity to learn from existing programming languages, and to solve the same problems in a cleaner, more principled way. And so, for the move semantics as well as for the syntax, I recommend Rust over C++.

And to be clear, this still has very little to do with the safety features of Rust. A more C++-style language with no unsafe keyword and no safety guarantees could have still gone the Rust way, or something similar to it. Rust is not just a safer alternative to C++, but, as I continue to argue, unsafe Rust is a better unsafe language than C++.

Entries

In this chapter, I will be discussing a specific data structure API: the Rust map API. Maps are often one of the more awkward parts of a collections library, and the Rust map API is top-notch, especially its entry API -- I literally squealed when I first learned about entries.

And as we shall discuss, this isn't just because Rust made better choices than other standard libraries when designing the maps API. Even more so, it's because the Rust programming language provides features that better express the concepts involved in querying and mutating maps. Therefore, this chapter is properly included in this book: this discussion serves as a window into some deep differences between C++ and Rust that show why Rust is better.

And for this chapter, specifically, we'll also be discussing Java, so this will be a three-way comparison, between Java, C++ and Rust.

Reading from a Map

So, let's talk about map APIs. But before we get to Entry and friends, let's discuss something a little simpler: getting an item from a map. Let's say we have a sorted map of strings to integers:

  • In Java, TreeMap<String, Integer>
  • In C++, std::map<std::string, int>
  • In Rust, BTreeMap<&str, i32>

Let's also say we have a string "foo", and want to know what integer corresponds to it. Now, if we're always sure that the string we're looking up is always in the map, then we know what we want: we want to get an integer.

But what if we're not sure? There are plenty of situations where we want to read a value corresponding to the key -- or do something else when that key is not present. Maybe the value is a count, and an absent key means 0. Or maybe the absent key means that the user has made a typo, and needs to be informed. Or maybe the map is a cache, and the absent key means we need to read a file or query a database. In all of these cases, we need to know either the value, or the fact that the key is absent.

Let's see how this is handled in our three programming languages, and how fundamental design choices in these programming languages lead to such APIs.

Java: Get a (Nullable) Reference

A long time ago, Java made an extreme choice in the name of simplicity: It divided all values into a dichotomy of "primitives" and "objects." Primitives are passed around by implicit copy, whereas objects are aliased through many mutable references. Objects always have optionality built in -- any object reference is automatically "nullable," which means you can store the special sentinel/invalid value null in it, the interpretation of which varies wildly. Primitives are not optional in this way.

Also for the sake of simplicity, and very relevantly to the topic at hand, generics are only supported for object types, not primitives. That means that map values can only ever be object types. And that means that our map from strings to integers in Java doesn't use Java's primitive integer type int, but rather this special wrapper/adapter type Integer, which auto-casts to and from int, and which, like any object type, is managed through mutable, nullable references. (At this point, I for one am beginning to suspect they missed the mark on their simplicity).

So what's that mean for our map? How do we find out what value corresponds to "foo" in our map, or else that there is none? Well, the method for this is called get, and that returns the value in question if there is one. And when there isn't? Well, Java here leverages nullability, and returns null when there is no value.

So we can write something like this:

Integer value = map.get("foo");
if (value == null) {
    System.out.println("No value for foo");
} else {
    int i_value = value;
    System.out.println("Value for foo was: " + i_value);
}

So far, so good. But there are problems. And perhaps I'm missing some -- now is a good time to take a second, look at the code, and try to imagine in your mind what problems there may be with this system (you know, besides the fact that I have to use i_ as improvised Hungarian notation due to Java's lack of support for shadowing).

You have some? I'll now list what I've got.

Problem the first: The signature of get doesn't really alert us to the possibility of a value not being in a map. This is the sort of "edge case" that programmers regularly forget to handle; a programmer may know, due to their situation-specific knowledge, that the key ought to be present, and forget to consider that the key might not be.

Compilers of strongly typed languages generally work to ensure that programmers don't miss edge cases like this, don't make simple "thinkos" (typos but with thought) or "stupid mistakes." How does Java hold up? Well, remember how we mentioned that primitives can't be null, but these wrapper types like Integer are coercible to primitives? Well, this compiles without a word of complaint from the compiler:

TreeMap<String, Integer> map = new TreeMap<String, Integer>();

map.put("foo", 3);

int foo = map.get("foo");
System.out.println("int foo: " + foo);

int bar = map.get("bar");
System.out.println("int bar: " + bar);

And what happens at run-time? Similar behavior to Rust's infamous unwrap function. The conversion from the nullable Integer to the non-nullable int crashes when the Integer is in fact null:

int foo: 3
Exception in thread "main" java.lang.NullPointerException: Cannot invoke "java.lang.Integer.intValue()" because the return value of "java.util.TreeMap.get(Object)" is null
        at test.main(test.java:12)

So you might try to fix this by querying if the key exists first:

TreeMap<String, Integer> map = new TreeMap<String, Integer>();

if (map.containsKey("bar")) {
    int bar = map.get("bar");
    System.out.println("int bar: " + bar);
} else {
    System.out.println("bar not present");
}

But now we've reached problem the second. Unfortunately, even though this looks like it addresses the issue, this won't prevent the crash either. There is nothing stopping you from putting a null into the map, so this code also crashes given the right context:

TreeMap<String, Integer> map = new TreeMap<String, Integer>();
map.put("bar", null);
if (map.containsKey("bar")) {
    int bar = map.get("bar");
    System.out.println("int bar: " + bar);
} else {
    System.out.println("bar not present");
}

So for a given key in a Java map, there are actually three possible situations:

  1. The key is absent.
  2. The key corresponds to an integer.
  3. The key corresponds to one of these special null-values.

get can distinguish 2 from 1 and 3, but cannot distinguish between 1 and 3. containsKey can distinguish 1 from 2 and 3, but cannot distinguish 2 from 3. To distinguish all 3 scenarios, and handle all the representable values, you need to call both get and containsKey:

if (map.containsKey("bar")) {
    Integer bar = map.get("bar");
    if (bar == null) {
        System.out.println("bar present and null");
    } else {
        int i_bar = bar;
        System.out.println("int bar: " + i_bar);
    }
} else {
    System.out.println("bar not present");
}

In addition to this precaution not being enforced by the compiler, it leads to problem the third: We are now querying the map twice. We are walking the tree twice with our containsKey followed by get.

At this point, we find ourselves scrolling through the Map methods in Java's documentation, trying to find a more general solution. getOrDefault might help in some situations -- when there's a value that makes sense as the default. compute might be useful -- if we're OK with modifying the map in the process.

But in general, nothing clean exists to tidy up these problems. And the blame lies squarely on Java's decision to make almost all types -- and all types that can be map values -- nullable.

But wait! -- you might object -- Can't we just maintain an invariant on the map that it contains no null values? If we have a map without null values, all these issues -- well, many of these issues -- dry up.

And this is true. Maintaining such an invariant makes for a much cleaner situation. Pretend you aren't allowed to put nulls in maps, and arrange not to do it.

But, first off, maintaining an invariant like this is easier said than done. Programmers often keep this sort of rule implicitly in their heads, but it's much better to write it down in a comment. Either way, you have to trust future programmers -- even future versions of the same programmers -- to know about the invariant, either by intuiting it (all too common) or by reading the relevant comment (which, even if there is one, might not happen). And you have to trust them not to intentionally violate the invariant, and also not to accidentally violate it: Are they sure that all those values they add to the map can never be null?

And second off, somewhat shockingly, sometimes people do assign special meanings to null. I said before null has a wide range of meanings, and it's not uncommon to use null to mean special things. Maybe "not mapped" means "load from cache," but "null" means "there actually is no value and we know it." Or maybe the opposite convention applies. null is frustratingly without intrinsic meaning.

For such situations, programmers should probably compose the map with other types, or better yet, write custom types that make the semantics of these situations abundantly clear. But let's not put all the blame on the programmers. If Java had really wanted to stop people from leaning on the distinction between "not mapped" and "mapped to null," Java maps shouldn't have made the distinction representable at all. It's bad programming language design to put features in a library that can only be abused, and it's a bad understanding of human nature to then solely blame the programmers for misusing them.

C++: No Nulls No More

So now we move on to C++.

In C++, fewer types are nullable, and non-nullable types like int can be used as the value type of a map. For our map, of type std::map<std::string, int>, we no longer have the trichotomy of "key not present, value null, or value non-null," but the much more reasonable dichotomy of either the key is present and there is an int, or it's absent and there isn't one.

This is, in my mind, the bare minimum a strongly typed language should be able to provide, but after the context of Java it's worth pointing out.

There are three (3) methods in C++ that look like they might be usable as a get operation, an operation where we either get an int value or learn that the key is absent:

  • at
  • operator[]
  • find

See if you can identify which one is the right one to use.

Spoiler alert! It's find, the one whose name superficially looks least like it'll be the right one. at throws an exception if the key is absent, and operator[], the one with the most appealing name, is an eldritch abomination which we'll discuss and condemn later.

But all well-deserved teasing aside, find is much better than Java's get. It returns a special object -- an iterator -- that can be easily tested to see whether we've found an int, and easily probed to extract the int.

auto it = map.find(key);
if (it == map.end()) {
    std::cout << key << " not present" << std::endl;
} else {
    std::cout << key << " " << it->second << std::endl;
}

This is actually pretty good! The -> operator also serves as a signal to experienced C++ programmers that we're assuming that it is valid: generally -> or * means that the object being operated on is "nullable" in some way.

So when a C++ programmer reads something like this, they have a little bit of warning that they're doing something that might crash:

int foo = map.find(key)->second;

And certainly, they have more warning than the Java programmer with the equivalent Java:

int foo = map.get(key);

Of course, this is awkward. find returns an iterator, which isn't exactly the type we'd expect for this "optional value" situation. And to determine if the value isn't present, we compare it to map.end(), which is a weird value to compare it to. Nothing about what these things are named is specifically intuitive, and people would be forgiven for using the accursed operator[]. map["foo"] just looks like an expression for doing boring map indexing, doesn't it?

And what does operator[] do if the key isn't present? It inserts the key, with a default-constructed value. There is no way to configure what value gets inserted, short of defining a new type for the values. This is sometimes what you want -- like if your value type has a good default (especially if you defined it yourself), or if you're about to overwrite the value anyway. But in most cases, you want some other behavior if the value is not present -- operator[] doesn't even tell you that it inserted the item, so if you need to make a network query or read a file or print an error, you're out of luck. operator[], as innocuous as it looks, has surprising behavior, and that is not good.

But all in all, as far as getting values goes, as far as querying the map goes, C++ is doing OK. Solid B result on this exam, I think. Decent work, C++. Especially since we just looked at Java.

The Rust Option

So now on to Rust: we want to query our BTreeMap<&str, i32>.

(Or... it might be a BTreeMap<String, i32>, depending on whether we want to own the strings. This is a decision we also have to make in C++ (where we could have used string_views as the keys), but do not have to make in Java. At least in Rust, we know that whichever decision we make, we will not accidentally introduce undefined behavior. But that's a distraction!)

So let's apply the same test to Rust as we've applied before. Here, the method in question is given an obvious name, get rather than find. So let's see how it does in our test, of allowing us to read a value if present, but know if not:

if let Some(val) = map.get(key) {
    println!("{key}: {val}");
} else {
    println!("{key} not present");
}

See, get returns an Option type. Therefore, unlike in C++, we can test for the presence of the value and extract the value inside the same if statement. Unlike in C++, the return value of get isn't a map-specific type, but rather the completely normal way to express a maybe-present value in Rust. This means that if we want to implement defaulting, we get that for free by using the Option type in Rust, which implements that already:

// Let's say missing keys means the count is 0:
let value = *map.get("foo").unwrap_or(&0);

Similarly, calling is_none() or pattern-matching against None is much more ergonomic than comparing an iterator to map.end(). It requires some more intimate knowledge -- or some follow-up reading -- to learn that the concept of "end of collection" and "not found" are for various reasons combined into one in C++.

So while C++ avoids the problematic elements of Java maps, Rust does so more ergonomically, because it has a well-established Option type. C++ now has one as well, std::optional, but it hasn't yet reached its map API, because it was only added very recently, in C++17.

And Option integrates even better with the programming language than std::optional does, because Option is just a garden-variety sum type, a Rust enum, which lets you do things like if let Some(x) = ..., and combine testing and unpacking in the same statement. C++ could not design a map API this ergonomic, because it lacks this fundamental feature.

Also, unlike with null in Java, if you want to use Option as a meaningful distinction in your map, you still can. The get function would then return Option<Option<...>> instead of just Option -- the outer one representing presence, the inner one representing whether the value was None or Some(...). Option is composable in a way that null is not.
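
As a minimal sketch of that composition (the cache map and keys here are invented for illustration), consider a map where "mapped to None" means "we checked, and there is no value":

use std::collections::BTreeMap;

// "Absent" and "mapped to None" stay distinguishable, unlike with Java's null.
let mut cache: BTreeMap<&str, Option<i32>> = BTreeMap::new();
cache.insert("bar", None);

match cache.get("bar") {
    None => println!("bar never queried"),
    Some(None) => println!("bar queried before, and known to have no value"),
    Some(Some(v)) => println!("bar is {v}"),
}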

For the record, the Rust equivalent to operator[] -- the Index trait implementation on maps -- does the equivalent to C++ at, and panics if the key isn't present. While not as generally useful as get, I think this is a reasonable interpretation of what map["foo"] should mean.
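
For example (a minimal sketch, using the BTreeMap<&str, i32> from above):

// Equivalent to C++'s at: this panics if "foo" is absent.
let v = map["foo"];
println!("{v}");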

Mutation Station

So Rust wins, I'd say pretty handily, when comparing how to access a value from a map -- how to query it. But where Rust truly shines is when mutating a map. For mutation, I'm going to approach the discussion differently. I'm going to start by specifying what use cases might exist, and then, in that context, we can discuss how an API might be built.

The mutation situation has a similar dilemma to querying: the key in question might or might not already be in the map. And, for example, we often want to change the value if the key is present, and insert a fresh value if the key is absent.

Of course, we could always check if the key is present first, and then do something different in these two scenarios. But that has the same problem we already discussed for querying: We then have to iterate the tree twice, or hash the key twice, or in general traverse the container twice:

auto it = map.find(key); // first traversal
if (it != map.end()) {
    return it->second;
} else {
    int res = load_from_file(key);
    map.insert(std::pair{key, res}); // second traversal
    return res;
}

So what should we do for our API for this scenario, where we want to change the value if the key is present, and insert a fresh value if the key is absent?

Well, sometimes that fresh value is a default value, like if we're counting and the key is the thing we're counting -- in that case, we can always insert 0. In that case, C++'s operator[] -- when combined with an appropriate default constructor -- can actually work well.

And sometimes, that fresh value depends on the key, like if the value is a more complicated record of many data points about the item in question. If the value is a sophisticated OOP-style "object," and the key indexes one of the fields also contained in the value, C++'s operator[] would not work. The default value is a function of the key.

And sometimes, there isn't a default value per se. Sometimes, if the key is absent, we need to do additional work to find out what value should be inserted. This is the case if the map is a cache of some database, accessed via IPC or file or even Internet. In that situation, we only want to send a query if the key is not present. We would not be able to accomplish our goals by simply providing a default value along with the mutation operation.

C++ doesn't have anything for us here. operator[] is pretty much its most sophisticated "query-and-mutate" operation. Java, somewhat surprisingly, does have something relevant, compute. This handles all of these situations, with a relatively unergonomic callback function -- and as long as your map never contains nulls.

Rust's solution, however, is to create a value that encapsulates being at a key in the map that might or might not have a value associated with it, a value of the Entry type.

As long as you have that value, the borrow checker prevents you from modifying the map and potentially invalidating it. And as long as you have it, you can query which situation you're in -- the missing key or the present key. You can update a present key. You can compute a default for the missing key, either by providing the value or providing a function to generate it. There are many options, and you can read all of them in the Entry documentation; the world is your oyster.

So the C++ code above can be ergonomically expressed as something like this in Rust:

let entry = map.entry(key.to_string());
*entry.or_insert_with(|| load_from_file(key))

And the idiom where we're counting something could be expressed something like:

map.entry(string)
    .and_modify(|v| *v += 1)
    .or_insert(1);

So we get this nice little program that counts how many times we use different command line arguments:

use std::collections::BTreeMap;
use std::env;

fn count_strings(strings: Vec<String>) -> BTreeMap<String, u32> {
    let mut map = BTreeMap::new();
    for string in strings {
        map.entry(string)
            .and_modify(|v| *v += 1)
            .or_insert(1);
    }
    map
}

fn main() {
    for (string, count) in count_strings(env::args().collect()) {
        println!("{string} shows up {count} times");
    }
}

Conclusion

So first off, Entrys are super nice, and neither Java nor C++ has anything anywhere near as nice. Even when it comes to just querying, Rust's get is much better than Java's get, and a little more ergonomic than C++'s find.

But this isn't an accident. This isn't just about Rust's map API having a nice touch. When we look at the definition of Entry, we see things that Java and C++ can't do:

pub enum Entry<'a, K, V>
where
    K: 'a,
    V: 'a,
{
    Vacant(VacantEntry<'a, K, V>),
    Occupied(OccupiedEntry<'a, K, V>),
}

First, this is an enum: There are two options, and in both options, there's additional information. Of course, Java and C++ can express a dichotomy between two options, but it's a lot clumsier. Either you'd have to use a class hierarchy, or std::variant, or something else. In Rust, this is as easy as pie, and since it does it the easy way, you can not only use the various combinator methods in Rust, you can also use Entrys with a good old-fashioned match or if let to distinguish between the Vacant and Occupied situations.
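
For instance, here is a minimal sketch of the counting idiom written with an explicit match instead of the combinators (assuming, as in the counting program above, that map is a BTreeMap<String, u32> and string is a String):

use std::collections::btree_map::Entry;

match map.entry(string) {
    // The key was already present: bump the existing count.
    Entry::Occupied(mut occupied) => *occupied.get_mut() += 1,
    // The key was absent: insert the initial count.
    Entry::Vacant(vacant) => {
        vacant.insert(1);
    }
}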

Second, there's a little lifetime annotation there: 'a. This is an indication that while you hold an Entry into a map, Rust won't let you modify the map out from under it. Now, Java and C++ also have iterators, which you must not hold across modifications of the map, but in both those languages, you have to enforce that constraint yourself. In Rust, the compiler enforces it for you, making Entrys impossible to misuse in this way.

Without both of these features, Entry would not have been an obvious API to create. It would've been barely possible. But Rust's feature set encourages things like Entry, which is yet another reason to prefer Rust over C++ (and Java): Rust has enums (and lifetimes) and uses them to good effect.

Addendum

I wanted to address a few points that people have raised in comments since I posted this.

Some people have pointed out that C++ has insert_or_assign, but in spite of the promising name, it just unconditionally sets a key to be associated with a value, whether or not it previously was. This is not the same as behaving differently based on whether a value previously existed, and it is therefore not relevant to our discussion.

More interestingly, it has been pointed out to me that with the return value of insert, you can tell whether the insert actually inserted anything, and also get an iterator to the entry that existed before if it didn't. This allows implementing some, but not all, of the patterns of Entry without traversing the map twice.

For example, counting:

#include <iostream>
#include <map>
#include <string>
#include <vector>

int main(int argc, char **argv) {
    std::vector<std::string> args{argv, argv + argc};
    std::map<std::string, int> counts;

    for (const auto &arg : args) {
        counts.insert(std::pair{arg, 0}).first->second += 1;
    }

    for (const auto &pair : counts) {
        std::cout << pair.first << ": " << pair.second << std::endl;
    }

    return 0;
}

This works, but is much less clear and ergonomic than the Entry-based API. But perhaps more importantly, this functionality is much more constrained than Entry, and is equivalent to using Entry with just or_insert, and never using any of the other methods. As another commenter pointed out, counting is possible with just or_insert:

*map.entry(key).or_insert(0) += 1

But counting is just one example. C++'s insert is still deeply limited. Using C++'s insert means you have to know a priori what value you would be inserting. You can't use it to notice that a key is missing and then go off and do other work to figure out what the value should be. So you can't do my load_from_file example.

In order to do the load_from_file example in C++, even with this use of insert, you would have to temporarily insert some sentinel value into the map -- which goes against how strongly typed languages ought to work, in addition to breaking the C++ concept of exception safety.

This is, as was pointed out in another comment, exactly what C++ programmers sometimes have to do, to meet performance goals, at the expense of clarity and simplicity, and therefore, especially in C++, at the expense of confidence in safety and correctness.

Safety and Performance

There is a persistent and persnickety little argument that I want to talk specifically about. This argument is really persuasive on its face, and so I think it deserves some attention -- especially since I am guilty of having used this argument myself, many years ago when I still worked at an HFT firm, to claim that C++ had a niche that Rust wasn't ready for. I've also seen it a few times in a row in the wild, and it's made me so emotional that I simply had to write this, and as a result, it's a little more emotional than some of the other posts.

In this argument, array indexing stands in for a number of little features. But -- I've seen array indexing cited so often as a canonical example that I feel compelled to address it directly!

The argument goes like this:

In Rust, array accesses are checked. Every time you write arr[i], there is an extra prepended if i >= arr.len() { panic!(..) }. As you can see, that is more code, and worse, a run-time check. And while the optimizer might eliminate it, or the branch predictor may well predict it right every time, the extra code bloat and possible run-time check is just unacceptable in [insert field here (I used HFT)], where every nanosecond matters. And until some acceptable solution is found to this, I just don't see Rust making it in [insert field].

When I made this argument, to a group of programming-language academics, the defenders of Rust countered with a number of points, all of which accepted the basic premise:

  • Do I really need those extra nanoseconds? Yes.
  • Is it really too much of a price to pay for all that extra safety? Yes.
  • Do I really distrust the optimizer that much? Yes. If only Rust had a way to do optimizer assertions, a way to statically verify that the panic had been optimized out.
  • Would dependent typing on integer values help? Yes. That sounds very promising. I think Rust will get there someday, but for right now we must use C++.

Now that I know more about Rust I'm happy to tell you that I was completely off base. I wasn't off base about the performance considerations, or the unacceptability of even the slightest risk of a run-time check. I was off base about an even more basic premise: that Rust uses checked array indexing, whereas C++ uses unchecked array indexing.

But wait! Isn't that the whole point? Doesn't C++ avoid checking everything, to make sure all abstractions are zero-cost, to be blazing fast? Doesn't Rust, while trying for performance, in the end always concede to the demands of safety?

Well, let's look at the APIs in question. C++ apologists are always saying to use the modern C++ features from C++11 and later, rather than the more C-like "old style" C++ features, so on the C++ side let's take a look at the documentation for std::array, introduced in C++11.

Here we see two indexing methods. The first one, at, is bounds checked and will throw an exception if the index is out of bounds, whereas the second one, operator[], is not, and will instead exhibit undefined behavior of a very difficult-to-debug nature. It looks like C++ actually believes in free choice here, leaving the choice of method up to the user. Not quite what we supposed, but the important part is that unchecked indexing is available, so, so far, the argument can still stand.

Now let's look at Rust. Rust arrays and vectors, like slices, can be used with all the methods defined on slices, so the slice documentation is the best place to look. And looking there, we immediately see -- drum roll please -- 4 methods. We see get and get_mut, which are checked, and right underneath them, in alphabetical order, get_unchecked and get_unchecked_mut, which are not.
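
To make the symmetry concrete, here is a minimal sketch of both flavors on a Vec (the variable names are my own):

let v = vec![10, 20, 30];

// Checked: returns an Option, so an out-of-bounds index cannot corrupt memory.
if let Some(x) = v.get(1) {
    println!("checked: {x}");
}

// Unchecked: the caller promises the index is in bounds, hence the unsafe block.
let y = unsafe { *v.get_unchecked(1) };
println!("unchecked: {y}");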

To review, where do Rust and C++, these programming languages with their vastly different philosophies, Rust for the cautious, C++ for the fast and bold, stand? In the exact same place. Both programming languages have both checked and unchecked indexing.

Let me say that again. This is the talking point form, what to say if you need something quick to say, if you're ever debating programming languages on a political-style talk show (or at a party or even a job interview):

In both Rust and C++, there is a method for checked array indexing, and a method for unchecked array indexing. The languages actually agree on this issue. They only disagree about which version gets to be spelled with brackets.

The difference is simply in the default, which one gets that old fashioned arr[index] syntax. And even that can be changed. Even if the C++ default were superior -- and, as I will argue later, it is not -- this is surely a minor issue. After all, don't we normally use our fancy for x in arr syntax in Rust? This issue is just so small as to be unlikely to be a deciding factor in what programming language is better, even if we're in a special application domain where every nanosecond matters.

The Unsafe Keyword

So that's a wrap, folks. We can all go home, and none of us will ever see this extremely silly argument on the Internet or in person again. It's just a misunderstanding, the person making it was simply misinformed, and all it will take is a link to this blog post -- or to the relevant method in the docs -- to set them straight.

But wait! The C++ apologists are still talking! What are they saying? How have they not been completely flummoxed? They're pointing at that method, chanting a word like a slogan at a protest march. I can't quite make it out -- what is it?

Oh. They're chanting unsafe. And credit where credit is due: it's very difficult to chant in a monospace font.

Well, that is easy to respond to! The nerve, that C++ programmers would call our unchecked array indexing method unsafe. For one, all unchecked array indexing methods are unsafe: that's what unchecked means. If it were safe, it would be at least statically checked. For another, isn't this the pot calling the kettle black? Isn't C++ all about unsafety, so much so that C++ programmers don't even mark their unsafe code regions, because all of them are unsafe, or their unsafe functions, because they all are?

"But isn't that the whole point of Rust?" they cry. "If you have to use unsafe to write good Rust, then Rust isn't a safe language after all! It's a cute effort, but it's failing at its purpose! Might as well use C++ like a Real Programmer!"

This, my friends, is a straw man. No, the point of Rust and specifically Rust's memory safety features is not to create an entirely safe programming language that can't be circumvented in any circumstance; you must be thinking of Sing#, the programming language for Microsoft's defunct research OS.

Let me be abundantly clear: The point of memory safety, the unsafe keyword, and friends in Rust is not to completely enforce memory safety in every circumstance, making it impossible for the programmer to do what they want with the computer whenever they can't prove to the compiler that it's OK. The point isn't to make it impossible to do anything at all -- it's to make it possible to reason about the program.

The premise of Rust is that the vast majority of code in a systems program doesn't need to be unsafe, and so it might as well be safe. People used to believe that you needed garbage collection for safety, but Rust proved that you could use lifetimes to get safety without that performance cost. And once we're there, why keep worrying about null pointers? Why not tell the compiler which things can be null, and which things can't, so the compiler can check for you whether you're handling nulls correctly? I programmed C++ professionally for years without such a feature. You'd better believe that, had it existed, I would have annotated the crap out of the code so the compiler could've caught those bugs ahead of time.
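
As a minimal sketch of what "telling the compiler" looks like (the function and names here are invented for illustration):

// The return type says, right in the signature, that the result can be absent.
fn port_for(service: &str) -> Option<u16> {
    match service {
        "http" => Some(80),
        "https" => Some(443),
        _ => None,
    }
}

fn main() {
    // The compiler won't let us treat the Option as a plain u16;
    // we are forced to handle the None case somehow.
    match port_for("gopher") {
        Some(port) => println!("port {port}"),
        None => println!("unknown service"),
    }
}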

Sometimes, C++ apologists cite Valgrind. I've had codebases where I tried to use Valgrind. Unfortunately, there was so much undefined behavior and so many memory leaks already caked into the project that new ones were simply impossible to see among all the noise. An army of junior engineers was at some point required to clean this up, when the hierarchy finally decided that "valgrind" was something we might want to be able to use in the future.

And a lot of those undefined behaviors were ticking time bombs. Certainly, this codebase had its issues. A friend of mine took days to find a bug where a pointer had a value of 7. I don't mean 7 elements into some array, not 7 of the relatively wide pointer type, not a convenient, testable-for-NULL value. No, none of that: The pointer's value was exactly 0x7.

I've had memory corruption issues where I pored over every line of code that I wrote, over and over again, finding nothing. Ultimately, I learned that the issue was in framework code -- code written by my boss's boss. The code was untested, written extremely poorly, and had rotted, so that it didn't work at all. In Rust, I might have had some idea that my code -- which in Rust would have all been able to be "safe" -- couldn't possibly be the source of the problem. Maybe my humble assumption that my code was to blame would have been a little less tenable.

If I had wanted a language that was always safe, I knew at the time that Java and Python existed. Some companies even do finance in Java, for exactly that reason. But sometimes you still need that extra bit of performance. unsafe is sometimes necessary.

But given the gains safe Rust has made in predictable performance, it's not as necessary as it used to be. The majority of the code I wrote then could've been written in safe Rust without losing a single clock cycle. The parts that needed to be unsafe could have been isolated, delegated to specific sections, wrapped in abstract data types, perhaps entrusted to a specific team.

And even then, I'm sure we would have been debugging memory corruption issues. But we'd know where to look. We'd know where to throw the tests. And we'd have saved programmer-years of time, days if not months of my life.

Now, I'm proud of my C++ skills. There is some part of me that wishes that C++ was better than Rust, that all that time getting better at debugging memory corruption wasn't dedicated to a skill that is becoming obsolescent through better technology. And to be honest, that's part of why I dismissed Rust as a candidate for HFT programming languages.

But it's possible to be proud of a skill that is also becoming obsolete. And I am trying to replace it with a new skill to be proud of -- writing Rust as performant as idiomatic C++, or even more performant, while reaching for the unsafe keyword rarely and modularly. I think it's truly possible, for where it's relevant.

Now I must turn to a subset of C++ apologists, who write in "modern C++" which is "very safe now" and who therefore experience no memory corruption issues. To them I say: you are not doing high-performance programming. If you were, you'd have to do some wonky things with pointers to spell the bespoke high-performance constructs you'd need.

There is indeed a safe subset of C++, heavy with modern features. If you are disciplined and keep your programming in that realm, you can mostly avoid memory corruption. But this safe subset covers fewer high-performance features than Rust's safe subset. I've read some of this code and its idioms: It's full of shared_ptrs used not to share ownership but simply to avoid types that might be invalidated. It ironically leans on reference counting more than idiomatic Rust does. This is among other, similar problems.

Let me be clear: First off, instead of keeping in your brain which features are "modern" and which are "edgy," why not have a distinction that's well-marked? Second off, if you are writing entirely in this safe subset of C++, you could instead get much better performance out of the safe subset of Rust. You have no right to complain about Rust's safety trade-offs, as you're using a worse set of trade-offs, where you get no safety promises from the compiler and none of Rust's surprising safe performance.

Rust's safe and "slow" subset is faster than C++'s while still being, obviously, safer. Rust's unsafe subset is better factored and better distinguished. Comparing apples to apples, Rust is a better programming language for extracting performance out of LLVM, because you'll be able to code more often without fear, and with very focused fear when you do feel it.

A tool is even more useful if you can adjust it. The defenders of C++ talk about choosing trade-offs, but really, Rust offers both trade-offs. Mark your code as unsafe and convince yourself of its safety manually, or rely on programming language features. It's up to you, on a function-by-function, even block-by-block, basis. In C++, if you have a problem, every line of code is suspect; you simply can't opt in to safety. In Rust, where you don't need the performance of unchecked indexing and other unsafe features, you can relax about the possibility of going bankrupt due to inadvertent memory reinterpretation -- and how I wish my NDA permitted me to talk about the consequences at my own previous jobs!

And for where you do need to use unsafe, you can make sure your debugging and overthinking efforts are well-directed, for the few places in a large project you need it.

Unchecked Indices

This has gotten a little far from the original question. Should array indices be checked? Well, let me be clear about two facts that are both true, but in tension with each other:

  • Unchecked array indexing is sometimes absolutely necessary.
  • Unchecked array indexing is an edge-case feature, which you normally don't want.

If unchecked array indexing were unavailable in Rust, that would be a bug. What is not a bug is making it inconvenient. C++ programmers probably should be using at instead of operator[] more often. But in C++, what would it gain? There are so many unsafe features; what's the cost of one more?

But in Rust, where so much code can be written that's completely safe, defaulting to the safe version makes more sense. Lack of safety is a cost too, and Rust makes that cost explicit. Isn't that the goal of C++, making costs explicit?

Let's look at situations where you are indexing memory. First off, most of the ones I saw were in old C-style for-loops, where you loop over an index rather than iterating over the collection directly. Both Rust and C++ have safe versions of for that loop over collections with iterators, and there the loop condition doubles as the bounds check, so those are easy enough to address. In fact, I think a lot of the noise about checked vs. unchecked array accesses comes from people who use indexing for their for-loops instead of iterators, and who therefore mistakenly think that array indexing in general is a far more common operation than it is.
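
Here is a minimal sketch of the difference, with invented names:

let data = vec![1u64, 2, 3, 4];

// Index-based loop: each data[i] is a checked index (unless the optimizer
// proves it in bounds and removes the check).
let mut sum_indexed = 0u64;
for i in 0..data.len() {
    sum_indexed += data[i];
}

// Iterator-based loop: no per-element index, so no per-element bounds check.
let mut sum_iterated = 0u64;
for x in &data {
    sum_iterated += x;
}

println!("{sum_indexed} {sum_iterated}");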

For the remaining situations, most are implementing either gnarly business logic, or a subtle, fast algorithm.

If it's gnarly business logic, in my experience, it's usually at config time -- along with a good third to half to even more of the code in a complicated production system.

What do I mean by config time? A running high-performance system, whether optimized for latency or throughput, has a bunch of data structures organized just so, a lot of threads set up just right to move data between them in the perfect rhythm, and a lot of the work is in arranging them. That work is generally not performance-sensitive, but often has to be in the same programming language as the performance-intensive stuff.

Config time is, depending on how you look at it, either less of a thing or the entire thing in a programming language like Python. Python basically exists to do config-time programming for performance-intensive code tucked away in very comprehensive "libraries" written in C or C++. But in C++, the constructor that runs only once or a few times at startup, and the methods related to it, live in the same programming language as the money-making hot path, and you have to really adjust your programming style between the two.

Config-time is obviously when you read the configuration files. It's where you open the relevant files. It's where you call socket and bind and listen on your listening port. It's where you spin up your worker threads, and make computations on how many worker threads there are. It's where you construct your objects and your object pools. It's where you memory map your log file. It's where you set your process priorities. It's where you recursively call the constructors and init functions of every object in your overwrought OOP hierarchy.

There is no need to sacrifice safety for performance at config time -- especially since undefined behavior might lie latent and destabilize the system once it's actually up and running. If you do an unchecked array access at config time, you might put garbage data in an important field, maybe one that determines how much money you're willing to risk that day or how many of a thing to buy. And for what? To save a few nanoseconds before your process has even "gone live"?

So, when do you truly need unchecked array accesses? If it's a subtle fast algorithm, probably deep in an inner loop, you should probably be wrapping it in an abstraction anyway. The code that actually executes the algorithm should be separate from the business logic, so that programmers trying to maintain the business logic don't accidentally break it. And that's exactly where it makes the most sense to use unsafe -- when implementing a special algorithm. Maybe the proof that the index is within bounds relies upon some number theory the compiler was never going to understand without its own proof engine: great! You should probably be explaining that in a comment in C++ anyway, and so the conventional comment that goes with the unsafe block in Rust is a perfect place to explain it.
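
A minimal sketch of what that looks like in practice (the power-of-two ring-buffer shape here is my own invention):

// Precondition: buf.len() is a power of two, and mask == buf.len() - 1.
fn read_slot(buf: &[u64], mask: usize, i: usize) -> u64 {
    // SAFETY: since buf.len() is a power of two and mask == buf.len() - 1,
    // i & mask is always strictly less than buf.len().
    unsafe { *buf.get_unchecked(i & mask) }
}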

But maybe I'm wrong about all of this. Maybe your experience hasn't matched mine. Maybe your particular application needs to make unchecked array accesses a lot, needs them to be unchecked, and needs them littered all over the codebase. I raise my eyebrows at you, suspect you need more iterators and perhaps other abstractions, and wonder what problem you're trying to solve. But even if you're absolutely right, I think it's still a better idea to write Rust littered with unsafe every time you index an array, than to write C++.

Because, as I keep emphasizing, Rust is still a better unsafe programming language than C++. It would be better than C++ even if safety weren't a feature.

Post-Script: Some Perspective for the New Rustacean

I understand where this straw man argument comes from. The word unsafe is scary, and the advice, especially aimed at people coming from safe languages like Python and JavaScript, is to avoid unsafe features while learning. And while I think adding unsafe to production code should only be done once you've exhausted safe possibilities -- which requires full understanding of safe possibilities -- this advice can feel overbearing to a transitioning C++ programmer, especially when it is immediately obvious that the safe features are very constrained and can't literally do everything.

For that good-faith recovering C++ programmer, new to Rust: You're right. The safe subset isn't enough to do everything you want to do. And when it doesn't, that doesn't mean it failed. Its goal is to make unsafe code rare, not non-existent. But it might surprise you how rarely you truly need unsafe. And a good resource for you might be, as it was for me, the excellent Learn Rust the Dangerous Way by Cliff L. Biffle.

For what it's worth, however, this criticism of Rust in general is often levelled either in bad faith, or from a misunderstanding of what the unsafe keyword is for. For all the philosophical discussion of what unsafe truly means -- and how it interacts with the surrounding module and encapsulation/privacy boundaries -- as well as principled conventions for using it, please see the Rustonomicon, the canonical book on unsafe Rust, the same way the book is canonical for introducing Rust.

Other criticisms of Rust from an HFT or low-latency point of view are more relevant. Most specifically, gcc and icc are much better compilers for those use cases -- empirically -- than is LLVM. Also, the large codebases existing in C++ are often tested and contain thousands upon thousands of programmer-years of optimizations and bugfixes, where even small compiler upgrades are scrutinized closely for performance regressions. Migrating to another programming language from that starting point would be prohibitively expensive.

None of which is to say that if Rust gradually replaced C++ altogether, eventually such ultra-optimizing compilers and ultra-optimized codebases wouldn't start appearing in Rust. I hope to see that day within my lifetime.

Common Complaints about Rust

Before I get into the specific topics, though, I'd like to clear up a few talking points I've seen in the discourse. Some of these seem a little silly to me, but they're not exaggerations.

Rust fans all want to rewrite everything in Rust immediately as a panacea.

Unfortunately, we have a vocal minority who do! But most Rust developers have a much more moderate perspective. Most are aware that a large project cannot and should not be rewritten lightly.

Rust is a fad

Rust might be popular right now, but it is also already part of a lot of critical infrastructure, and is currently being put into the Linux kernel.

Rust fans are all young and naive people with little real-life programming experience

Sure, there are fans of all languages who are like that.

But there's also Bryan Cantrill embracing Rust after a lifetime of disliking C++. There's Linus Torvalds allowing it in the kernel, after his very vocal anti-C++ statements.

There's a long tradition of people who like C, but don't like C++. See the Frequently Questioned Answers for a taste. Their criticisms are, in many cases, legitimate. This is unfortunate, because C has such limited capacity for abstraction, and so they're missing out on all the abstractive power that a higher-level language can provide.

For some of these people, Rust addresses the most important criticisms.

Personally, I agreed with many of those criticisms, but I was a professional C++ programmer for 5 years, working in positions where the zero-cost abstractions were absolutely necessary. I wasn't in a position to choose programming languages at the company where I was, but I agreed with the choice of C++ over C, in spite of what in my mind were the clear costs of overcomplexity. I enjoy Rust now because it addresses many of those problems.

But you must at least admit that Rust fans are annoying.

Many of the ones that annoy you, annoy me too, if not more.

I also know that I've annoyed C++ fans by advocating for Rust on the C++ subreddit. My post was taken down by moderators as irrelevant to C++, but how could you be more relevant than a thorough critique?

Rust will be in the same place as C++ in 40 years; it will end up as convoluted as C++ is now.

This might just be true, but I don't get why it's used as an anti-Rust argument. If there needs to be a new systems language in another 30 or 40 years that reboots Rust like Rust is rebooting C++, I don't see that as a failure. I certainly don't see that as a reason not to use Rust now. And when the new programming language comes around to out-Rust Rust, I'll advocate switching to that too.

That would just mean that programming languages are subject to entropy and obsolescence like everything else. And in that case, C++ will just continue to get worse in the meantime too, so Rust will be better than C++ the entire time. If all programming languages accrue cruft as they age, in what world is that a reason to use the cruftier programming language? Isn't that a reason to use the newest appropriate programming language?

Most Rustaceans are not, despite the stereotype, treating Rust as some apocalyptic, messianic programming language to end all programming languages. The goal isn't to have an eternally good programming language; programming languages are tools. We switch to better tools when it is practical to do so. The question is: What should new projects be written in now? When a rewrite is called for (as it sometimes is), should it include a new programming language, now that there is a viable alternative?

I suspect that many making this argument are including an unstated assumption -- that C++'s cruft is actually a sign of its maturity and fitness for production use. Alternatively, and a little more charitably, they might assume that Rust isn't ready for production use yet, and that by the time it is, it will be just as crufty as C++, perhaps converging to the same level of cruft. But while there are a few categories where Rust lags C++, they are mistaken in the big picture. For the vast majority of C++ projects, Rust is already the better option if the project had to be rewritten from scratch (a big "if," but one irrelevant to the merits of the programming languages).

But also: Maybe Rust will be able to avoid some of C++'s mistakes; it's certainly trying to.

No programming language is better than another; there's simply different tools for different jobs.

I have a hard time taking this line of argument very seriously, and yet it comes up a lot. There are some tools for which almost no job is the right job. There are some tools that are just worse than other tools. No one uses VHS tapes anymore; there's no job for which they're the right tool.

Programming languages are technology. Some technologies simply dominate others. There are currently still some things that the C++ ecosystem has that Rust doesn't yet: I'm thinking about the GUI library space, and gcc support. Also, C++ has undeniably better interoperability with C, which is relevant.

But those things might change. There is no natural reason why C++ and Rust would be on an equal footing, or why Rust wouldn't at some point in the future be better than C++ at literally every single thing besides support for legacy C++ codebases. Some tools are simply better than others. No one's writing new production code in COBOL anymore; it's a bad tool for a new project.

C++ undefined behavior is avoidable if you're actually good at C++/if you just try harder and learn the job skills. You just have to use established best practices and a lot of problems go away.

First off, my experience working at a low-latency C++ shop shows that that's not true. Avoiding undefined behavior in high-performance C++ is extremely hard. It's hard in Rust too, but at least Rust gives you tools to manage this risk explicitly. If you're avoiding memory corruption errors in C++, you've either found a safe subset, or you're coding easy problems, or likely both.

But even if there are use cases where this is true, to me that means that an experienced C++ programmer can be just as good at avoiding undefined behavior as a novice Rust programmer. So what does this mean for a business considering whether to use C++ or Rust? In C++ everything a junior programmer writes requires more scrutiny from senior programmers. Everyone requires more training and more time to do things correctly. It's not that good of a selling point.

Similarly, using best practices makes it sound easy, or at least achievable. But the more complicated and arcane best practices are, and the higher the stakes of following them, the higher the cognitive load on the programmers, and again, the more you need senior programmers to look over everyone else's work or even do parts of the work themselves.

When we've been doing something the hard way for a long time, and it's successful, it's tempting to see other people struggling and to tell them it's not that hard, that they can just up their knowledge and their work ethic and do it the hard way like us. But in the end, everyone benefits if the work is just easier with better tools.

And what's a better tool than a "best practice"? An error message. A lint. A programming language structured in such a way that it doesn't even come up.

Programming language is a matter of personal preference.

For your hobby project, sure, this is true. But there are real differences between programming languages in terms of many things that matter for business purposes.

The existence of unsafe defeats the purpose of Rust. You have to use unsafe, and since the standard library uses it, you're almost certainly using it too. That makes Rust unsafe just like C++ in practice, and so there's no advantage to switching.

I would call this a straw man, but people do call Rust a "safe" programming language, and some people say you should never have to use unsafe (which I disagree with). So this takes some addressing.

First of all, memory safety, while important, is not the only purpose of Rust. I would switch to Rust even if it didn't have the unsafe keyword. There are many other problems about C++, and this book focuses primarily on the other problems.

Second of all, in every "memory-safe" language, safe abstractions are built from unsafe foundations. You have to -- assembly language is unsafe. unsafe allows those foundations to be written in Rust. And that is what a memory safe language is, not one that is 100% memory safe in all situations, but one in which it's possible to explicitly manage and scope memory safety, and do most regular tasks in the safe subset.

You can't both have the guard rails that Rust provides and write certain types of high-performance code at the same time, but the unsafe keyword allows you to make the decision on whether to have your cake or eat it on a situation-by-situation basis, rather than giving up one or the other for the entire programming language.

If you don't use unsafe, and you trust the libraries you import, then you're in a safe language. If you do use unsafe, you are temporarily in a language as flexible as C++, while still having many advantages of Rust -- including safety features, which are still fully in place for most programming constructs even in unsafe blocks.

Use the unsafe code to build more safe components, and expand the safe language, and you get to only worry about safety a small percentage of the time, as opposed to all the time in C++.
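
As a minimal sketch of what "expanding the safe language" means (this AsciiBytes type is invented for illustration):

pub struct AsciiBytes(Vec<u8>);

impl AsciiBytes {
    // The only way to construct the type checks the invariant up front.
    pub fn new(bytes: Vec<u8>) -> Option<AsciiBytes> {
        if bytes.is_ascii() {
            Some(AsciiBytes(bytes))
        } else {
            None
        }
    }

    // Callers get a completely safe API; the unsafe block inside is justified
    // by the invariant established in the constructor.
    pub fn as_str(&self) -> &str {
        // SAFETY: the constructor verified every byte is ASCII, and ASCII is
        // always valid UTF-8.
        unsafe { std::str::from_utf8_unchecked(&self.0) }
    }
}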

Rust is taking the easy way out. Or: You can do C++ well, you just have to work harder at it, so there's no point to Rust.

I do think people regularly overestimate their ability to write safe C++. Other people underestimate how much performance they're giving up by making sure they're confident their C++ is safe.

But even if you have put in the work to be good at writing C++ safely, why does that mean that someone else shouldn't be happy to get the same results with less training and less work, if the technology exists?

Who wouldn't want to take the easy way out? Do you exit your house by climbing through the windows? This phrase only makes sense when there's a downside, and then the response depends on the alleged downside -- the actual downside matters more than the rhetorical trick.

Because, after all, businesses should use programming languages that make programming easier.

Safety means that Rust is not as high-performance

It's true that some operations in Rust are checked by default which are unchecked by default in C++. Array indexing is the typical example.

However, both checked and unchecked indexing are available in both Rust and C++. The difference is in Rust, to use the unchecked one, you have to use a named method and an unsafe block. This is easy enough to do in situations where indexing matters.

Most code is not the tight loops in performance-sensitive parts of performance-sensitive code. Most code by volume is configuration and other situations where the check is well worth it to prevent the possibility of memory corruption. Rust does make a less performant default decision than C++, but it is not that hard to override, and then you still get all the other benefits of Rust.

All programming languages have foot-guns.

Some have more than others. In some you run across them more frequently than others. And in some, they come with a safety.

Modern C++ fixes the problems with C++.

Modern C++ has all the bad features of pre-modern C++. It has to, to be compatible with it.

In my experience, it's not enough to have good features. To make promises about memory safety, or even to have a sane programming ecosystem, it's also important to not have bad features. And redundant features of varying quality are often the worst of both worlds.

Response to Dr. Stroustrup's Memory Safety Comments

The NSA recently published a Cybersecurity Information Sheet about the importance of memory safety, where they recommended moving from memory-unsafe programming languages (like C and C++) to memory-safe ones (like Rust). Dr. Bjarne Stroustrup, the original creator of C++, has made some waves with his response.

To be honest, I was disappointed. As a current die-hard Rustacean and former die-hard C++ programmer, I have thought (and blogged) quite a bit about the topic of Rust vs C++. Unfortunately, I feel that in spite of the exhortation in his title to "think seriously about safety," Dr. Stroustrup was not in fact thinking seriously himself. Instead of engaging conceptually with the article, he seems to have reflexively thrown together some talking points -- some of them very stale -- not realizing that they mostly are not even relevant to the NSA's Cybersecurity Information Sheet, let alone a thoughtful rebuttal of it.

Fortunately, he does eventually discuss his own ideas of how to make C++ memory safe -- in the future. If these ideas are implemented well, it will make C++ a safe programming language as the NSA's Cybersecurity Information Sheet has defined it. But given that they are currently just proposals in an early stage, it's unfair of him to expect the NSA to mention them when advising people on what programming language to use. C++ has been an unsafe language for a long time. Maybe someday that will change, but we'll believe it when we actually see it.

But before I discuss that, I'd like to rebut the talking points he uses earlier in his response, and explain my disappointment in them, because I think they unfairly frame the debate, shield C++ from legitimate and important criticism, slander memory-safe programming languages, and downplay memory safety as a concept, even though it's very important.

Multiple Types of Safety?

One of the most interesting and conceptually relevant points that Dr. Stroustrup harps on is that memory safety is not the only type of safety:

Also, as described, “safe” is limited to memory safety, leaving out on the order of a dozen other ways that a language could (and will) be used to violate some form of safety and security.

This might technically be true -- it's not entirely clear what other forms of "safety" he's talking about -- but it's misleading. Memory unsafety is not just one of a dozen equally important forms of "unsafety." Rather, memory unsafety is by far the biggest source of security vulnerabilities and instability in memory-unsafe programming languages -- with estimates as high as 70 percent of vulnerabilities in some contexts.

A 70% decrease in security vulnerabilities is worth committing significant resources towards. Memory safety on its own is worth writing a Cybersecurity Information Sheet about, and it is the area where C++ has the most serious deficits. Given that, this feels like a car manufacturer whose cars do not provide air bags responding to a government advisory against buying their cars by saying "What about other types of safety? By talking just about air bags, the government is clearly not thinking seriously about safety." Sure, there are other types of safety features besides air bags (or memory safety), but air bags are still important!

So, Dr. Stroustrup, what about memory safety in C++? Shouldn't C++ have memory safety? Are you saying it's not important, especially when all of these other programming languages have it?

Of course, he doesn't go into detail about other types of safety, which is telling: C++ doesn't really have the advantage in any of them. For example, Rust also has a lot of mechanisms for thread safety and type safety, intimately connected with its memory safety mechanisms, and baked into the design of Rust in a way that would be next to impossible to retrofit into another programming language.

And, when you read later on about the "safety profiles" in the C++ Core Guidelines that he makes such a big deal about, most of the focus there is also about memory safety.

Petty Irrelevancies

Let's look at some of the other points he makes.

That specifically and explicitly excludes C and C++ as unsafe.

C++ does not enforce memory safety as a feature of the programming language. This may change in the future (as Dr. Stroustrup discusses), but is the current state of things. Dr. Stroustrup tries to downplay this, but is not convincing.

As is far too common, it lumps C and C++ into the single category C/C++, ignoring 30+ years of progress.

Writing "C/C++" to mean "C and C++" is considered a faux pas among C++ programmers, and among C programmers as well, because it is seen as asserting that these two programming languages are near-identical when there are in fact major differences between them. By pointing out that the NSA does this, Dr. Stroustrup is trying to make them look like they don't know what they're talking about, just because they used a "/" character instead of the word "and."

He's reading too much into the orthography and the NSA's failure to use insider shibboleths of the programming languages they're trying to criticize. Outside of the "C" and "C++" communities, "C/C++" is a fairly common way to refer to the two related programming languages.

And that's the most relevant thing here: C and C++ are indeed related programming languages, and they have a lot in common: They are both compiled programming languages with a focus on performance, and they are (very relevantly) both not particularly focused on guaranteeing memory safety. C and C++ have a substantial common subset, with many memory unsafe features that are popular with programmers, perhaps even more popular because they work similarly in both programming languages. For the purposes of this document, it's often the features that C and C++ have in common that are the problematic ones, so it makes sense for the NSA to lump them together.

While there might be 30+ years of divergence between C and C++, none of C++'s so-called "progress" involved removing memory-unsafe C features from C++, many of which are still in common use, and many of which still make memory safety in C++ near-intractable. Sure, new features have been added to C++ that (in some but by no means all cases) do not make it as easy to corrupt memory, but the bad old features are not in any real way being phased out: They are not guarded by any special opt-in syntax, nor in many cases do they result in warnings. Given that, the combined set of features is only as strong as its weakest link.

Unfortunately, much C++ use is also stuck in the distant past, ignoring improvements, including ways of dramatically improving safety.

This is a common C++ talking point, but it doesn't help Dr. Stroustrup's position as much as he thinks it does.

He's trying to talk up how much C++ has improved, especially in the last 11 years -- and it has indeed improved. New ways of writing C++, emphasizing relatively new features, can result in more reliable C++ code with less memory corruption.

But unfortunately, this talking point just serves to remind us that those old memory-unsafe features are still in common use. When someone says their project is written in Rust, we can guess that it likely uses only the safe features (calling standard library functions that use unsafe internally truly doesn't count as using unsafe), or reaches for the unsafe features only when absolutely necessary. But when someone says their project is written in C++, by Dr. Stroustrup's own admission, there's a high likelihood that it uses old features "stuck in the distant past, ignoring ... ways of dramatically improving safety." That is itself a reason to avoid C++.

However, I would also contest his claim about these new features. Memory safety isn't just the absence of memory corruption; it is a reliable method for ensuring the absence of memory corruption, and "using new features" isn't such a method. Even if using the new features in preference to the old ones guaranteed memory safety -- and it doesn't; they make corruption less likely but are not truly memory safe -- the continued presence of the old features would still cause problems. You would need some mechanism to ensure that the new features were only used safely and that the old features were not used at all, and no such mechanism exists, at least not in the programming language itself. Someone who remembers the old features can always slip up and use one by accident.
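As a rough illustration (again my own sketch, not an example from either document), both of the following snippets use only post-C++11 "modern" features, and both compile -- usually without complaint under default warning settings -- while invoking undefined behavior:

```cpp
#include <string>
#include <string_view>
#include <vector>

std::string_view first_word() {
    std::string s = "hello world";
    // Returns a view into `s`, which is destroyed when the function
    // returns: a dangling reference built entirely from "new" features.
    return std::string_view(s).substr(0, 5);
}

int main() {
    std::vector<int> v = {1, 2, 3};
    // operator[] is unchecked; index 10 is out of bounds.
    int out_of_bounds = v[10];

    // Reading through the dangling view is a use-after-free.
    std::string_view dangling = first_word();
    return dangling[0] + out_of_bounds;
}
```

Modern features shrink the attack surface, but nothing in the language prevents these programs from being written or shipped.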

Static Analysis: Not Good Enough

Dr. Stroustrup points out that he's been working very hard on improving memory safety in C++, for a very long time:

After all, I have worked for decades to make it possible to write better, safer, and more efficient C++. In particular, the work on the C++ Core Guidelines specifically aims at delivering statically guaranteed type-safe and resource-safe C++ for people who need that without disrupting code bases that can manage without such strong guarantees or introducing additional tool chains.

Unfortunately, it's not done. The key word here is, of course, "aims." The next sentences admit that this feature is not in fact available:

For example, the Microsoft Visual Studio analyzer and its memory-safety profile deliver much of the CG support today and any good static analyzer (e.g., Clang tidy, that has some CG support) could be made to completely deliver those guarantees....

For memory safety, "much of" is not really good enough, and "could be made" is practically worthless. Fundamentally, the point is that memory safety in C++ is a project still being actively worked on -- close to existing, perhaps, but not done. Meanwhile, Rust (along with Swift, C#, Java, and others) already implements memory safety.

It's worse than that, though. What Dr. Stroustrup is trying to downplay is that this approach relies on static analyzers, tools separate from the programming language itself -- something the NSA's original article also discusses. Theoretically, if a static analyzer could be used to guarantee memory safety, that could be just as reliable as a programming language that does it. An engineering team could have a policy that all code must pass this static analysis before being put into production.

But unfortunately, human nature is more fickle than that. If it's not built into the programming language, it's going to get skipped. If a vendor says their software is written in C++, or if an engineer takes a job in C++, how will they know that these static analyzers will in fact be used? A programming language that takes memory safety seriously doesn't provide it as an optional add-on that most people will simply ignore.

But All The C++ Code!

The end of the last quote provides a common talking point in Rust vs C++ arguments:

[Static analyzers] could be made to completely deliver those guarantees at a fraction of the cost of a change to a variety of novel “safe” languages.

Besides the laughably condescending matter of calling Java (which first appeared in 1995), C# (2000), and Ruby (1995) "novel," this is a jab at the trope of (some immature) Rust programmers going around demanding that people rewrite their projects in Rust (please don't do this!), and an attack on the idea that all code can be written in safe programming languages, given the large body of existing work in unsafe ones.

This is a bit of a straw man in this context. The NSA article that Dr. Stroustrup is responding to already acknowledges that switching existing codebases might be expensive, even prohibitively so:

It is not trivial to shift a mature software development infrastructure from one computer language to another. Skilled programmers need to be trained in a new language and there is an efficiency hit when using a new language. Programmers must endure a learning curve and work their way through any “newbie” mistakes. While another approach is to hire programmers skilled in a memory safe language, they too will have their own learning curve for understanding the existing code base and the domain in which the software will function.

It then follows this up immediately with an explanation of how tools like static analyzers can be used as a back-up plan for improving memory safety in memory unsafe programming languages -- exactly what Dr. Stroustrup discusses. He's criticizing this NSA document, implying it is not thinking "seriously," while fundamentally making a point that they already made for him.

Of course, this is a terrible endorsement of C++. It's far from ideal to have to use add-on tools to work around a language's flaws. Coming from Dr. Stroustrup, it reads more like a brag that his programming language has locked everyone in than a defense of why C++ is good. Or else, it's an admission that other programming languages should be used for new projects, and that C++'s fate is now to gradually fade like the elves from Middle Earth.

But he's also overstating his case. As I mentioned before, safe programming languages have existed for a long time. Many programming projects that in the early 90's would have been done in C or C++ have in fact been done in safe programming languages instead, and according to the NSA's recommendation, that was a good idea. As computers have gotten faster and programming language technology has improved, there have been fewer and fewer reasons to settle for languages like C or C++ that don't have memory safety as a feature.

When I was a professional C++ programmer as early as 2013, some people -- even some programmers -- already thought that C++ was a legacy programming language like COBOL or Fortran. And outside of narrow niches like systems programming (e.g. web browsers, operating systems, and lower-level libraries), video games, or high performance programming, it kind of has become one. The former application niches of C++ have been taken over by Java and C#, or more recently by Go. If you have an application program written in C++, chances are that it's a relatively old codebase, or written at a shop that has reasons to write a lot of C++ (such as a high-frequency trading firm).

Now, even C++'s systems niche is under threat from Rust, a powerful memory-safe programming language that avoids many of C++'s problems. Even the niches where C++ isn't at all "legacy" now have a viable, memory-safe alternative without much of C++'s technical debt. Rust is even allowed in the Linux kernel, a project that has only ever accepted C, and whose chief maintainer has always explicitly hated C++.

A Memory-Safe C++

Fortunately, after all of these ill-thought-out, tired talking points, Dr. Stroustrup subtly changes his perspective. After the distractions -- bashing memory safe programming languages as "novel," bragging about how C++ is too entrenched to be removable, pretending memory safety is just one of many equally important safety issues, and promising optional add-on tools that will eventually be standardized -- he finally begins to tackle the question of how C++ could be made memory safe, in an opt-in fashion:

There is not just one definition of “safety”, and we can achieve a variety of kinds of safety through a combination of programming styles, support libraries, and enforcement through static analysis. P2410r0 gives a brief summary of the approach. I envision compiler options and code annotations for requesting rules to be enforced. The most obvious would be to request guaranteed full type-and-resource safety. P2687R0 is a start on how the standard can support this, R1 will be more specific. Naturally, comments and suggestions are most welcome.

...

For example, in application domains where performance is the main concern, the P2687R0 approach lets you apply the safety guarantees only where required and use your favorite tuning techniques where needed. Partial adoption of some of the rules (e.g., rules for range checking and initialization) is likely to be important. Gradual adoption of safety rules and adoption of differing safety rules will be important. If for no other reason than the billions of lines of C++ code will not magically disappear, and even “safe” code (in any language) will have to call traditional C or C++ code or be called by traditional code that does not offer specific safety guarantees.

This is a lot closer to what the NSA document actually specifies for memory safe programming languages than he gives the document credit for. For example, the document already provides for opting out of memory safety via annotation, paired with an observation that doing so will focus scrutiny on the code that opts out.

Dr. Stroustrup did not need to criticize the document for not thinking "seriously" to reach this conclusion. He could simply have acknowledged that C++ is not a memory safe programming language yet, but that, based on his work, it might soon become one. Maybe the next version of the NSA document will endorse using C++, but only if it's C++ZZ -- where ZZ is some future version of the C++ standard.

I'm glad comments and suggestions are welcome, however, because I have a huge one.

Opt-in for memory safety is unacceptable, and is almost as bad as relying on a separate static analysis tool to enforce safety. Opt-out is fine -- Rust has a way to opt out of memory safety with the unsafe keyword, and this concept is discussed and defended in the NSA's original document. But the default should be to enforce memory safety unless otherwise specified.

For C++, this means that if these safety features are added in C++ZZ, --std=c++ZZ should cause unsafe constructs to be rejected -- and the C++ standard should require that these constructs be rejected for an implementation to be a conforming implementation of C++ZZ. Perhaps (but only perhaps) other command line arguments could be added to override this constraint on a file-by-file basis. Ideally, a new compiler command (e.g. g++ZZ) should be created for each implementation that defaults to this stricter behavior.

Parts of the codebase that use legacy features should need at least a file-level annotation declaring the file to be legacy code -- an annotation that could gradually be pushed down to the function level. As a side benefit, this could also be used to phase out and deprecate weird corners of C++ syntax, similar to the Rust edition system: anyone using, for example, 0 literals to mean nullptr would have to declare some sort of legacy annotation in their file or in their build system, as sketched below.
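As a hypothetical sketch of what that could look like -- the annotation spelling below is my own invention for illustration, not anything proposed in P2687R0 or elsewhere -- today's C++ happily accepts a 0 literal as a null pointer, and an opt-out scheme would gate that behind an explicit legacy marker:

```cpp
int main() {
    // Accepted in every current C++ standard: a 0 literal as a null pointer.
    int* old_style = 0;

    // Under an opt-out scheme, unannotated code would be held to the
    // safe, checked subset, and the line above would compile only with
    // something like a (hypothetical, invented-for-illustration) marker
    // on the file or function, e.g.:
    //
    //   [[legacy("pre-C++ZZ null pointer literals")]]
    //
    // The modern spelling, which would remain fine by default:
    int* new_style = nullptr;

    return (old_style == new_style) ? 0 : 1;
}
```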

Only with this sort of opt-out memory-safety system would I consider C++ a memory safe programming language. I'd be very happy to see a memory-safe C++, and I earnestly hope Dr. Stroustrup is successful in his endeavors. I'm not holding my breath, though, and in the meantime I will continue to use other, already memory-safe programming languages for my new projects, as will the majority of programmers.

Meanwhile, it is unfair for Dr. Stroustrup to call safe programming languages novelties or to pretend that C++ isn't already far behind the times on this. This was already an important criticism of C++ decades ago, when Java first came out in the 90's and was marketed as a "managed" programming language; it was discussed in detail in my classes when I was a college student in the late aughts. To read Dr. Stroustrup's writing, C++ is being criticized by "novel" upstarts just as it is well on its way to getting the feature -- but in actuality, the time to act was 1996.