Rust is a polarizing programming language because of how radical it is. It has gone further than any other mainstream language in adopting features from functional programming, while ignoring long-held design principles from the realm of object-oriented programming. Its fans can be very enthusiastic, sometimes off-puttingly so: the stereotype is of a Rustacean demanding that all software be rewritten in Rust, even when completely unfeasible. That stereotype is mostly untrue, but its existence, and its occasional true examples, show the intensity of the debate. Much of the criticism of Rust comes specifically from C++ programmers, and correspondingly, much of Rustaceans' criticism of other programming languages is directed specifically at C++ (including, of course, in this book). Even the creator of C++, while not mentioning Rust by name, entered the fray (and along with other Rustaceans, I responded; my response is included in the appendices).
There's a good reason for this particular rivalry. While usable in other domains, Rust is strongest where C++ has hitherto been unopposed: as a high-level systems programming language. Many of Rust's greatest strengths are directly based on ideas that originated in C++, such as RAII. And Rust has, in many ways, the same goals that C++ has.
Specifically, in this book I shall argue that Rust has the exact same overall goal that C++ does, albeit with a different interpretation of how that goal is best accomplished. I will further argue that Rust does a better job of accomplishing these goals. Thus, the thesis of this book is a slightly longer version of the title:
Rust is a better C++ than C++, as it is better at C++'s own goals.
Zero-Overhead Abstractions
C++ has an explicit goal of providing zero-cost abstractions.
This is a bit of a confusing term of art and has the potential to be misleading, but it comes attached with explanations that clarify it some. It is also referred to as the "zero-overhead principle," which Dr. Bjarne Stroustrup, father of C++, describes (see pg. 4) as containing two components:
- What you don’t use, you don’t pay for (and Dr. Stroustrup means "paying" in the sense of performance costs, e.g. in higher latency, slower throughput, or higher memory usage)
- What you do use, you couldn’t hand code any better
There is also an executive summary of the concept at CppReference.com.
A clearer term that is occasionally used in the trenches in the C++ community is "zero-overhead abstraction" -- there is zero overhead, defined as cost in addition to what a reasonably-well hand-coded implementation would do. Using this term, a third principle becomes clearer, which was hidden all along, unstated among those other two principles, and against which those other two principles are balanced. The word "abstraction" is the key, and the third principle is:
- You can still get the abstractive and expressive power you expect from a modern programming language.
This third principle is necessary to distinguish higher-level "zero cost" languages like C++ and Rust from lower-level languages like C.
To fully explain why I include this third principle, and to delve into the history of the concept in general, I want to talk more about C.
C: The Portable Assembly
C has often been described as a "portable assembly language." Unlike other high level programming languages before it ("high level" at the time meaning anything higher level than raw assembly language), it exposed users directly to gnarly machine-language abstractions like pointers, and to common assembly-language capabilities like shifting and bitwise operators.
The goal was to give the programmer something minimally distinct from assembly language, where the programmer had almost as much control over the computer as an assembly language programmer, without sacrificing portability. Few higher-level features have been added, even now: there is no built-in string type, and only a limited array type that exposes the underlying concept of pointers the instant you poke at it. Structures are little more than a way of calculating offsets, and memory management is done by explicitly invoking memory management routines.
C's preference, in general, was to only add onto assembly those features absolutely necessary for portability, and not to impose any other structure on the programmer -- or, said another way, not to provide any other structure to the programmer.
This was far from an iron-clad rule, and there are definitely exceptions: C builds null-terminated strings (also known as "C strings") into the programming language itself, preferring them to arrangements that store an explicit length -- a substantial constraint on the programmer beyond assembly language, and probably a mistake overall.
More deeply, and probably less avoidably at the time, C assumes a traditional call structure. Many techniques that can be used to implement closures, co-routines, or other more radical alternatives to a call stack are difficult or impossible with standard C -- while generally being possible in any assembly language.
But, with these exceptions, C generally does tend to provide only one overarching abstraction -- portability -- and when it provides abstractions at all, it has the same zero-cost goals that C++ has: to make the user pay only for the abstractions they actually use, and to provide abstractions as efficiently as the equivalent hand-coded assembly.
Put another way, C++'s zero-overhead principle, as Dr. Stroustrup defines it, is more or less inherited from C. Where C++ differs from C is in the "abstraction" part of providing "zero-cost abstractions." Everything you can do in C++ you can do in (potentially tedious, repetitive, and error-prone) C, but C++ provides more abstractions, beyond just what is necessary for portability.
And C does a great job of portability! But missing from that goal is anything about a generally usable set of abstractions. Some abstraction over assembly language is necessary to ensure portability, but C doesn't really go beyond that. It provides a standard library, but again, that's for portability purposes. The C standard library is not much more powerful than an assembly-language operating system API, just more standardized. C is portable assembly, not abstracted assembly.
C++: A More Abstracted C
C++ goes beyond that. C++ tries to be competitive. Before, we dissected "zero-overhead abstractions" into three goals:
- What you don’t use, you don’t pay for
- What you do use, you couldn’t hand code any better
- We give you the power of abstraction expected for a programming language of the day
But really, they are one goal. The essential goal of C++ over C is this:
- Create a competitive set of abstractions that cost no more than manual implementation in C or assembly, whether in terms of speed or memory or any other resource.
This includes the entire principle of zero-overhead abstraction. I include the word "competitive" to describe how C++ decides which abstractions to offer, and how C does not: in competition with other programming languages. That directly implies our third point above, about the abstractions expected from a modern programming language.
This general single goal also implies the second point, that what you don't use you shouldn't pay for. If a feature is not used, a C programmer would simply not implement it and pay no cost for it. So if the C++ programmer pays any cost for it at all when they don't use it, that cost is all overhead.
Now we have C++ down to one coherent, easily-stated goal. And once we understand this, everything else about C++ makes sense.
C++ was originally christened "C with Classes," and it tried to add Object-Oriented Programming to C. All the mechanisms of OOP could be portably added to C directly by an application or library developer with judicious use of function pointers and structure nesting (and glib is a famous example of a library that does exactly that), but C++ built this abstraction into the programming language itself.
Objective-C also did this (and according to Wikipedia it "first appeared" one year sooner, in 1984), but Objective-C has always felt like two programming languages glued together. In Objective-C, the object-oriented features do not inherit the zero-overhead principle from C -- nor do they look like C at all. They look instead like a Smalltalk dialect, where switching between C and this odd Smalltalk dialect was permitted on an expression-by-expression basis using an odd mix of square brackets and @-signs.
In C++, the added abstractions, including OOP, take on more of a resemblance to C, and importantly, continue to try to retain C's advantages in systems programming by making the new features zero-overhead.
During much of the history of C++, OOP was considered to be the most important abstraction that a programming language could offer. But once it was added, it expanded the scope of C++ abstractions. Nowadays, C++ is considered multi-paradigm, and provides not just OOP, but a wide array of abstractions.
The only features C++ rejects out of hand are those that do not jibe with zero-cost abstraction. This is why garbage collection is not offered in C++ (though it is still possible to implement manually): it cannot be offered in a zero-cost way. However, C++'s alternative to garbage collection, namely RAII, has only become more effective as new features like move semantics and std::unique_ptr were added, to the extent that in modern C++ it would be unimaginable not to have those features; they have become essential to C++'s memory management model.
This also explains why C++ keeps accruing new features -- it must maintain a competitive set of abstractions -- whereas C merely maintains the features it has, because the only thing it needs abstractions for is portability. It explains why C++ had to add templates, as a zero-cost alternative to OOP and a zero-cost way of implementing collections. It explains why C++ had to add move semantics: without them, RAII is a worse abstraction than GC.
These are great features that C++ has had to develop to achieve these goals.
RAII is a genius solution to the problem that garbage collection is not zero-overhead. With RAII, the source code looks simpler than in C, and is harder to get wrong. It's not quite as straightforward as in a garbage-collected language, but it has many of the benefits of abstraction: you can't just forget to call the destructor.
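The effect can be sketched in a few lines of Rust, where the same scope-based destructor mechanism is spelled `Drop` (the `Logger` type here is purely illustrative):

```rust
// A minimal RAII sketch: destructors run automatically at scope exit,
// in reverse declaration order, with no explicit `free` to forget.
struct Logger(&'static str);

impl Drop for Logger {
    fn drop(&mut self) {
        // Runs automatically when the value goes out of scope,
        // like a C++ destructor.
        println!("dropping {}", self.0);
    }
}

fn main() {
    let _outer = Logger("outer");
    {
        let _inner = Logger("inner");
        // `_inner` is dropped here, at the end of its scope...
    }
    // ...while `_outer` is dropped when `main` returns.
    let v = vec![1, 2, 3]; // heap allocation, freed the same way
    println!("sum = {}", v.iter().sum::<i32>());
}
```

The compiler resolves all of this statically: the calls to `drop` land at fixed points in the binary, just as hand-placed `free` calls would in C.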
But! From the compiled binary, modulo such nit-picky accidents as symbol names, you wouldn't be able to tell it was written in C++ instead of C! The abstraction was resolved and disappeared at compile-time.
Similarly with templates, and therefore with the collections in the STL. Similarly with many other C++ features.
And Rust has greatly benefited from all of this innovation in C++. C++ is one of the giants on whose shoulders Rust stands. Rust inherits RAII from C++. Rust inherits the core idea of templates as well, though it puts some constraints on them and calls their compilation strategy "monomorphization."
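A minimal sketch of monomorphization (the `largest` function is an invented example): as with a C++ template, the compiler stamps out one fully concrete copy of the generic function per type it is instantiated with, so each copy is as specialized as hand-written code.

```rust
// Generic over any ordered, copyable element type. The compiler emits
// a separate machine-code copy for each concrete `T` used below.
fn largest<T: PartialOrd + Copy>(items: &[T]) -> T {
    let mut max = items[0]; // panics on an empty slice; fine for a sketch
    for &item in &items[1..] {
        if item > max {
            max = item;
        }
    }
    max
}

fn main() {
    // Two instantiations: largest::<i32> and largest::<f64>.
    println!("{}", largest(&[3, 7, 2]));
    println!("{}", largest(&[1.5, 0.2, 9.8]));
}
```

The trait bounds (`PartialOrd + Copy`) are the "constraints" mentioned above: unlike C++ templates, a generic Rust function must declare up front what it requires of `T`.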
And Rust needs these, because Rust is also striving to be the kind of programming language where the compiled binary looks like something someone could have written in C, but where the programmer actually had a much easier task. And I will argue that Rust does a better job.
Wrinkles: OOP and Safety
There are a few wrinkles in this, though: a few features in each programming language that seem to detract from this goal, and to undermine the idea that it is truly a focus at all. On the C++ side, we have OOP and virtual methods, which are often less performant than the equivalent hand-written C code would be. On the Rust side, we have safety: array indexing, by default, panics when the index is out of bounds. What is "safety," and can Rust really be said to be interested in minimizing the cost of abstractions if it's also trying to achieve safety?
I know these seem like wildly unrelated issues, but they're actually connected. Both C++ and Rust are trying to have their cake and eat it too. They're trying to provide all of the conveniences of a modern high-level programming language, while outputting binaries equivalent to those a C programmer would make.
But safety and OOP actually do have non-zero overhead. OOP has virtual functions, preventing inlining and other optimizations and requiring indirect function calls. Safety, for its part, requires bounds checking, an obvious non-zero overhead. And in Rust, it constrains heap usage to certain layouts that are often less efficient than the ideal layout would be.
So why do C++ and Rust make this decision? And how do they justify it?
For most of C++'s history, OOP was an essential convenience of a high-level programming language. Everyone's buzzwordy design patterns were conceptualized in OOP. Without OOP, a programming language could not at all be taken seriously at the time, as it was widely believed that scalable software architecture and intuitive reasoning about code required OOP.
And throughout much of this time as well, C programmers were writing similar code by hand, so it actually was like code a C programmer would write! It was very popular to write structs of function pointers, or other complex mechanisms to allow "object-oriented design" in C programs. This is the entire premise of GObject, now part of glib and used by gtk.
Nevertheless, OOP has run-time costs. Run-time polymorphism (known in C++ as "virtual functions") is one of the pillars of OOP, the basis of software decision-making, and a powerful abstraction. In most OOP programming languages, this meant (and still means) there is a (non-zero cost) run-time decision made at every method call.
C++ maintains its "zero-overhead abstraction" principle here on a technicality: by making the feature optional. Rather than omitting it, C++ makes you pay for it only if you use it, staying low-level by making virtual an opt-in keyword rather than the default. As we have said, not having it at all would have utterly disqualified C++ as an application programming language; it was considered necessary for usable abstractions. After all, in programs where C++ programmers get to just write virtual, C programmers were implementing all of OOP, including run-time polymorphism, by hand.
Rust, however, starts pursuing this goal from a much more recent point in time. Now, OOP is no longer a sine qua non of programming languages -- in fact, it's become old-fashioned. Rust instead uses features from the functional programming paradigm to manage abstractions and give users a chance at wrangling a large codebase. In doing so, Rust has largely moved beyond the need for OOP.
However, just like OOP was considered essential to be taken seriously when C++ was in its heyday, nowadays, new programming languages are expected to be memory safe. Memory safety is no longer a feature for "novel" programming languages to experiment with, as Dr. Stroustrup, father of C++, unfortunately still thinks of it.
Memory safety comes with performance costs. Until Rust came along, it was widely assumed to require garbage collection: an unacceptable cost, one that is impossible to opt out of while still using a heap, and one that is generally paid not as-used but on a per-program basis. Rust, however, managed to extend C++'s RAII and move semantics with the borrow checker and lifetimes (derived from Cyclone's "regions"), making manual memory management with RAII safe.
There are still performance costs, but they are paid on an as-needed basis, and opting out is still available. Unlike a Java or a Haskell, the goal isn't so much to be a memory-safe programming language as to have a memory-safe subset and to encourage memory-safe abstractions around unsafe code. Just as C++ is distinct from other OOP languages in making virtual functions optional, Rust is distinct in having unsafe and raw pointers at all. But if there were no unsafe keyword to guard these features and protect programmers from using them by accident, Rust would have been disqualified, in 2015, from being an application programming language at all.
Protections for safety are now considered necessary for usable abstractions. I would say that C++ gets away with it because of its venerable, established position, but in fact it does not get away with it. No one uses C++ for new application-level programming, but rather only for systems programming where the performance is absolutely necessary.
This is because of safety. Many C++ programmers, including Dr. Stroustrup himself, don't realize that the rest of the world has already moved on to safe programming languages and it's no longer considered a novelty. Often, they think systems programming is the world rather than just a niche. But make no mistake, now that Rust has brought safe programming to this niche, it is only a matter of time.
But Rust still satisfies the zero-overhead principle, even though there is overhead: the overhead only applies when the feature is being used. If there is a line of code where the overhead is unacceptable, the solution is simple: use the unsafe keyword, and call a non-bounds-checked method (or use an alternative data structure with raw pointers). It is advised that you wrap this unsafe code in a safe abstraction, but that is just advice. In the end, Rust only makes you pay for the abstractions you're actually using, just like C++.
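A small sketch of that opt-out, using the real slice methods `[i]` (bounds-checked) and `get_unchecked` (not); the summing functions themselves are invented examples:

```rust
// Safe by default: each index is bounds-checked and panics if out of range.
fn sum_checked(data: &[u64]) -> u64 {
    let mut total = 0;
    for i in 0..data.len() {
        total += data[i]; // bounds check on every access
    }
    total
}

// Opting out on a hot path: `get_unchecked` skips the bounds check,
// and the `unsafe` block marks exactly where the programmer takes
// responsibility for the invariant.
fn sum_unchecked(data: &[u64]) -> u64 {
    let mut total = 0;
    for i in 0..data.len() {
        // SAFETY: `i < data.len()` is guaranteed by the loop bound.
        total += unsafe { *data.get_unchecked(i) };
    }
    total
}

fn main() {
    let data = vec![1, 2, 3, 4];
    assert_eq!(sum_checked(&data), sum_unchecked(&data));
    println!("{}", sum_checked(&data));
}
```

(In practice, iterating with `data.iter()` lets the compiler elide the checks entirely, so the unsafe version is rarely needed; the point is that the escape hatch exists and is explicit.)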
Should I use C++ or Rust?
So C++ and Rust both share the same essential goals:
- The C goal: Be portable while exposing the full power of assembly language.
- The C++ goal, which implies the C goal: Have modern, high-level programming language features while still outputting code as good as what an assembly language programmer would write.
So if these goals appeal to you as a software developer, which programming language should you use, Rust or C++?
In my mind, that depends then either on the non-essential goals, or else just the accidents of history and ecosystem.
If you already have a large codebase in C++, for example, that might weigh against switching. We can attribute this to C++'s goal of being compatible with previous versions of C++, a goal C++ has paid much for. Similarly, if there's a C++ library that turns out to be the perfect match, that may make your decision for you.
But to be clear, it's similar in the other direction if you already have a large Rust codebase -- there are just fewer people in that position. This will probably change over time, though. I think Rust's ecosystem is already competitive with C++'s.
I think discussing the goals is more interesting, however, especially in the long term.
If you need object-oriented programming, which is another goal of C++, then C++ might be your thing. I generally think object-oriented programming is overrated and Rust's way of handling abstraction to be both more powerful and less prone to problems, but many people disagree with me. And I must admit: the big use case everyone always mentions for OOP is GUI programming, and Rust's ecosystem is particularly behind in the GUI space.
However, if you're worried about memory corruption and the related security vulnerabilities, it might be nice to have a guarantee that only certain lines of code can cause such problems. It might be nice to have all those lines marked with a special keyword and conventionally scrutinized and abstracted in such a way as to help prevent these conditions.
And Rust's safety advantages go beyond simply delineating which features are safe and which are unsafe. Rust is able to accomplish much more in its safe subset without performance degradation than the average C++ programmer might guess, because it has a more sophisticated type system.
For a systems programming language, I think memory safety is more important than object-oriented programming and better GUI frameworks (for now). A long time ago, GUI apps used to be written in C or C++ to run directly on the user's computer. Nowadays, they are more likely to be deployed over the web and written in JavaScript, and even the apps that do run directly on the user's computer tend to be written in other, less systems-oriented programming languages. If you're writing a GUI app, the choice for the user-facing part of the app isn't between Rust and C++; it's between Rust, C++, C#, Java, JavaScript, and many others. Neither Rust nor C++ stands much of a chance in the GUI space long-term.
And over time, I suspect we'll find that OOP isn't as necessary to GUI frameworks as we had thought. My personal favorite GUI framework is in Haskell and doesn't use OOP at all. Once that happens, I think OOP will simply be another legacy feature, as Rust's trait mechanism is superior to OOP in non-GUI contexts.
Memory safety, on the other hand, is key for systems programs. Servers, OS kernels, real-time systems, automobile controllers, cryptocurrency wallets -- the domains where systems programming tends to be used are also domains in which security vulnerabilities are absolutely unacceptable. The fact that C++ doesn't have a safe subset, and makes it so difficult to reason about undefined behavior compared to idiomatic Rust, is a serious problem.
But even if Rust didn't have a specific memory-safety advantage over C++, it would still have quite a few things going for it. It avoids header files and all the concomitant confusion. It gets rid of null pointers in most contexts -- called "the billion dollar mistake" by the inventor of null pointers himself. It tidies up compile-time polymorphism and brings it in line with run-time polymorphism. It has actual destructive moves.
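Two of those points fit in a few lines of Rust (`find_user` is a hypothetical lookup, standing in for any query that can come back empty):

```rust
// Instead of null pointers, Rust uses `Option`, and the compiler makes
// you handle the `None` case before you can touch the value.
fn find_user(id: u32) -> Option<String> {
    // Hypothetical lookup; any fallible query would look the same.
    if id == 1 { Some(String::from("alice")) } else { None }
}

fn main() {
    match find_user(1) {
        Some(name) => println!("found {name}"),
        None => println!("no such user"),
    }

    // Destructive move: after the move, `s` is statically dead, so a
    // use-after-move is a compile error rather than a run-time bug.
    let s = String::from("hello");
    let t = s;
    // println!("{s}"); // error[E0382]: borrow of moved value: `s`
    println!("{t}");
}
```

In C++, the moved-from `s` would still be alive in a "valid but unspecified state," and nothing stops you from using it; in Rust the compiler simply forbids it.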
In general, Rust takes advantage of the fact that it doesn't have to also be C and also be old versions of C++, and uses it to create a much cleaner experience.
Rust Deficits
Rust has a few other specific downsides compared to C++.
Interfacing with C is an important goal for reasons besides backwards-compatibility. On many platforms, C serves as a lowest-common-denominator programming language, and its ABI serves as an inter-language protocol. C++ does provide smoother interfacing with this protocol than Rust does.
Relatedly, C++ generally has a relatively stable ABI on a given platform for a given compiler vendor. This allows dynamic libraries to be used as plugins with minimal glue code, something that in Rust normally requires awkwardly working through a C ABI interface. Personally, I think machine-language plugins in the form of dynamically loaded libraries are mostly a relic of past software distribution models, and I haven't seen many situations where they make sense, but I can think of a few edge cases.
In both of these cases, Rust is clumsier, but not completely incapable. Rust still can speak the protocol that is the C ABI, just not as natively and smoothly-integrated as C++.
Other downsides of Rust have to do with network effects and Rust adoption. There is only one Rust compiler, while there are multiple C++ compilers that work together through a standards process. GCC is currently in the process of gaining Rust support, and we'll see how well that works out for Rust.
Similarly, many libraries exist for C++ that don't yet exist in Rust or have Rust bindings. Though that's true of any pair of programming languages, it is a concrete reason some developers might still want to write new projects in C++ rather than Rust.
Finally, while I still think Rust would be a better programming language than C++ even if unsafe code were allowed everywhere, I think Rust could do more to make its rules clearer in the unsafe realm. The fact that the latest research on Rust's memory models seems so deeply difficult to square with how async code often works, as in this bug report, makes me nervous.
I'm sure there are other ways in which Rust is behind C++, and the devil is as always in the details.
This Book
But enough about the exceptions! Every thesis worth writing a book about has caveats! Let's get back to the thesis of this book.
The groundwork for the thesis of this book is laid out above in this introduction. It is as follows:
Rust accomplishes the essential goals of C++ and keeps the good ideas, while eliminating the cruft -- a break that has far-reaching benefits over remaining compatible with C++.
This book is all about the details and specific examples of this thesis. It covers a number of topics, from the details of how Rust expands RAII to make it safe, to how it makes move semantics less confusing, all the way to clarifications of what safety means in practice.
It originated as a collection of blog posts on my blog, The Coded Message, and it's an on-going project. New sections will continue to also be posted as blog posts, though revisions to the existing sections will take place sporadically with little fanfare. Please make issues and MRs on the git repo if you see any mistakes, or would otherwise like to contribute thoughts, criticisms, or additional content.
This book is more a persuasive document than an instructional document. It's a work of apologetics, explaining in detail, topic by topic, why Rust is good at these goals it shares with C++, and why a new programming language was necessary to achieve them more effectively.
Like most books of apologetics, it's nominally aimed at the skeptics, in this case the C++ developers who don't like Rust. But only nominally. It will be far more interesting for the seekers and the proselytes: Those who are interested in looking into Rust, or who have started using Rust, but aren't fully sure of its benefits over C++, or whether it can be truly used in as many ways as C++ can, or whether it can truly be as high-performance.
We've known for some time that C++ was hampered by its C legacy. Today, modern C++ is also hampered by the legacy of pre-modern C++.
Bjarne Stroustrup once famously said:
Within C++, there is a much smaller and cleaner language struggling to get out.
I'm sure Bjarne Stroustrup has already said that this quote was not about Rust, just as a long time ago he said that it wasn't about C# or Java. But I think it resonated with so many people because we all know how much cruft and complexity has accrued in C++. And the fact that the quote resonated so much I think says more about C++ than whatever Bjarne's original intentions were with these specific words.
And so, even though Bjarne explicitly said otherwise, I think Java and C# were efforts to extract this smaller and cleaner language -- one that had C++-style object-oriented programming without the other bits that they considered the "cruft." And for a substantial slice of C++ programmers, who didn't need control of memory and could afford garbage collection, this is exactly what the doctor ordered: many use cases filled by C# and Java today would previously have been filled by C++.
Remember, C++ used to be a general-purpose programming language. Now, it's a niche systems programming language. It has been edged out of other niches in large part by "C-like programming languages" that take what they like from it and leave the rest, like Java and C#.
And I think Rust is finishing the job. It is a similar effort, but with a different opinion about which bits constitute the "cruft." Zero-cost abstraction remains a goal, but C compatibility does not. One of the goals of this book is to convince you that this is the right decision for a systems programming language.
Which brings me to this book's secondary thesis:
With a few notable exceptions, between Rust and C++, Rust is the better language to start a new project in. If you were going to write something new in C++, I think you should almost always use Rust instead, or else another programming language if it's outside of Rust/C++'s core niche.
This is a corollary of this book's primary thesis: if Rust has the same goals as C++, and accomplishes them just as well with fewer downsides and fewer costs, why would you use C++? In my opinion, the exceptions are already very limited, and will only decrease with time, until C++ is only appropriate for -- and ultimately only used for -- legacy projects.
Again, C++ is one of the giants on whose shoulders Rust stands. Without C++, Rust would've been impossible, just like Java and C# would've been impossible. But sometimes a clean, breaking re-design is required, and Rust provides exactly that.
RAII: GC without GC
I don't want you to think of me as a hater of C++. In spite of the fact that this book itself is a comparison between Rust and C++ in Rust's favor, I am very aware that Rust as it exists would never have been possible without C++. Like all new technology and science, Rust stands on the shoulders of giants, and many of those giants contributed to C++.
And this makes sense if you think about it. Rust and C++ have very similar goals. The C++ community has done a lot over all these years to pioneer new programming language features in line with those goals. C++ has then given these features years to mature in its humongous ecosystem. And because Rust also doesn't have to be compatible with C++, it can then steal those features without some of the caveats they come with in C++.
One of the biggest such features -- perhaps the biggest one -- is RAII, C++'s and now Rust's (somewhat oddly-named) scope-based feature for resource management. And while RAII is for managing all kinds of resources, its biggest use case is as part of a compile-time alternative to run-time garbage collection and reference counting.
As an alternative to garbage collection, RAII has deficits. While many allocations are created and freed neatly in line with variables coming in and out of scope, sometimes that's not possible. To fully compete with garbage collection and capture the diverse ways programs use the heap, RAII needs to be combined with other features.
And C++ has done a lot of this. C++ added move semantics in C++11, which Rust also has -- though cleaner in Rust because Rust was designed with them from the start and so it can pull off destructive moves. C++ also has opt-in reference counting, which, again, Rust also has.
But C++ still doesn't have lifetimes (Rust got that from Cyclone, which called them "regions"), nor the infamous borrow checker that goes along with them in Rust. And even though the borrow checker is perhaps the most hated part of Rust, in this post, I will argue that it brings Rust's RAII-centric compile-time memory management system much closer to feature-parity with run-time reference counting and other run-time garbage-collection technologies.
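As a tiny illustration of what the borrow checker adds on top of RAII (variable names are arbitrary):

```rust
// RAII frees `v` at the end of its scope. The borrow checker's job is
// to reject any reference that would outlive `v` -- the use-after-free
// class of bug that RAII alone cannot catch.
fn main() {
    let first;
    {
        let v = vec![10, 20, 30];
        first = v[0]; // copying the value out is fine
        // A reference could NOT escape this scope:
        // let r = &v[0]; first_ref = r;
        // error[E0597]: `v` does not live long enough
    } // `v` is dropped here, RAII-style
    println!("{first}");
}
```

A C++ compiler would happily accept the reference-escaping version, producing a dangling pointer; in Rust it is a compile error, which is what makes RAII-based management competitive with a run-time garbage collector on safety.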
I will start by talking about the problem that RAII was originally designed to solve. Then, I will re-hash the basics of how RAII works, and work through memory usage patterns where RAII needs to be combined with these other features, especially the borrow checker. Finally, I will discuss the downsides of these memory management techniques, especially performance implications and handling of cyclic data structures.
But before I get into the weeds, I have some important caveats:
Caveat: No Turing-complete programming language can completely prevent memory leaks. Even in fully-GC'd languages, you can still leak memory by filling up a data structure with increasing amounts of unnecessary data. This can be done by accident, especially when sophisticated callback systems are combined with closures. This is out of the scope of this post, which only concerns memory management issues that automated GC can actually help with.
Caveat #2: Rust allows you to leak memory on purpose, even when a garbage collector would have reclaimed it. In extreme circumstances, the reference counting system can be abused to leak memory as well. This fact has been used in anti-Rust rhetoric to imply its memory safety system is somehow worthless.
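For the curious, the classic way to leak under reference counting is a cycle. A minimal sketch (the `Node` type is invented for illustration):

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Two nodes that point at each other: each keeps the other's strong
// count above zero, so neither destructor ever runs -- a safe leak.
struct Node {
    other: RefCell<Option<Rc<Node>>>,
}

fn main() {
    let a = Rc::new(Node { other: RefCell::new(None) });
    let b = Rc::new(Node { other: RefCell::new(None) });
    *a.other.borrow_mut() = Some(Rc::clone(&b));
    *b.other.borrow_mut() = Some(Rc::clone(&a));
    // Each node now has a strong count of 2; dropping `a` and `b` only
    // brings each count to 1, so the pair is never reclaimed.
    println!("a: {}, b: {}", Rc::strong_count(&a), Rc::strong_count(&b));
}
```

The standard remedy is to make one direction of the cycle a `std::rc::Weak`, which does not keep its target alive; a tracing garbage collector would have collected the cycle automatically.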
For the purposes of this post, we assume a programmer who is trying to get actual work done and needs help not leaking memory or causing memory corruption, not an adversarial programmer trying to make the system leak on purpose.
Caveat #3: RAII is a terrible name. OBRM (Ownership-Based Resource Management) is sometimes used in Rust, and is a much better name. I call it RAII in this chapter though, because that's what most people call it, even in Rust.
The Problem: Manual Memory Management is Hard, GC is "Slow"
So. C-style manual memory management -- "just call free when you're done with the allocation" -- is error prone.
It is error prone even when it is easy and tedious, because programmers can make stupid mistakes and just forget to write free, and nothing is immediately broken. It is error prone when multiple programmers work together, because they might make different assumptions about who is supposed to free something. It is error prone when multiple parts of the code need to use the same data, especially when that usage changes with new requirements and new features.
And the consequences of doing it wrong are not just memory leaks. Use-after-free can lead to memory corruption, and bugs in one part of the program can abruptly show up when allocation patterns change somewhere else entirely.
This is a problem that can be solved with discipline, but like many tedious clerical disciplines, it can also be solved by computer.
It can be solved at run-time, which is what garbage collection and reference counting do. These systems do two things:
- They keep allocations from lasting too long. When memory becomes unreachable, it can be reclaimed. This prevents memory leaks.
- They keep allocations from being freed early. If memory is still reachable, it will still be valid. This prevents memory corruption.
For most programmers and applications, this is good enough. So for almost all modern programming languages, this run-time cost is well worth not troubling the programmer with the error-prone, tedious tasks of C-style manual memory management, enabling memory safety and resource efficiency at the same time.
Caveat: To be clear, "slow" here is an oversimplification, and I address that more later. I mean it as a tongue-in-cheek way of saying that it has performance costs, whereas Rust and C++ try to adhere to a zero-cost principle.
GC (including RC) Has Costs
But there are costs to having the computer do memory management at run-time.
I lump mark-sweep garbage collection and reference counting together here; both have costs above C-style manual memory management that make them unacceptable according to the zero-cost principle. GC comes with pauses, and additional threads, in the best case. RC comes with myriad increments and decrements to a reference count. These costs might be small enough to be okay for your application -- and that's well and good -- but they are costs, and therefore these techniques can't be the main memory management model in C++ or Rust.
This is a complicated issue, and so before continuing, here comes another caveat:
Caveat: GC is not necessarily slower, but it does have performance implications that are often unacceptable in situations where C++ (or Rust) is used. To achieve its full performance, it needs to be enabled for the entire heap, and that has costs associated with it. For these reasons, C++ and Rust do not use GC. The details of these performance trade-offs are beyond the scope of this chapter.
A Dilemma
But C++ and Rust are not most programming languages. They face a dilemma:
- On the one hand, manual memory management is unacceptably error prone for a high level language, a detail the computer should be able to handle for you.
- On the other hand, run-time garbage collection violates a fundamental goal that C++ and Rust share: the zero-cost principle. Code written in these languages is supposed to be as performant as the equivalent manually-written C. To conform to that principle, reference counting (or GC) has to be opt-in (because, after all, sometimes manually-written C code does use these technologies).
So, for the vast majority of situations, where a C programmer wouldn't use reference counting (or mark-sweep), Rust and C++ need something more sophisticated. They need tools to prevent memory management mistakes -- that is, to at least partially automate this tedious and error-prone task -- without sacrificing any run-time performance.
And this is the reason C++ invented (and Rust appropriated) RAII. Instead of addressing the problem at run-time, RAII automates memory management at compile-time. Analogous to how templates and trait monomorphization can bring some but not all of the power of polymorphism without many of the run-time costs, RAII brings some but not all of the power of garbage collection without constant reference count updates or GC pauses.
But as we will see, RAII as C++ implements it only solves one of the two problems addressed by garbage collection: leaks. It cannot address memory corruption; it cannot keep allocations alive long enough for all the code that could possibly need to use them.
Raw RAII: How RAII Works on its Own
The simplest use case for RAII is underwhelming: it automatically inserts calls to free up heap allocations at the end of the block where we made the allocation. It replaces a malloc/free sandwich from C with simply the allocation side, by inserting an implicit (and unwritten) call to a destructor, which in its simplest version is an equivalent of free. And if that was all RAII did, it wouldn't be that interesting.
For example, take this C-style (no RAII) code:
void print_int_little_endian_decimal(int foo) {
    // Little endian decimal print of `foo`
    // i.e. backwards from how we normally write decimal numbers
    // e.g. 831 prints out as "138"
    // Big endian would be too hard
    // Little endian is, as always, actually simpler platonically,
    // if somehow not for humans.
    // Yes, this only works for positive ints. It's an example.
    char *buffer = malloc(11);
    for (char *it = buffer; it < buffer + 10; ++it) {
        *it = '0' + foo % 10;
        foo /= 10;
        if (foo == 0) {
            it[1] = '\0';
            break;
        }
    }
    puts(buffer); // put-string, not the 3sg verb form "puts"
    free(buffer); // Don't forget to do this!
}
Just using RAII (and unique_ptrs, which are an essential part of the RAII model), but using no other features of C++, we get this very unidiomatic and unimpressive version:
void print_int_little_endian_decimal(int foo) {
    std::unique_ptr<char[]> buffer{new char[11]};
    for (char *it = &buffer[0]; it < &buffer[10]; ++it) {
        *it = '0' + foo % 10;
        foo /= 10;
        if (foo == 0) {
            it[1] = '\0';
            break;
        }
    }
    puts(&buffer[0]);
}
It doesn't help us with our random guess of an appropriate buffer size, our awkward redundant attempts to avoid a buffer-overflow, or with any abstraction over the fact that we're trying to implement a collection.
In fact, it makes the code more awkward, for a benefit that hardly seems worth it: automatically calling free at the end of the block -- which might not even be where we want to call free! We could instead have wanted to return the data to the caller, or to insert it into a bigger, greater data structure, or similar.
It's a bit less ugly when you use C++'s abstractions. Destructors don't have to just call free (or rather its C++ analogue, delete) as unique_ptr's does. Any C programmer can tell you that idiomatic C code is rife with custom free functions that free all of the allocations of a data structure, and C++ (and Rust) will choose which destructor to call for you based on the type of the data. Calling free when a custom destructor must be called is a common careless mistake in C, especially among beginners, and (hot take!) making programming languages less needlessly tricky for beginners is a good thing for everybody.
We can combine RAII with other features of C++ to get this more idiomatic code, with the first do-while loop I've written in years:
void print_int_little_endian_decimal(int foo) {
    std::string res;
    do {
        res += '0' + foo % 10;
        foo /= 10;
    } while (foo != 0);
    std::cout << res << std::endl;
}
Does std::string allocate memory on the heap? Maybe it only does so if the string goes above a certain size. But the custom destructor, ~std::string, will call delete[] only when an allocation was actually made, abstracting that question away, along with handling terminating nuls and avoiding overruns in a cleaner way.
This ability of RAII -- to call custom destructors that abstract away allocation decisions -- gets more impressive when we consider that many data structures don't make just 0 or 1 heap allocations, but whole complicated trees of complicated heap allocations. In many cases, C++ (and Rust) will write your destructors for you, even for complicated types like this:
struct PersonRecord {
    std::string name;
    uint64_t salary;
};

std::unordered_map<std::string, std::vector<PersonRecord>> thing;
To destroy thing in C, you'd have to loop through the hash map, free all the keys, and then free all the values, which requires freeing all the strings in each PersonRecord before freeing the backing of each vector. Only then could you free the actual allocations backing the hash map.
And perhaps a C-based hash map library could do this for you, but only by assuming that the keys are strings, and then taking a function pointer to know how to free the values, which would ironically be a form of dynamic polymorphism and therefore a performance hit. And the function to free the values would still have to manually free the string, knowing which field of the PersonRecord was a pointer and duplicating that information between the structure and the manually-written "free" function -- and it still likely wouldn't support the small-string optimization that C++ enables.
In C++, this freeing code is all automatically generated. PersonRecord gets an automatic destructor that calls the destructor of each field (uint64_t's destructor is trivial), and the destructors of std::unordered_map and std::vector are templated so that, at compile time, a fresh destructor is built from those templates that handles all of this, without any indirect function calls or run-time cost beyond what would be manually written for exactly this data structure in C.
See, with RAII, a destructor isn't just automatically and implicitly called at the end of a scope in a function, but also in the destructors of values ("objects" in C++) that own other values. Even if you do write a custom destructor for aggregate types, that just specifies what the computer should do on destruction beyond the automatic calls to the destructors of the fields, which are still implicit.
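The same holds in Rust. As a sketch mirroring the C++ type above (the map contents here are invented for illustration), dropping the collection recursively frees every nested allocation with compile-time-generated code, and no destructor is written by hand:

```rust
use std::collections::HashMap;

struct PersonRecord {
    name: String,
    salary: u64,
}

fn main() {
    let mut thing: HashMap<String, Vec<PersonRecord>> = HashMap::new();
    thing
        .entry("engineering".to_string())
        .or_default()
        .push(PersonRecord { name: "Ada".to_string(), salary: 100_000 });
    assert_eq!(thing["engineering"][0].name, "Ada");
    assert_eq!(thing["engineering"][0].salary, 100_000);
} // `thing` is dropped here: every String, every Vec, every PersonRecord,
  // and the map's own table are freed by generated code -- no Drop impl written.
```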
Ownership and its limitations
This is all possible based on the concept of "ownership," one of the key principles of RAII. The key assumption is that every allocation has one owner at any given time. Allocations can own each other (forming a tree of allocations), or a scope can own an allocation (forming the root of such a tree). RAII then can make sure the allocation ends when its owner does -- by the scope exiting, or when the owning object is destroyed.
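A minimal sketch of scope ownership in Rust (the `Noisy` type and its drop counter are invented for illustration): the destructor runs exactly when the owning scope ends.

```rust
use std::cell::Cell;

struct Noisy<'a> {
    drops: &'a Cell<u32>, // counts destructor calls, for demonstration
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

fn main() {
    let drops = Cell::new(0);
    let _outer = Noisy { drops: &drops };
    {
        let _inner = Noisy { drops: &drops };
    } // `_inner`'s owner (the inner block) ends: its destructor runs here
    assert_eq!(drops.get(), 1); // only `_inner` has been dropped so far
} // `_outer` is dropped when main's scope ends
```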
But what if the allocation needs to outlive its parent, or its scope? It's not always the case that a function has primitive types as its arguments and return value, and then only constructs trees of allocations privately. We need to take these sophisticated collections and pass them as arguments to functions. We need to have them be returned from functions.
This becomes apparent if we try to refactor our little-endian integer decimalizer to allow us to do other things with the resultant string besides print it:
std::string render_int_little_endian_decimal(int foo) {
    std::string res;
    do {
        res += '0' + foo % 10;
        foo /= 10;
    } while (foo != 0);
    return res;
}

int main() {
    std::cout << render_int_little_endian_decimal(3781) << std::endl;
    return 0;
}
Based on our previous discussion of RAII, you might assume that the ~std::string destructor is called at the end of its scope, rendering the allocation unusable for later printing, but instead this code "Just Works."
We've hit the first of many mitigations for the limitations of raw RAII that are necessary for it to work. This mitigation is the "Named Return Value Optimization" (NRVO), which stipulates that if a named variable is used in all of the return statements in a function, it may be constructed (and destructed) in the context of the caller. It is misnamed an "optimization" because the language permits it to change observable behavior: it eliminates entirely the call to the destructor at the end of the scope, even if that destructor call would have side effects.
This is just one of many ways RAII is made competitive with run-time garbage collection, allowing values to outlive the scope of the function that created them. This one is narrow and peculiar to C++, but many of the others lead to interesting comparisons, and we discuss them in the next section.
Filling the Gaps in RAII
Copying/Cloning
We're going to start with one of the oldest of these: copying.
When C++ was designed, the intention was that the programmer would not see a difference between types that don't involve allocation (like int or double) and types that do (like std::string or std::unordered_map<std::string, std::vector<std::string>>).
When a function takes an int argument, as in print_int_little_endian_decimal, that integer is copied. Similarly, if we take a std::string argument without additional annotation, C++ will also make a copy:
int parse_int_le(std::string foo) {
    int res = 0;
    int pos = 1;
    for (char c: foo) {
        res += (c - '0') * pos; // No input validation -- example!
        pos *= 10;
    }
    return res;
}

int main(int argc, char **argv) {
    std::string s = argv[1];
    std::cout << parse_int_le(s) << std::endl;
    return 0;
}
This is indeed consistent. Treating ints and std::string objects in parallel ways is also in line with how higher-level programming languages sometimes work: a string is a value, an int is a value, why not give them the same semantics? Aliasing is confusing, why not avoid it with copying?
It's made to work by an implicit function call. Just like destructor calls are implicit in C++, copying also calls a function in the type's implementation. Here, it calls std::string's "copy constructor."
The problem is that this is slow. Not only is an unnecessary copy made, but an unnecessary allocation and deallocation creep in. There is no reason not to use the same allocation the caller already has, here in s from the main function. A C programmer would never write this copying version.
The only reason this feature is allowed under C++'s zero-cost principle is because it is optional. It may be the default -- and making it the default is one of the most questionable decisions C++ ever made -- but we can still alias if we want to. It just takes more work.
Rust, as you can guess from my tone, requires explicit annotation to copy types that manage an allocation. In fact, Rust doesn't even use the term "copy" for this -- that term is reserved for types that can be copied without allocating. It calls this operation cloning, and requires use of the clone() method to accomplish it.
Some types don't use an allocation, and "copying" them is just a simple memory copy. Some types do use an allocation, and "cloning" them requires allocating. This distinction is important and fundamental to how computers work. It's relevant and visible in Java and even Python, and pretending it doesn't exist is unbecoming for a systems programming language like C++.
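The distinction shows up directly in Rust's syntax; a minimal sketch:

```rust
fn main() {
    // i32 implements Copy: assignment is a plain bit-copy,
    // and the original stays usable.
    let x: i32 = 5;
    let y = x;
    assert_eq!(x + y, 10);

    // String manages a heap allocation, so it is not Copy.
    // Duplicating it requires an explicit, visible .clone(),
    // which performs a new allocation.
    let s = String::from("hi");
    let t = s.clone();
    assert_eq!(s, t);

    // Without .clone(), `let t = s;` would move `s`,
    // and using `s` afterwards would be a compile error.
}
```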
Moves
Returning an allocation from a function can't always use NRVO. So if you want your value to outlast your function, but it's created inside the function (and therefore "owned" by the function scope), what you really need is a way for the value to change owners. You need to be able to move the value from the scope into the caller's scope. Similarly, if you have a value in a vector, and need to remove the last value, you can move it.
This is distinct from copying, because, well, no copy is made -- the allocation just stays the same. The allocation is "moved" because the previous scope no longer has responsibility for destroying the allocation, and the new scope gains the responsibility.
Move semantics fix the most serious issue with RAII: your allocation might not live exactly as long as its owner. The root of an allocation tree might outlive the stack-based scope it's in, such as when you want to return a collection from a function. The other nodes of an allocation tree might leave that tree and be owned by another stack frame, or by another part of the same allocation tree, or by a different allocation tree. In general, "each allocation has a unique owner" becomes "each allocation has a unique owner at any given time," which is much more flexible.
In Rust, this is done via "destructive moves," which oddly enough means not calling the destructor on the moved-from value. In fact, the moved-from value ceases to be a value when it's moved from, and accessing that variable is no longer permitted. The destructor is then called as normal in the place where the value is moved to. This is tracked statically at compile-time in the vast majority of situations, and when it cannot be, an extra boolean is inserted as a "drop flag" ("drop" is how Rust refers to its destructors).
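A sketch of what destructive moves look like in practice (the `consume` function is invented for illustration): after the move, the compiler statically forbids any use of the moved-from variable, so no run-time "empty state" is needed.

```rust
fn consume(s: String) -> usize {
    s.len()
} // `s` is dropped here, in the scope the value was moved into

fn main() {
    let greeting = String::from("hello");
    let n = consume(greeting); // `greeting` is moved into `consume`
    assert_eq!(n, 5);
    // println!("{}", greeting); // compile error: use of moved value
}
```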
C++ didn't add move semantics until C++11; they were not part of the original RAII scheme. This is surprising given how essential moves are to RAII: returning collections from functions is super important, and you can't copy every time. But before C++11, there were only poor man's special cases of moves, like NRVO and the related RVO for objects constructed in the return statement itself. These have completely different semantics from C++ move semantics -- though they're still more efficient than C++ moves in many cases.
When C++ did eventually add moves, the other established semantics of C++ forced it to add them in a weird and deeply confusing way: it added "non-destructive" moves. In C++, rather than the drop flag being inserted by the compiler, it is internal to the value. Every type that supports moves must have a special "empty state," because the destructor is still called on the moved-from value. If the allocation has moved to another value, there is no allocation to free, and this must be detected by the destructor at run-time, which can amount to a violation of the zero-cost principle in some situations.
C++ justifies this by making moves a special case of copying. Moves are said to be like copies, but with no promise of preserving the initial value. In exchange, you might get the optimization of reusing the original allocation, but then the initial value will no longer have an allocation, and will be forced to be different. This definition is very different from what moves are actually used for (cf. the name of the operation), and therefore, even though it is technically simple, claiming that focusing on that definition (as Herb Sutter does) will simplify things for the programmer is disingenuous, as I discuss in more detail in the next chapter, on moves specifically.
In practice, this means that all types support the operation of moving -- even ints -- but some types that manage an allocation might fall back on copying if moves haven't been implemented for them. This inconsistency, like all inconsistencies, is bad for programmers.
In practice, this also means that moved-from objects are a problem. A moved-from object might stay the same, if no moving was done. It might also change in value, if the move caused an allocation (or other resource) to move into the new object. This forces C++ smart pointers to choose between movability and non-nullability -- no movable, non-nullable pointer is possible in C++. Nulls -- and the other "moved-from" empty collections that C++ move semantics produce -- can then be referenced later in the function, and though they must be "valid" values of the type, they are probably not the values you expect, and null pointers in particular are famously difficult values to reason about.
This is a consequence of the fact that C++ was a pioneer of RAII semantics, and didn't design RAII and moves together from the start. Rust has the advantage of having included moves from the beginning, and so Rust move semantics are much cleaner.
In Rust too, all types can be moved. But in Rust, a move never copies a resource or an allocation. Moves always have the same implementation: copy the memory that is stored in-line in the value itself, and do not call the destructor on the moved-from value. Any pointer or handle is simply brought along bit-by-bit just like the rest of the data, and the old value is never touched again, making this a safe operation. (For Copy types like i32 that manage no allocation or resource, this bit-copy is the whole story, and the original additionally remains usable.)
All types must then be written in such a way to assume that values might not stay in the same place in memory. If some operations on a type can't be written that way, they can be defined on "pinned" versions of that type. A pin is a type of reference or box that promises that the pointed-to value will never move again. The underlying type is still movable, but these particular values are not.
This is a gnarly exception to Rust's "all types can be moved" rule that makes it false in practice, though still true in pedantic, language-lawyery theory. But that's not important. What is important is that Rust's move semantics are consistent, and do not rely on move constructors and manual implementations of Rust's drop flags within the object. The dangerous possibility of interacting with a moved-from object, whose value is unpredictable and quite possibly a special "empty" state like null, is not present in Rust.
Borrows in Rust
While moves cover returning a collection (or other resource-managing value) from a function, they don't cover passing such a value into a function, or at least not in the general case. Sometimes, when we pass a value into a function, we want to move the value in, so that the function can consume it or add it to an allocation tree (like inserting into a collection). But most times, we want the function to be able to see and perhaps mutate it, but then we want to give it back to the owner.
Enter the borrow.
In Rust, borrows are commonly introduced as a sort of improvement on moves. Consider our example function that parses a string to an int, here implemented in C++ with copies:
int parse_int_le(std::string foo) {
    int res = 0;
    int pos = 1;
    for (char c: foo) {
        res += (c - '0') * pos; // No input validation -- example!
        pos *= 10;
    }
    return res;
}
Here is a Rust version, with moves, so that the function consumes the string:
use std::env::args;

fn parse_int_le(foo: String) -> u32 {
    let mut res = 0;
    let mut pos = 1;
    for c in foo.chars() {
        res += (c as u32 - '0' as u32) * pos;
        pos *= 10;
    }
    res
}

fn main() {
    let mut args: Vec<String> = args().collect();
    println!("{}", parse_int_le(args.remove(1)));
}
As we can see with this "move" version, we are in the awkward position of removing the string from the vector so that parse_int_le can consume it, so that the string doesn't have multiple owners.
But parse_int_le doesn't need to own the string. In fact, it could be written so that it gives the string back when it's done:
fn parse_int_le(foo: String) -> (u32, String) {
    let mut res = 0;
    let mut pos = 1;
    for c in foo.chars() {
        res += (c as u32 - '0' as u32) * pos;
        pos *= 10;
    }
    (res, foo)
}
"Taking temporary ownership" in real life is also known as borrowing, and Rust has such a feature built in. It is more powerful than the above code that literally takes temporary ownership, though. That code would have to remove the string from the vector and then put it back -- which is even more inefficient than just removing it. Rust borrowing allows you to borrow the string even while it stays inside the vector. This is implemented by the Rust reference, which has these borrowing semantics and is, like most "references," implemented as a pointer at the machine level.
In order to accomplish these semantics, Rust has its infamous borrow checker. While we are borrowing something inside the vector, we can't simultaneously mutate the vector, which could cause the thing we're borrowing to move. Rust statically ensures that this is impossible, rejecting code that uses a reference after a mutation, destruction, or move somewhere else could invalidate it.
This enables us to extend the RAII-based system and both prevent leaks and maintain safety, just like a GC or RC-based system. The borrow checker is essential to doing so.
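A sketch of both sides of the rule (the vector contents are invented for illustration): borrowing into the vector is allowed, but mutating the vector while the borrow is live is rejected at compile time -- the commented-out line is the rejected one.

```rust
fn main() {
    let mut v = vec![String::from("alpha"), String::from("beta")];

    let first = &v[0]; // borrow a String while it stays inside the vector
    // v.push(String::from("gamma")); // rejected: cannot borrow `v` as
    //                                // mutable while `first` borrows it --
    //                                // a push could reallocate and move
    //                                // the String out from under us
    assert_eq!(first.as_str(), "alpha");

    v.push(String::from("gamma")); // fine: the borrow has already ended
    assert_eq!(v.len(), 3);
}
```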
For completeness, here is the idiomatic way to handle the parameter in parse_int_le, with an actual borrow, using &str, the special borrowed form of String that also allows slices:
use std::env::args;

fn parse_int_le(foo: &str) -> u32 {
    let mut res = 0;
    let mut pos = 1;
    for c in foo.chars() {
        res += (c as u32 - '0' as u32) * pos;
        pos *= 10;
    }
    res
}

fn main() {
    let args: Vec<String> = args().collect();
    println!("{}", parse_int_le(&args[1]));
}
Dodging memory safety in C++
In C++, of course, there is no borrow checker. In the parse_int_le example, it's still possible to use a pointer or a reference, but then you're on your own. When RAII-based code frees your allocation, your reference is invalidated, and it's undefined behavior to use it. The compiler performs no coordination between the RAII/move system and your references, which point into the ownership tree with no guarantee that said tree won't shift underneath them.
This can lead to memory corruption bugs, with security implications.
It's not just pointers and references. Other types that contain references, such as iterators, can also be invalidated. Sometimes those are more insidious because intermediate C++ programmers might know about pointer invalidation, but let their guard down with iterators. If you add to a vector while looping through it, you've just done undefined behavior, and that's surprising because no pointers or references even have to show up. Rust's borrow checker handles these as well.
Even though the Rust borrow checker gets a bad reputation, its safety guarantees often make it worth it. It's hard to write correct C++ when references and non-owning pointers are involved. Maybe some of you have that skill, and are unsympathetic to those who don't yet have it, but it is a specialized skill, and the compiler can do a lot of the work for you, by checking your work. Automation is a good thing, and so is making systems programming more accessible to beginners.
And of course, many C++ programmers do make mistakes. Even if it's not you, it might be one of your colleagues, and then you'll have to clean up the mess. Rust addresses this, and limits this more difficult mode of thinking to writing unsafe code, which can be contained in modules.
Multiple Ownership
In RAII, an allocation has one owner at a time, and if your owner is destroyed before the allocation is moved to another owner, the allocation must be destroyed along with it.
Of course, sometimes this isn't how your allocations work. Sometimes an allocation needs to live until both of two parent allocations are destroyed, and sometimes there is no way to predict which parent will be destroyed first. Sometimes the only way to resolve that situation -- even in C -- is to use run-time information, and so you can model multiple ownership through reference counting: std::shared_ptr in C++, or Rc and Arc in Rust (depending on whether the value is shared between multiple threads).
This is something C programmers will sometimes do in the face of complicated allocation DAGs, implementing it bespoke on a framework-by-framework basis (cf. GTK+ and other C GUI frameworks). C++ and Rust just standardize the implementation but, in line with the zero-cost rule, make it optional.
Interestingly enough, reference counting is implemented in terms of RAII and moves. The destructor for a reference-counted pointer decrements the count, and cloning/copying such a pointer increments it. Moves, of course, don't change it at all.
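This can be observed directly with Rust's Rc, whose Rc::strong_count exposes the count for inspection; a sketch:

```rust
use std::rc::Rc;

fn main() {
    let a = Rc::new(String::from("shared"));
    assert_eq!(Rc::strong_count(&a), 1);

    let b = Rc::clone(&a); // clone: increments the count
    assert_eq!(Rc::strong_count(&a), 2);

    let c = b; // move: the count is untouched -- no run-time work at all
    assert_eq!(Rc::strong_count(&a), 2);

    drop(c); // destructor: decrements the count
    assert_eq!(Rc::strong_count(&a), 1);
} // `a` is dropped here: the count hits zero and the String is freed
```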
RAII+: What this all adds up to
Between RAII, moves, reference counting, and the borrow checker, we now have the memory management system of safe Rust. Safe Rust is a powerful programming language, and in it, you can write programs almost as easily as in a traditionally GC'd programming language like Java, but get the performance of manually written, manually memory managed C.
The cost is annotation. In Java, there is no distinction between "borrowing" and "owning", even though sometimes the code follows similar structures as if there were. In Rust, the compiler must be informed about the chain of owners, and about borrowers. Every time an allocation crosses scope boundaries or is referred to inside another allocation, you must write different syntax to tell Rust whether it's a move or a borrow, and it must comply with the rules of the borrow checker.
But it turns out most code has a natural progression of owners, and most borrows are valid under the borrow checker. When they're not, it's usually straightforward to rethink the code so that it can work that way, and the resultant code is usually cleaner anyway. And in situations where neither works, reference counting is still an option.
At the cost of this annotation, Rust gives you everything a GC does: allocations are freed when their owners go out of scope, and memory safety is still guaranteed, because the annotations are checked -- which is most of the benefit of automating memory management outright. Memory leaks are as difficult as in a reference-counting language. It's an excellent happy medium between manual memory management and full run-time GC, with no run-time cost over a certain discipline of C memory management.
Of course, other disciplines of C memory management are possible. And using this Rust system takes away flexibility that might be relevant to performance. Rust, like C++, allows you to sidestep the "compile-time GC" and use raw pointers, and that can often be better for performance. A recent blog post I read explores some of that in more detail; encouragingly, that blog post also considers RAII to be in-between manual memory management and run-time GC -- serendipitously, because I had already drafted much of this chapter when it came out.
But the standard memory management tools of Rust cover the common cases well, and unsafe is available for when it's inappropriate -- and can be wrapped in abstractions for interfacing with code that uses the RAII-based system.
In C++, the annotations of "borrows" vs "moves" can easily result in undefined behavior. Leaks are prevented, but memory corruption is not. So the C++ system is a much worse replacement for garbage collection -- RAII is only doing some of its job, as it is not paired with a borrow checker.
Cycles
I leave the most awkward topic for the end. We've talked about allocation trees and DAGs, but not general graphs. These require unsafe in Rust, even for something as supposedly basic as doubly linked lists. They're against the borrow checker's rules, and the compiler will statically prevent you from building them out of safe, borrowing references. The references in a cyclic structure simply aren't borrows in the Rust sense, but something else, something about which Rust doesn't know how to guarantee safety.
This is not as bad as you might think, because cycles also form a hole in reference counting, a popular run-time GC system. This is why you can't use Rc or Arc to implement a doubly-linked list correctly in Rust either: you'll get past the borrow checker but guarantee a memory leak. Reference-counting systems generally can't detect cycles at all, and leak them, which is arguably worse than forbidding them from being created.
In any case, the `unsafe` keyword is not poison. For things that Rust doesn't know how to keep safe, you need to exercise extra responsibility, but at least the programming language is making you aware of it -- unlike C++, which is unsafe all the time.
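To make the reference-counting caveat concrete, here is a minimal sketch (my own illustration) of the standard workaround: back-edges in a parent/child structure are held as `Weak`, which doesn't contribute to the strong count, so the cycle can't keep the allocations alive and leak them.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    // Back-edge: Weak, so it doesn't keep the parent alive.
    parent: RefCell<Weak<Node>>,
    // Forward edges: strong Rc handles.
    children: RefCell<Vec<Rc<Node>>>,
}

fn make_pair() -> (Rc<Node>, Rc<Node>) {
    let parent = Rc::new(Node {
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    });
    let child = Rc::new(Node {
        parent: RefCell::new(Rc::downgrade(&parent)),
        children: RefCell::new(Vec::new()),
    });
    parent.children.borrow_mut().push(Rc::clone(&child));
    (parent, child)
}

fn main() {
    let (parent, child) = make_pair();
    // Only our local handle keeps the parent alive; the child's
    // back-edge is Weak and doesn't count.
    assert_eq!(Rc::strong_count(&parent), 1);
    // The child is kept alive by our handle and the parent's list.
    assert_eq!(Rc::strong_count(&child), 2);
    // The back-edge can still be followed while the parent lives.
    assert!(child.parent.borrow().upgrade().is_some());
}
```

Had the back-edge been a strong `Rc`, parent and child would keep each other alive forever once our local handles were dropped -- exactly the leak described above.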
Moves
As we discussed in the previous chapter, moves are an essential part of an RAII-based system of memory management, allowing RAII-controlled types to have multiple owners in the course of their lifetime. In this chapter, we discuss moves outside of that context and provide an alternative justification for why they're important. We then go into a little more depth about why C++ moves can be confusing, and explain how the Rust implementation has fewer footguns and in general is more in line with the goals of the feature.
History
In 2011, C++ finally fixed a set of long-standing deficits in the programming language with the shiny new C++11 standard, bringing it into the modern era. Programmers enthusiastically pushed their companies to allow them to migrate their codebases, champing at the bit to be able to use these new features. Writers to this day talk about "modern C++," with the cut-off being 2011. Programmers who only used C++ pre-C++11 are told that it is a new programming language, the best version of its old self, worth a complete fresh try.
There were a lot of new features to be excited about. C++ standard threads were added then -- and thread standardization was indeed good, though anyone who wanted to use threads before likely had their choice of good libraries for their platform. Closures were also very exciting, especially for people like me who came from functional programming, but to be honest, closures were just syntactic sugar for existing patterns of boilerplate that could be readily used to write function objects.
Indeed, the real excitement at the time, certainly the one my colleagues and I were most excited about, was move semantics. To explain why this feature was so important, I'll need to talk a little about the C++ object model, and the problem that move semantics exist to solve.
Value Semantics
Let's start by talking about a primitive type in C++: `int`. Objects -- in C++ standard parlance, `int` values are indeed considered objects -- of type `int` only take up a few bytes of storage, and so copying them has always been very cheap. When you assign an `int` from one variable to another, it is copied. When you pass it to a function, it is copied:
void print_i(int arg) {
    arg += 3;
    std::cout << arg << std::endl;
}
int foo = 3;
int bar = foo; // copy
foo += 1; // foo gets 4
std::cout << bar << std::endl; // bar is still 3
print_i(foo); // prints 4+3 ==> 7
std::cout << foo << std::endl; // foo is still 4
As you can see, variables of type `int` act independently of each other when mutated, which is how primitive types like `int` work in many programming languages.
In the C++ version of object-oriented programming, it was decided that values of custom, user-defined types would have the same semantics, that they would work the same way as the primitive types. So for C++ strings:
std::string foo = "foo";
std::string bar = foo; // copy (!)
foo += "__";
bar += "!!";
std::cout << foo << std::endl; // foo is "foo__"
std::cout << bar << std::endl; // bar is "foo!!"
This means that whenever we assign a string to a new variable, or pass it to a function, a copy is made. This is important, because the `std::string` object proper is just a handle, a small structure that manages a larger memory allocation on the heap, where the actual string data is stored. Each new `std::string` that is made via copy requires a new heap allocation, a relatively expensive operation.
This would cause a problem when we want to pass a `std::string` to a function, just like an `int`, but don't want to actually make a copy of it. But C++ has a feature that helps with that: `const` references. Details of the C++ reference system are a topic for another post, but `const` references allow a function to operate on the `std::string` without the need for a copy, while still promising not to change the original value.
The feature is available for both `int` and `std::string`; the principle that they're treated the same is preserved. But for the sake of performance, `int`s are passed by value, and `std::string`s are passed by `const` reference in the same situation. This dilutes the benefit of treating them the same, as in practice the function signatures are different if we don't want to trigger spurious expensive deep copies:
void foo(int bar);
void foo(const std::string &bar);
If you instead declare the function `foo` like you would with an `int`, you get a poorly performing deep copy. The default is something you probably don't want:
void foo(std::string bar);
void foo2(const std::string &bar);
std::string bar("Hi"); // Make one heap allocation
foo(bar); // Make another heap allocation
foo2(bar); // No copy is made
This is all part of "pre-modern" C++, but already we're seeing negative consequences of the decision to treat `int` and `std::string` as identical when they are not, a decision that will get more gnarly when applied to moves. This is why Rust has the `Copy` trait to mark types like `i32` (the Rust equivalent of `int`) as being copyable, so that they can be passed around freely, while requiring an explicit call to `clone()` for types like `String` so we know we're paying the cost of a deep copy, or else an explicit indication that we're passing by reference:
fn foo(bar: String) {
    // Implementation
}

fn foo2(bar: &str) {
    // Implementation
}

let bar = "hi".to_string();
foo(bar.clone());
foo2(&bar);
The third option in Rust is to move, but we'll discuss that after we discuss moves in C++.
Copy-Deletes and Moves
C++ value semantics break down even more when we do need the function to hold onto the value. References are only valid as long as the original value is valid, and sometimes a function needs the value to stay alive longer. Taking by reference is not an option when the object (whether `int` or `std::string`) is being added to a vector that will outlive the original object:
std::vector<int> vi;
std::vector<std::string> vs;
{
    int foo = 3;
    foo += 4;
    vi.push_back(foo);
} // foo goes out of scope, vi lives on
{
    std::string bar = "Hi!";
    bar += " Joe!";
    vs.push_back(bar);
} // bar goes out of scope, vs lives on
So, to add this string to the vector, we must first make an allocation corresponding to the object contained in the variable `bar`, then make a new allocation for the object that lives in `vs`, and then copy all the data.
Then, when `bar` goes out of scope, its destructor is called, as is done automatically whenever an object with a destructor goes out of scope. This allows `std::string` to free its heap allocation. Which means we copied an allocation into a new heap allocation, just to free the original allocation. Copying an allocation and freeing the old one is equivalent to just re-using the old allocation, only slower. Wouldn't it make more sense for the string in the vector to just refer to the same heap allocation that `bar` formerly did?
Such an operation is referred to as a "move," and the original C++ -- pre-C++11 -- didn't support it. This was possibly because moves didn't make sense for `int`s, and so they were not added for objects that were trying to act like `int`s -- though on the other hand, destructors were supported, and `int`s don't need to be destructed.
In any case, moves were not supported. And so, objects that managed resources -- in this case, a heap allocation, but other resources could apply as well -- could not be put onto vectors or stored in collections directly without a copy and delete of whatever resource was being managed.
Now, there were ways to handle this in pre-C++11 days. You could add an indirection, making a heap allocation to contain the `std::string` object itself -- which is only a small object with a pointer to another allocation -- so that you could pass around a `std::string *`, a raw pointer that would not trigger all these copies, while still automatically managing the string's heap allocation behind this façade of value semantics. Or you could manually manage a C-style string with `char *`.
But the most ergonomic, clear option, `std::vector<std::string>`, could not be used without performance degradation. Worse, if the vector ever needed to be resized, and had to itself switch to a different allocation, it would have to copy all those `std::string` objects internally and delete the originals: N useless reallocations.
As a demonstration of this, I wrote a sample program with a vastly simplified version of `std::string` that tracks how many allocations it makes. It allows C++11-style moves to be enabled or disabled, and then it takes all the command-line arguments, creates `string` objects out of them, and puts them in a vector. For 8 command-line arguments, the version with move made, as you might expect, 8 allocations, whereas the version without the move made 23. Each time a string was added to the vector, a spurious allocation was made, and then N spurious allocations had to be made each time the vector doubled.
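The same experiment can be sketched in Rust with a custom counting allocator (my own sketch, not the author's sample program). Because pushing a `String` into a `Vec` moves the handle rather than copying the data, the only allocations after the strings are built come from the vector's own growth:

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// A global allocator that counts how many heap allocations are made.
struct Counting;
static ALLOCS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for Counting {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCS.fetch_add(1, Ordering::SeqCst);
        System.alloc(layout)
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        System.dealloc(ptr, layout)
    }
}

#[global_allocator]
static A: Counting = Counting;

// Build `n` strings, then count the allocations made while moving
// them into a vector.
fn allocations_for_moving(n: usize) -> usize {
    let args: Vec<String> = (0..n).map(|i| format!("arg{}", i)).collect();
    let before = ALLOCS.load(Ordering::SeqCst);
    let mut v = Vec::new();
    for s in args {
        v.push(s); // move: the string's heap data is not copied
    }
    ALLOCS.load(Ordering::SeqCst) - before
}

fn main() {
    // Far fewer than one allocation per string: only the vector's
    // own growth allocates.
    assert!(allocations_for_moving(8) < 8);
}
```
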
This problem is purely an artifact of the limitations of the tools provided by C++ to encapsulate and automatically manage memory, RAII and "value semantics."
Consider this snippet of code:
std::vector<std::string> vec;
{ // This might take place inside another function
  // Using local block scope for simplicity
    std::string foo = "Hi!";
    vec.push_back(foo);
}
{
    std::string bar = "Hello!";
    vec.push_back(bar);
}
// Use the vector
If we didn't use this `string` class, we would not have made a copy just to free the original allocation. We would have simply put the pointer into the vector. We would then have been responsible for freeing all the allocations -- once -- when we're done:
std::vector<char *> vec;
{
    // strdup, a POSIX call, makes a new allocation and copies a
    // string into it, here used to turn a static string into one
    // on the heap. We will assume we have a reason to store it
    // on the heap -- perhaps we did more manipulation in the
    // real application to generate the string.
    // The allocation is necessary to be the direct equivalent of
    // `vec.push_back("Hi")` or even `vec.emplace_back("Hi")` for
    // a `std::vector<std::string>`, because that data structure has
    // the invariant that all strings in the vector must have their
    // own heap allocation (assuming no small string optimization,
    // which many strings are ineligible for).
    char *foo = strdup("Hi!");
    vec.push_back(foo);
}
{
    char *bar = strdup("Hello!");
    vec.push_back(bar);
}
// Use the vector
// Then, later, when we are done with the vector, free all the elements once
for (char *c: vec) {
    free(c);
}
The copy version of the C++ code instead does -- after de-sugaring the RAII and value semantics and inlining -- something that no programmer would ever write manually, something equivalent to this (the vector is left in OOP notation for readability):
std::vector<char *> vec;
{
    char *foo = strdup("Hi!");
    vec.push_back(strdup(foo)); // Why the additional allocate-and-copy?
    free(foo); // Because the destructor of foo will free the original
}
{
    char *bar = strdup("Hello!");
    vec.push_back(strdup(bar));
    free(bar);
}
// Use the vec
for (char *c: vec) {
    free(c);
}
C++ without move semantics fails to reach its goal of zero-cost abstraction.
The version with the abstraction, with the value semantics, compiles to code less efficient than any code someone would write manually, because what we really want is to make the allocation while the string is the local variable `foo`, use the same allocation on the vector, and then free it only once, via the vector.
The abstractions of only supporting "copy" and "destruct" mean that the destructor of the variable `foo` must be called when `foo` goes out of scope. This means that the "copy" operation must make an independent allocation, as it cannot control when the original goes out of scope or is replaced with another value. If we had instead re-used the same allocation, it would be freed by `foo`'s destructor.
But copying just to destroy the original is silly -- silly and ill-performant. What any programmer would naturally write in that situation results in a "move". So this gap -- and it was a huge gap -- in C++ value semantics was filled in C++11 when they added a "move" operation.
Because of this addition, using objects with value semantics that managed resources became possible. It also became possible to use objects with value semantics for resources that could not meaningfully be copied, like unique ownership of an object or a thread handle, while still being able to get the advantages of putting such objects in collections and, well, moving them. Shops that previously had to work around value semantics for performance reasons could now use them directly.
It is not, therefore, surprising that this was for many the most exciting change in C++11.
How Move Is Implemented in C++
But for now, let's put ourselves in the place of the language designers who designed this new move operation. What should this move operation look like? How could we integrate it into the rest of C++?
Ideally, we would want it to output -- after inlining -- exactly the code that we would expect to write manually. When `foo` is moved into the vector, the original allocation must not be freed. Instead, it is only freed when the vector itself is freed. This is an absolute necessity for solving the problem: we must remove a free in order to remove the allocation, but we also cannot leak memory. If there is to be exactly one allocation, there must be exactly one deallocation.
Calls to `free` (or `delete[]` in my example program) are made in the destructor, so the most straightforward way forward is to say that the destructor should only be called when the vector is destroyed, but not when `foo` goes out of scope. If `foo` is moved onto the vector, then the compiler should take note that it has been moved from, and simply not call the destructor. The move should be treated as having already destroyed the object -- as an operation that accomplishes both the initialization of the new object (the string on the vector) from the original object and the destruction of the original object.
This notion is called "destructive move," and it is how moves are done in Rust, but it is not what C++ opted for. In Rust, the compiler simply does not output a destructor call (a "drop" in Rust) for `foo`, because it has been moved from. But the C++ compiler still does. Under destructive move semantics, the compiler would not allow `foo` to be read from after the move; but in fact, the C++ compiler still allows it -- not just for the destructor, but for any operation.
So how is the deallocation avoided, if the compiler doesn't remove it in this situation? Well, there is a decision to make here. If an object has been moved from, no deallocation should be performed. If it has not, a deallocation should be performed. Rust makes this decision at compile-time (with rare exceptions where it has to add a "drop flag"), but C++ makes it at run-time.
When you write the code that defines what it means to move from an object in C++, you must make sure the original object is in a run-time state where the destructor will still be called on it, and will still succeed. And, since we established already that we must save a deallocation by moving, that means that the destructor must make a run-time decision as to whether to deallocate or not.
The more C-style post-inlining code for our example would then look something like this:
std::vector<char *> vec;
{
    char *foo = strdup("Hi!");
    vec.push_back(foo);
    foo = nullptr;
    if (foo != nullptr) {
        free(foo);
    }
}
{
    char *bar = strdup("Hello!");
    vec.push_back(bar);
    bar = nullptr;
    if (bar != nullptr) {
        free(bar);
    }
}
This null check is hidden by the fact that in C++, `free` and `delete` and friends are defined to be no-ops on null, but it still exists. And while the check might be very cheap compared to the cost of calling `free`, it is not cheap when things are moved in a tight loop where `free` is never actually called. That is to say, this run-time check is not cheap compared to the cost of not calling `free`.
So, given the semantics of move in C++, it results in code that is not the same as -- and not as performant as -- the equivalent hand-written C-style code, and therefore it is not a zero-cost abstraction, and doesn't live up to the goals of C++.
Now, it looks like the optimizer should be able to clean up an adjacent set to null and check for null, but not all examples are as simple as this one, and, like in many situations where the abstraction relies on the optimizer, the optimizer doesn't always get it.
Arguing Semantics
But that performance hit is small, and it is usually possible to optimize out. If that were the only problem with C++ move semantics, I might find it annoying, but ultimately I'd say, as about many things in both C++ and Rust, something like: Well, this decision was made, remember to profile, and if you absolutely have to make sure the optimizer got it in a particular instance, check the assembly by hand.
But there's a few further consequences of that decision.
First off, the resource might not be a memory allocation, and null pointers might not be an appropriate way to indicate that that resource doesn't exist. This responsibility of having some run-time indication of what resources need to be freed -- rather than a one-to-one correspondence between objects and resources -- is left up to the implementors of classes. For heap allocations, it is made relatively easy, but the implementor of the class is still responsible for re-setting the original object. In my example, the move constructor reads:
string(string &&other) noexcept {
    m_len = other.m_len;
    m_str = other.m_str;
    other.m_str = nullptr; // Don't forget to do this
}
The move constructor has two responsibilities, where a destructive version would only have one: It must set up state for the new object, and it must set up a valid "moved from" state for the old object. That second obligation is a direct consequence of non-destructive moves, and provides the programmer with another chance to mess something up.
In fact, since destructive moves can almost always be implemented by just copying the memory (and leaving the original memory as garbage data as the destructor will not be called on it), a default move constructor would correctly cover the vast majority of implementations, creating even fewer opportunities to introduce bugs.
But in C++, the moved-from state also has obligations. The destructor has to know at run-time not to reclaim any resources if the object no longer has any, but in general, there is no rule that moved-from objects must immediately be destroyed. The programming language has explicitly decided not to enforce such a rule, and so, to be properly safe, moved-from objects must be considered -- and must be -- valid values for those objects.
This means that any object that manages a resource now must manage either 1 or 0 copies of that resource. Collections are easy -- moved from collections can be made equivalent to the "empty" collection that has no element. For things like thread handles or file handles, this means that you can have a file handle with no corresponding file. Optionality is imported to all "value types."
So, smart pointer types that manage single-ownership heap allocations, or any sort of transferable ownership of heap allocations, now of necessity must be nullable. Nullable pointers are a serious cause of errors, as often they are used with the implicit contract that they will not be null, but that contract is not actually represented in the type. Every time a nullable pointer is passed around, you have a potential miscommunication about whether `nullptr` is a valid value, one that will cause some sort of error condition, or one that may lead to undefined behavior.
C++ move semantics of necessity perpetuate this confusion. Non-nullable smart pointers are unimplementable in C++, not if you want them to be moveable as well.
Move, Complicatedly
This leads me to Herb Sutter's explanation of C++ move semantics from his blog. I respect Herb Sutter greatly as someone explaining C++, and his materials helped me learn C++ and teach it. An explanation like this is really useful if programming in C++ is what you have to do.
However, I am instead investigating whether C++'s move semantics are reasonable, especially in comparison to programming languages like Rust which do have a destructive move. And from that point of view, I think this blog post, and its necessity, serve as a good illustration of the problems with C++'s move semantics.
I shall respond to specific excerpts from the post.
C++ “move” semantics are simple, and unchanged since C++11. But they are still widely misunderstood, sometimes because of unclear teaching and sometimes because of a desire to view move as something else instead of what it is.
Given the definition he's about to give of C++ move semantics, I think this is unfair. The goal of move is clear: to allow resources to be transferred when copying would force them to be duplicated. It is obvious from the name. However, the semantics as the language defines them, while enabling that goal, are defined without reference to that goal.
This is doomed to lead to confusion, no matter how good the teaching is. And it is desirable to try to understand the semantics as they connect to the goal of the feature.
To explain what I mean, see the definition he then gives for moving:
In C++, copying or moving from an object `a` to an object `b` sets `b` to `a`'s original value. The only difference is that copying from `a` won't change `a`, but moving from `a` might.
This is a fair statement of C++'s move semantics as defined. But it has a disconnect with the goals.
In this definition, we are discussing the assignment written as `b = a` or as `b = std::move(a)`. The reason why moving might change `a`, as we've discussed, is that `a` might contain a resource. Moving indicates that we do not wish to copy resources that are expensive or impossible to copy, and that in exchange for this ability, we give up the right to expect that `a` retain its value.
This definition is the correct one to use for reasoning about C++ programs, but it is not directly connected to why you might want to use the feature at all. It is natural that programmers would want to be able to reason about a feature in a way that aligns with its goals.
The effect of this framing is to obscure the goal, and to treat move as if it were a pure optimization of copy, which will not help a programmer understand why `a`'s value might change, or why move-only types like `std::unique_ptr` exist.
The explanation of the goal of this operation is reserved in this post for the section entitled "advanced notes for type implementors".
Of course, almost all C++ programmers in a sufficiently large project have to become "type implementors" to understand and maintain custom types, if not to write fresh implementations of them, so I think most professional programmers should be reading these notes, and so I think it's unfair to call them advanced. But beyond that, this explanation is core to why the operation exists, and the only explanation for why move-only types exist, which all C++ programmers will have to use:
For types that are move-only (not copyable), move is C++’s closest current approximation to expressing an object that can be cheaply moved around to different memory addresses, by making at least its value cheap to move around.
He follows up with an acknowledgement that destructive moves are a theoretical possibility:
(Other not-yet-standard proposals to go further in this direction include ones with names like “relocatable” and “destructive move,” but those aren’t standard yet so it’s premature to talk about them.)
For his purposes, this is extremely fair, but since my purposes are to compare C++ to Rust and other programming languages which have destructive moves, it is not premature for me to talk about them.
This gets more interesting in the Q&A.
How can moving from an object not change its state?
For example, moving an int doesn’t change the source’s value because an int is cheap to copy, so move just does the same thing as copy. Copy is always a valid implementation of move if the type didn’t provide anything more efficient.
Indeed, for reasons of consistency and generic programming, move is defined on all types that can be moved or copied, even types that don't implement move differently than copy.
What makes this confusing in C++, however, is that types that manage resources might be written without an implementation of move. They might pre-date the move feature, or their implementor might not have understood move well enough to implement them, or there might be a technical reason why moving couldn't be implemented in a way that elides the resource duplication. For these types, a move falls back on a copy, even if the copy does significant work. This can be surprising to the programmer, and surprises in programming are never good. More direly, there is no warning when this happens, because the notion of resource management is not referenced in the semantics.
In Rust, a move is always implemented by copying the data in the object itself and then not destructing the original object, and never by copying resources managed by the object, or running any custom code.
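This can be observed directly: moving a `String` copies only its (pointer, length, capacity) handle, so the heap data stays exactly where it was. A small sketch:

```rust
// Returns true if the string's heap data stays at the same address
// across a move into a vector.
fn ptr_survives_move() -> bool {
    let s = String::from("hello from the heap");
    let data_ptr = s.as_ptr();
    let mut v = Vec::new();
    v.push(s); // move: only the handle is copied, no new allocation
    data_ptr == v[0].as_ptr()
}

fn main() {
    assert!(ptr_survives_move());
}
```
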
But what about the “moved-from” state, isn’t it special somehow?
No. The state of `a` after it has been moved from is the same as the state of `a` after any other non-const operation. Move is just another non-const function that might (or might not) change the value of the source object.
I disagree in practice. For objects that use move as intended, to avoid copying resources, move will (at least usually) drain its resource. This means that an object that often manages a resource will enter a state in which it is not managing a resource. That state is special, because it is the state when a resource-managing object is doing something other than its normal job, and is not managing a resource. This is not a "special state" by any rigorous definition, but is guaranteed to be intuitively special by virtue of being resource-free. (It is also a special state in that the value is unspecified in general, whereas most of the time, the value is specified.)
Collections can, as I said before, get away with becoming the empty collection in this scenario, but even for those, the empty state is special: it is the only state that can be represented without holding a resource. And many other types of objects cannot even do this. `std::unique_ptr`'s moved-from state is the null pointer, and without these move semantics, it would be possible to design a `std::unique_ptr` that did not have a null state.
Once `std::unique_ptr` is forced to be allowed to have null values, it makes sense that there be other ways to create a null `std::unique_ptr`, e.g. by default-constructing it. But it is the design of move semantics that forces it to have a null value in the first place.
Put another way: `std::unique_ptr` and thread handles are therefore collections of 0 or 1 heap allocation handles or thread handles, and once defined that way, the "empty" state is not special -- but it is move semantics that force them to be defined that way.
Does “but unspecified” mean the object’s invariants might not hold?
No. In C++, an object is valid (meets its invariants) for its entire lifetime, which is from the end of its construction to the start of its destruction.... Moving from an object does not end its lifetime, only destruction does, so moving from an object does not make it invalid or not obey its invariants.
This is true, as discussed above. The moved-from object must be able to be destructed, and there is nothing stopping a programmer for instead doing something else with it. Given that, it must be in some state that its operations can reckon with. But that state is not necessarily one that would be valid if move semantics didn't force its conclusion, and so again, we are close to the problem.
Does “but unspecified” mean the only safe operation on a moved-from object is to call its destructor?
No.
Does “but unspecified” mean the only safe operation on a moved-from object is to call its destructor or to assign it a new value?
No.
Does “but unspecified” sound scary or confusing to average programmers?
It shouldn’t, it’s just a reminder that the value might have changed, that’s all. It isn’t intended to make “moved-from” seem mysterious (it’s not).
I disagree firmly with the answer to the last question. "Unspecified" values are extremely scary, especially to programmers on team projects, because it means that the behavior of the program is subject to arbitrary change, but that change will not be considered breaking.
For example, `std::string` does not make any promises about the contents of a moved-from string. However, a programmer -- even a senior programmer -- may, instead of consulting the documentation, write a test program to find out the value of a moved-from string. Seeing an empty string, the programmer might write a program that relies on the string being empty:
std::vector<std::string>
split_into_chunks(const std::string &in) {
    int count = 0;
    std::vector<std::string> res;
    std::string acc;
    for (char c: in) {
        if (count == 4) {
            res.push_back(std::move(acc));
            // Don't need to clear string.
            // I checked and it's empty.
            count = 0;
        }
        acc += c;
        count += 1;
    }
    if (!acc.empty()) {
        res.push_back(std::move(acc));
    }
    return res;
}
Of course, you should not do that. A later version of `std::string` might implement the small string optimization, where strings below a certain size are not stored in an expensive-to-copy heap resource, but in the actual object itself. In that situation, it would be reasonable to implement move as a copy, which is allowed, and then this program would no longer do the same thing.
But this is a surprise. This is a result of the "unspecified value." And so while it may, strictly speaking, be "safe" to do things with a moved-from object other than destruct them or assign to them, in practice, without documentation to the contrary making stronger guarantees, the only way to get "not surprising" behavior is to greatly limit what you do with moved-from objects.
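For contrast, a sketch of the same chunking function in Rust (my own illustration): `std::mem::take` moves the accumulator out and leaves a documented, guaranteed-empty `String` behind, so there is no unspecified value to rely on by accident.

```rust
fn split_into_chunks(input: &str) -> Vec<String> {
    let mut res = Vec::new();
    let mut acc = String::new();
    let mut count = 0;
    for c in input.chars() {
        if count == 4 {
            // mem::take moves `acc` into the vector and leaves a
            // specified, documented empty String in its place.
            res.push(std::mem::take(&mut acc));
            count = 0;
        }
        acc.push(c);
        count += 1;
    }
    if !acc.is_empty() {
        res.push(acc);
    }
    res
}

fn main() {
    assert_eq!(
        split_into_chunks("abcdefghij"),
        vec!["abcd".to_string(), "efgh".to_string(), "ij".to_string()]
    );
}
```

Reading `acc` after an ordinary move, by contrast, would be a compile error rather than a guess about library internals.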
What about objects that aren’t safe to be used normally after being moved from?
They are buggy....
By this definition, `std::unique_ptr` should likely be considered buggy, as null pointers cannot be used "normally." Similarly for a `std::thread` object that does not represent a thread handle. It is only by stretching the definition of "used normally" to include these special "empty values" that `std::unique_ptr` gets to claim not to be buggy under that definition, although a null pointer simply cannot be used the way a normal pointer can.
Again, this attitude -- that a null pointer is a normal pointer, that an empty thread handle is a normal type of thread handle -- is adaptive for programming C++. But it will inevitably exist in a programmer's blind spot, as null pointers always have. The "not null" invariant is often expressed implicitly. Many uses of `std::unique_ptr` rely on the pointer never being null, and simply leave this up to the programmer to ensure.
Herb Sutter himself discusses this:
Since the problem is that we are not expressing the "not null" invariant, we should express that by construction — one way is to make the pointer member a `gsl::not_null<>` (see for example the Microsoft GSL implementation) which is copyable but not movable or default-constructible.
In a programming language with destructive moves, it would be possible to have a smart pointer that was both "non-null" and movable. If we need both movability and the ability to express this invariant in the type system, well, C++ cannot help us.
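Rust's `Box<T>` illustrates the point: it is an owning heap pointer with no null state, yet it moves freely, because the compiler statically ends the old binding's life at the move instead of leaving a null behind.

```rust
fn consume(b: Box<i32>) -> i32 {
    // Takes ownership; no null check is ever needed, because a
    // Box cannot be null.
    *b
}

fn main() {
    let b = Box::new(41);
    let b2 = b; // destructive move: `b` is statically dead from here on
    // Using `b` here would be a compile error, not a run-time nullptr.
    assert_eq!(consume(b2) + 1, 42);
}
```
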
But what about a third option, that the class intends (and documents) that you just shouldn’t call operator< on a moved-from object… that’s a hard-to-use class, but that doesn’t necessarily make it a buggy class, does it?
Yes, in my view it does make it a buggy class that shouldn’t pass code review.
But in a sense, this is exactly what std::unique_ptr
is. It has a
special state where you cannot call its most important operator, the
dereference operator. It only avoids being called buggy because it
expands this state so it can be arrived at by other means.
Again, everything Herb Sutter says is true in a strict sense. It is
memory-safe to use moved-from objects other than to destroy or
assign to them, even if the move operation makes no further guarantees.
It simply isn't safe in a broader sense, in that it will have surprising,
changeable behavior. It is true that the null pointer is a valid
value of std::unique_ptr
, but smart pointers that implement move
are forced to have such a value.
And therefore, it should not be surprising that these questions come up. The misconceptions that Herb Sutter is addressing are an unfortunate consequence of the dissonance between the strict semantics of the programming language, where his statements are true, and the practical implications of how these features are used and are intended to be used, where the situation is more complicated.
Moves in Rust
So the natural follow-up question is, how does Rust handle move semantics?
First off, as mentioned before, Rust makes a special case for types
that do not need move semantics, where the value itself contains all
the information necessary to represent it, where no heap allocations
or resources are managed by the value, types like i32
. These types
implement the special Copy
trait, because for these types, copying
is cheap, and is the default way to pass to functions or to handle
assignments:
fn foo(bar: i32) {
    // Implementation
}

fn main() {
    let var: i32 = 3;
    foo(var); // copy
    foo(var); // copy
    foo(var); // copy
}
For types that are not Copy
, such as String
, the default function
call uses move semantics. In Rust, when a variable is moved from, that
variable's lifetime ends early. The move replaces the destructor call
at the end of the block, at compile time, which means it's a compile
time error to write the equivalent code for String
:
fn foo(bar: String) {
    // Implementation
}

fn main() {
    let var: String = "Hi".to_string();
    foo(var); // Move
    foo(var); // Compile-Time Error
    foo(var); // Compile-Time Error
}
Copy is a trait, but one more entwined with the compiler than most traits. Unlike most traits, it has no methods to implement: a type can only opt in (typically via #[derive(Copy, Clone)]), and only if every one of its components is itself Copy and the type does not implement Drop. Types like Box, which manage a heap allocation, do not implement Copy, and therefore structs that contain a Box cannot implement it either.
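To illustrate (a minimal sketch, with a hypothetical Point type): a struct built only from Copy primitives can derive Copy, while the same derive on a Box-holding struct would be rejected at compile time.

```rust
// A struct of Copy primitives can opt into Copy by deriving it.
// (Point is a hypothetical type for illustration.)
#[derive(Copy, Clone, Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// A struct holding a Box could not: the derive would be rejected,
// because Box manages a heap allocation and is not Copy.
// #[derive(Copy, Clone)] // error: `Box<i32>` is not `Copy`
// struct Holder { b: Box<i32> }

fn consume(p: Point) -> i32 {
    p.x + p.y
}

fn main() {
    let p = Point { x: 1, y: 2 };
    // Copy types can be passed again and again; each call
    // receives a bitwise copy of the value.
    assert_eq!(consume(p), 3);
    assert_eq!(consume(p), 3);
}
```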
This is already an advantage to Rust. C++ pretends that all types are the
same, even though they require different usage patterns in practice. You
can pass a std::string
by copy just like an int
. Even if you have a
vector of vectors of strings, you can pass by copy and that's usually the
default way to pass it -- moves in many cases require explicit opt-in. For
int
it's a reasonable default, but for collections types it isn't,
and in Rust the programming language is designed accordingly.
If you want a deep copy, you can always explicitly ask for it with
.clone()
:
fn foo(bar: String) {
    // Implementation
}

fn main() {
    let var: String = "Hi".to_string();
    foo(var.clone()); // Copy
    foo(var.clone()); // Copy
    foo(var); // Move
}
What this actually does is create a clone, or a deep copy, and then
move the clone, as foo
takes its parameter by move, the default for
non-Copy
types.
What does a move in Rust actually entail? C++ implements moves with custom-written move constructors, which collections and other resource-managing types have to implement in addition to implementing copying (though automatic implementation is available if building out of other movable types). Rust requires implementations for clone, but for all moves, the implementation is the same: copy the memory in the value itself, and don't call the destructor on the original value. And in Rust, all types are movable with this exact implementation -- non-movable types don't exist (though non-movable values do). The bytes encode information -- such as a pointer -- about the resource that the value is managing, and they must accomplish that in the new location just as well as they did in the old location.
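This can be observed directly: moving a String relocates only its (pointer, length, capacity) triple, while the heap buffer it points to stays where it was. A small demonstration:

```rust
fn main() {
    let s = String::from("hello");
    let heap_addr = s.as_ptr(); // address of the heap buffer

    // The move copies the String's (pointer, length, capacity)
    // triple and stops treating `s` as live; the heap buffer
    // itself never moves.
    let t = s;
    assert_eq!(t.as_ptr(), heap_addr);
    assert_eq!(t, "hello");
}
```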
C++ can't do that, because in C++, the implementation of move has to mark the moved-from value as no longer containing the resource. How this marking works depends on the details of the type.
But even if C++ implemented destructive moves, some sort of "move constructor" or custom move implementation would still be required. C++, unlike Rust, does not require that the bytes contained in an object mean the same thing in any arbitrary location. The object could contain a reference to itself, or to part of itself, that would be invalidated by moving it. Or, there could be a data structure somewhere with a reference to it, that would need to be updated. C++ would have to give types an opportunity to address such things.
Safe Rust forbids these things. The lifetime of a value takes moves into
account; you can't move from a value unless there are no references
to it. And in safe Rust, there is no way for the user to create a
self-referential value (though the compiler can in its implementation
of async
-- but only if the value is already "pinned," which we will
discuss in a moment).
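A small example of the no-references rule (the rejected line is shown as a comment, since it would not compile):

```rust
fn main() {
    let s = String::from("hi");
    let r = &s; // a shared reference to `s`

    // While `r` is live, moving `s` is a compile-time error:
    // let t = s; // error[E0505]: cannot move out of `s` because it is borrowed

    println!("{r}"); // last use of the reference
    let t = s; // fine: no references to `s` remain live
    assert_eq!(t, "hi");
}
```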
But even in unsafe Rust, such things violate the principle of move.
Moving is always safe, and unsafe Rust is always responsible for keeping
safe code safe. As a result, Rust has a mechanism called "pinning" that
indicates, in the type system, that a particular value will never move
again, which can be used to implement self-referential values and which
is used in async
. The details are beyond the scope of this book,
but it does mean that Rust can avoid the issue of move semantics for
non-movable values without ruining the simplicity of its move semantics.
For these rare circumstances, the features of moving can be accomplished
by indirection, and using a Box
that points to a pinned value on
the heap. And there is nothing stopping such types from implementing a
custom function which effectively implements a custom move by consuming
the pinned value, and outputs a new value, which can then be pinned
in a different location. There is no need to muddy the built-in move
operation with such semantics.
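As a rough sketch of what pinning looks like in practice (Immovable is a hypothetical type, using PhantomPinned to opt out of Unpin the way a self-referential type would):

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// A hypothetical type that opts out of Unpin, declaring that it
// must never be moved once pinned -- as a self-referential type
// built by the async machinery would.
struct Immovable {
    data: String,
    _pin: PhantomPinned,
}

fn main() {
    // Box::pin heap-allocates the value and promises, in the type
    // system, that it will never move again.
    let pinned: Pin<Box<Immovable>> = Box::pin(Immovable {
        data: String::from("pinned"),
        _pin: PhantomPinned,
    });

    // Reading through the pin is fine...
    assert_eq!(pinned.data, "pinned");
    // ...but safe code cannot get an &mut Immovable or move the
    // value back out: Pin withholds those operations for !Unpin types.
}
```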
Practical Implications for C++ Programmers
So, obviously, in light of this book's argument, I recommend using Rust
over C++. For Rust users, I hope this clarifies why the move semantics
are the way they are, and why the Copy
trait exists and is so important.
But of course, not everyone has the choice of using Rust. There are a lot of large, mature C++ codebases that are well-tested and not going away anytime soon, and many programmers working on those codebases. For these programmers, here is some advice for the footgun that is C++ move semantics, both based on what we've discussed, and a few gotchas that were out of the scope of this post:
- Learn the difference between rvalue, lvalue, and forwarding references. Learn the rules for how passing by value works in modern C++. These topics are out of the scope of this book, but they are core parts of C++ move semantics and especially how overloading is handled in situations where moves are possible. Scott Meyers's Effective Modern C++ is an excellent resource.
- Move constructors and assignment operators should always be noexcept. Otherwise, std::vector and many other library utilities will simply ignore them. There is no warning for this.
- The only sane things to do with most moved-from objects are to immediately destroy them or reset their values. Comment about this in your code! If the class specifically defines that moved-from values are empty or null, note that in a comment too, so that programmers don't get the impression that there are any guarantees about moved-from values in general.
Conclusion
Move semantics are essential to the performance of modern C++. Without them, much of its standard library would become much more difficult to use. However, the specific design of moves in C++:
- is misaligned with the purpose of moving
- fails to eliminate all run-time cost
- surprises programmers, and
- forces designers of types to implement an "empty-yet-valid" state
Why, then, does C++ use such a definition? Well, C++ was not originally designed with move semantics in mind. Proposals to add destructive move do not interact well with the existing language semantics. One interesting blog post that I found even says, when following through on the consequences of adding destructive move semantics:
... if you try to statically detect such situations, you end up with Rust.
C++ has so many unsafe features and so many existing mechanisms, that this was deemed the most reasonable way to add move semantics to C++, harmful as it is.
And perhaps this decision was unnecessary. Perhaps there was a way -- perhaps there still is a way -- to add destructive moves to C++. But for right now, non-destructive moves are the ones the maintainers of C++ have decided on. And even if destructive moves were added, it's unlikely that they'd be as clean as the Rust version, and the existing non-destructive moves would still have to be supported for backwards-compatibility sake.
In any case, Rust has taken this opportunity to learn from existing programming languages, and to solve the same problems in a cleaner, more principled way. And so, for the move semantics as well as for the syntax, I recommend Rust over C++.
And to be clear, this still has very little to do with the safety features
of Rust. A more C++-style language with no unsafe
keyword and no safety
guarantees could have still gone the Rust way, or something similar to
it. Rust is not just a safer alternative to C++, but, as I continue to
argue, unsafe Rust is a better unsafe language than C++.
Entries
In this chapter, I will be discussing a specific data structure API: the Rust map API. Maps are often one of the more awkward parts of a collections library, and the Rust map API is top-notch, especially its entry API -- I literally squealed when I first learned about entries.
And as we shall discuss, this isn't just because Rust made better choices than other standard libraries when designing its map API. Even more so, it's because the Rust programming language provides features that better express the concepts involved in querying and mutating maps. Therefore, this chapter is properly included in this book: this discussion serves as a window into some deep differences between C++ and Rust that show why Rust is better.
And for this chapter, specifically, we'll also be discussing Java, so this will be a three-way comparison, between Java, C++ and Rust.
Reading from a Map
So, let's talk about map APIs. But before we get to Entry
and friends,
let's discuss something a little simpler: getting an item from a
map. Let's say we have a sorted map of strings to integers:
- In Java,
TreeMap<String, Integer>
- In C++,
std::map<std::string, int>
- In Rust,
BTreeMap<&str, i32>
Let's also say we have a string "foo"
, and want to know what integer
corresponds to it. Now, if we're always sure that the string we're
looking up is always in the map, then we know what we want: we want
to get an integer.
But what if we're not sure? There are plenty of situations where we want to read a value corresponding to the key -- or do something else when that key is not present. Maybe the value is a count, and an absent key means 0. Or maybe the absent key means that the user has made a typo, and needs to be informed. Or maybe the map is a cache, and the absent key means we need to read a file or query a database. In all of these cases, we need to know either the value, or the fact that the key is absent.
Let's see how this is handled in our three programming languages, and how fundamental design choices in these programming languages lead to such APIs.
Java get
a (Nullable) Reference
A long time ago, Java made an extreme choice in the name of simplicity:
It divided all values into a dichotomy of "primitives" and "objects."
Primitives are passed around by implicit copy, whereas objects are
aliased through many mutable references. Objects always have optionality
built in -- any object reference is automatically "nullable," which
means you can store the special sentinel/invalid value null
in it,
the interpretation of which varies wildly. Primitives are not optional
in this way.
Also for the sake of simplicity, and very relevantly to the topic at hand,
generics are only supported for object types, not primitives. That means
that map values can only ever be object types. And that means that our
map from strings to integers in Java doesn't use Java's primitive integer
type int
, but rather this special wrapper/adapter type Integer
,
which auto-casts to and from int
, and which, like any object type,
is managed through mutable, nullable references. (At this point, I
for one am beginning to suspect they missed the mark on their simplicity).
So what's that mean for our map? How do we find out what value
corresponds to "foo"
in our map, or else that there is none?
Well, the method for this is called get
, and that returns the
value in question if there is one. And when there isn't? Well,
Java here leverages nullability, and returns null
when there
is no value.
So we can write something like this:
Integer value = map.get("foo");
if (value == null) {
System.out.println("No value for foo");
} else {
int i_value = value;
System.out.println("Value for foo was: " + i_value);
}
So far, so good. But there are problems. And perhaps I'm missing some
-- now is a good time to take a second, look at the code, and try to
imagine in your mind what problems there may be with this system (you
know, besides the fact that I have to use i_
as improvised Hungarian
notation due to lack of support in Java for shadowing).
You have some? I'll now list what I've got.
Problem the first: The signature of get
doesn't really alert
us to the possibility of a value not being in a map. This is
the sort of "edge case" that programmers regularly forget to handle;
a programmer may know, due to their situation-specific knowledge,
that the key ought to be present, and forget to consider that the
key might not be.
Compilers of strongly typed languages generally work to ensure that
programmers don't miss edge cases like this, don't make simple "thinkos"
(typos but with thought)
or "stupid mistakes." How's Java hold up? Well, remember how we mentioned
that primitives can't be null
, but these wrapper types like Integer
are coercible to primitives? Well, this compiles without a word of
complaint from the compiler:
TreeMap<String, Integer> map = new TreeMap<String, Integer>();
map.put("foo", 3);
int foo = map.get("foo");
System.out.println("int foo: " + foo);
int bar = map.get("bar");
System.out.println("int bar: " + bar);
And what happens at run-time? Similar behavior to Rust's infamous
unwrap
function. The conversion from the nullable Integer
and the non-nullable int
crashes when the Integer
is in
fact null
:
int foo: 3
Exception in thread "main" java.lang.NullPointerException: Cannot invoke "java.lang.Integer.intValue()" because the return value of "java.util.TreeMap.get(Object)" is null
at test.main(test.java:12)
So you might try to fix this by querying if the key exists first:
TreeMap<String, Integer> map = new TreeMap<String, Integer>();
if (map.containsKey("bar")) {
int bar = map.get("bar");
System.out.println("int bar: " + bar);
} else {
System.out.println("bar not present");
}
But now we've reached problem the second. Unfortunately, even though
this looks like it addresses the issue, this won't prevent the crash
either. There is nothing stopping you from putting a null
into the map,
so this code also crashes given the right context:
TreeMap<String, Integer> map = new TreeMap<String, Integer>();
map.put("bar", null);
if (map.containsKey("bar")) {
int bar = map.get("bar");
System.out.println("int bar: " + bar);
} else {
System.out.println("bar not present");
}
So for a given key in a Java map, there are actually three possible situations:
- The key is absent.
- The key corresponds to an integer.
- The key corresponds to one of these special
null
-values.
get
can distinguish 2 from 1 and 3, but cannot distinguish between
1 and 3. containsKey
can distinguish 1 from 2 and 3, but cannot
distinguish 2 from 3. To distinguish all 3 scenarios, and handle
all the representable values, you need to call both get
and containsKey
:
if (map.containsKey("bar")) {
Integer bar = map.get("bar");
if (bar == null) {
System.out.println("bar present and null");
} else {
int i_bar = map.get("bar");
System.out.println("int bar: " + i_bar);
}
} else {
System.out.println("bar not present");
}
In addition to this precaution not being enforced by the compiler,
it leads to problem the third: We are now querying the map twice.
We are walking the tree twice with our containsKey
followed by
get
.
At this point, we find ourselves scrolling through the Map
methods
in Java's documentation, trying to find a more general solution. getOrDefault
might
help in some situations -- when there's a value that makes sense as the
default. compute
might be useful -- if we're OK with modifying
the map in the process.
But in general, nothing clean exists to tidy up these problems. And the blame lies squarely on Java's decision to make almost all types -- and all types that can be map values -- nullable.
But wait! -- you might object -- Can't we just maintain an invariant on the
map that it contains no null
values? If we have a map without null
values, all these issues -- well, many of these issues -- dry up.
And this is true. Maintaining such an invariant makes for a much cleaner situation. Pretend you aren't allowed to put nulls in maps, and arrange not to do it.
But, first off, maintaining an invariant like this is easier said than done. Programmers often do this sort of thing implicitly in their head, but it's much better to comment. Either way, you have to trust future programmers -- even future versions of the same programmers -- to know about the invariant, either by intuiting it (all too common) or by reading the relevant comment (which, even if there is one, might not happen). And you have to trust them to not intentionally violate the invariant, and also to not accidentally violate the invariant: Are they sure that all those values they add to the map can never be null?
And second off, somewhat shockingly, sometimes people do assign special
meanings to null
. I said before null
has a wide range of meanings,
and it's not uncommon to use null
to mean special things. Maybe
"not mapped" means "load from cache," but "null" means "there actually
is no value and we know it." Or maybe the opposite convention applies.
null
is frustratingly without intrinsic meaning.
For such situations, programmers should probably compose the map with other types or, better yet, write custom types that make the semantics of these situations abundantly clear. But let's not put all the blame on the programmers. If Java had really wanted to protect people from conflating these "not mapped" and "mapped to null" situations, Java maps shouldn't have made the distinction representable at all. It's bad programming language design to put features in a library that can only be abused, and it's bad understanding of human nature to then solely blame the programmers for misusing them.
C++: No Nulls No More
So now we move on to C++.
In C++, fewer types are nullable, and non-nullable types like int
can be used as the value type of a map. For our map, of type
std::map<std::string, int>
, we no longer have the trichotomy of
"key not present, value null, or value non-null," but the much more
reasonable dichotomy of either the key is present and there is an int
,
or it's absent and there isn't one.
This is, in my mind, the bare minimum a strongly typed language should be able to provide, but after the context of Java it's worth pointing out.
There are three (3) methods in C++ that look like they might be usable
as a get operation, an operation where we either get an int value
or learn that the key is absent:

- at
- operator[]
- find

See if you can identify which one is the right one to use.
Spoiler alert! It's find
, the one whose name superficially looks least like
it'll be the right one. at
throws an exception if the key is absent,
and operator[]
, the one with the most appealing name, is an eldritch
abomination which we'll discuss and condemn later.
But all well-deserved teasing aside, find
is much better than
Java's get
. It returns a special object -- an iterator -- that
can be easily tested to see whether we've found an int
, and easily
probed to extract the int
.
auto it = map.find(key);
if (it == map.end()) {
std::cout << key << " not present" << std::endl;
} else {
std::cout << key << " " << it->second << std::endl;
}
This is actually pretty good! The ->
operator also serves as a signal
to experienced C++ programmers that we're assuming that it
is valid:
generally ->
or *
means that the object being operated on is
"nullable" in some way.
So when a C++ programmer reads something like this, they have a little bit of warning that they're doing something that might crash:
int foo = map.find(key)->second;
And certainly, they have more warning than the Java programmer with the equivalent Java:
int foo = map.get(key);
Of course, this is awkward. find
returns an iterator, which isn't
exactly the type we'd expect for this "optional value" situation. And
to determine if the value isn't present, we compare it to map.end()
,
which is a weird value to compare it to. Nothing about what these things
are named is specifically intuitive, and people would be forgiven for
using the accursed operator[]
. map["foo"]
just looks like an
expression for doing boring map indexing, doesn't it?
And what does operator[]
do, if the key isn't present? It inserts the
key, with a default-constructed value. No configuration is possible of
what value gets inserted, short of defining a new type for the object
values. This is sometimes what you want -- like if your value type has a
good default (especially if you defined it yourself), or if you're about
to overwrite the value anyway. But in most cases, you want some other
behavior if the value is not present -- operator[]
doesn't really tell
you that it inserted the item, so if you need to make a network query
or read a file or print an error, you're out of luck. operator[]
,
as innocuous as it looks, has surprising behavior, and that is not good.
But all in all, as far as getting values goes, as far as querying the map goes, C++ is doing OK. Solid B result on this exam, I think. Decent work, C++. Especially since we just looked at Java.
The Rust Option
So now on to Rust: we want to query our BTreeMap<&str, i32>
.
(Or... it might be a BTreeMap<String, i32>
, depending on whether we
want to own the strings. This is a decision we also have to make in C++
(where we could have used string_view
s as the keys), but do not have
to make in Java. At least in Rust, we know that whichever decision we
make, we will not accidentally introduce undefined behavior. But that's
a distraction!)
So let's apply the same test to Rust as we've applied before.
Here, the method in
question is given an obvious name, get
rather than find
. So let's
see how it does in our test, of allowing us to read a value if present,
but know if not:
if let Some(val) = map.get(key) {
println!("{key}: {val}");
} else {
println!("{key} not present");
}
See, get
returns an Option
type. Therefore, unlike in C++, we can test
for the presence of the value and extract the value inside the same if
statement. Unlike in C++, the return value of get
isn't a map-specific
type, but rather the completely normal way to express a maybe-present
value in Rust. This means that if we want to implement defaulting, we
get that for free by using the Option
type in Rust, which implements
that already:
// Let's say missing keys means the count is 0:
let value = *map.get("foo").unwrap_or(&0);
Similarly, calling is_none()
or pattern-matching against None
is
much more ergonomic than comparing an iterator to map.end()
. It requires
some more intimate knowledge -- or some follow-up reading -- to learn that
the concept of "end of collection" and "not found" are for various reasons
combined into one in C++.
So while C++ avoids the problematic elements of Java maps, Rust does so
more ergonomically, because it has a well-established Option
type. C++
now has one as well, std::optional
, but it hasn't yet reached its
map
API, because std::optional was only added in C++17.
And Option
integrates even better than std::optional
with the programming
language, because Option
is just a garden-variety sum type, a
Rust enum
, which lets you do things like if let Some(x) = ...
,
and combine testing and unpacking in the same statement. C++ could
not design a map API this ergonomic, because they lack this fundamental
feature.
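Because Option is an ordinary enum, a plain match works just as well as the if-let form; a small sketch:

```rust
use std::collections::BTreeMap;

fn main() {
    let mut map: BTreeMap<&str, i32> = BTreeMap::new();
    map.insert("foo", 3);

    // Option is an ordinary enum, so a plain match combines the
    // presence test and the unpacking, just like if-let does:
    let msg = match map.get("foo") {
        Some(val) => format!("foo: {val}"),
        None => String::from("foo not present"),
    };
    assert_eq!(msg, "foo: 3");
}
```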
Also, unlike with null
in Java, if you want to use Option
as a meaningful distinction in your map, you still can. The get
function would then return Option<Option<...>>
instead of
just Option
-- the outer one representing presence, the inner one
representing whether the value was None
or Some(...)
. Option
is composable in a way that null
is not.
For the record, the Rust equivalent to operator[]
-- the Index
trait implementation on maps -- does the equivalent to C++ at
, and
panics if the key isn't present. While not as generally useful as get
,
I think this is a reasonable interpretation of what map["foo"]
should
mean.
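A quick sketch of that behavior (the panicking line is commented out):

```rust
use std::collections::BTreeMap;

fn main() {
    let mut map: BTreeMap<&str, i32> = BTreeMap::new();
    map.insert("foo", 3);

    // Indexing a present key behaves like C++'s at():
    assert_eq!(map["foo"], 3);
    // Indexing an absent key panics rather than inserting:
    // let x = map["bar"]; // panics at run time
}
```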
Mutation Station
So Rust wins, I'd say pretty handily, when comparing how to access a value from a map, how to query them. But where Rust truly shines is when mutating a map. For mutation, I'm going to approach the discussion differently. I'm going to start by specifying what use cases might exist, and then, in that context, we can discuss how an API might be built.
The mutation situation has a similar dilemma to querying: the key in question might or might not already be in the map. And, for example, we often want to change the value if the key is present, and insert a fresh value if the key is absent.
Of course, we could always check if the key is present first, and then do something different in these two scenarios. But that has the same problem we already discussed for querying: We then have to iterate the tree twice, or hash the key twice, or in general traverse the container twice:
auto it = map.find(key); // first traversal
if (it != map.end()) {
return it->second;
} else {
int res = load_from_file(key);
map.insert(std::pair{key, res}); // second traversal
return res;
}
So what should we do for our API for this scenario, where we want to change the value if the key is present, and insert a fresh value if the key is absent?
Well, sometimes that fresh value is a default value,
like if we're counting and the key is the thing we're counting -- in that
case, we can always insert 0. In that case, C++'s operator[]
-- when
combined with an appropriate default constructor -- can actually
work well.
And sometimes, that fresh value depends on the key, like if the value is a
more complicated record of many data points about the item in question.
If the value is a sophisticated OOP-style "object," and the key indexes
one of the fields also contained in the value, C++'s operator[]
would
not work. The default value is a function of the key.
And sometimes, there isn't a default value per se. Sometimes, if the key is absent, we need to do additional work to find out what value should be inserted. This is the case if the map is a cache of some database, accessed via IPC or file or even Internet. In that situation, we only want to send a query if the key is not present. We would not be able to accomplish our goals simply by providing a default value when sending the mutation operation.
C++ doesn't have anything for us here. operator[]
is pretty much its most sophisticated "query-and-mutate"
operation. Java, somewhat surprisingly, does have something relevant,
compute
.
This handles all of these situations, with a relatively unergonomic
callback function -- and as long as your map never contains null
s.
Rust's solution, however, is to create a value that encapsulates
being at a key in the map that might or might not have a value
associated with it, a value of the
Entry
type.
As long as you have that value, the borrow checker prevents you from
modifying the map and potentially invalidating it. And as long
as you have it, you can query which situation you're in -- the
missing key or the present key. You can update a present key. You can
compute a default for the missing key, either by providing the value or
providing a function to generate it. There are many options, and you can
read all of them in the Entry
documentation; the world is your oyster.
So the C++ code above can be ergonomically expressed as something like this in Rust:
let entry = map.entry(key.to_string());
*entry.or_insert_with(|| load_from_file(key))
And the idiom where we're counting something could be expressed something like:
map.entry(string)
.and_modify(|v| *v += 1)
.or_insert(1);
So we get this nice little program that counts how many times we use different command line arguments:
use std::collections::BTreeMap;
use std::env;
fn count_strings(strings: Vec<String>) -> BTreeMap<String, u32> {
let mut map = BTreeMap::new();
for string in strings {
map.entry(string)
.and_modify(|v| *v += 1)
.or_insert(1);
}
map
}
fn main() {
for (string, count) in count_strings(env::args().collect()) {
println!("{string} shows up {count} times");
}
}
Conclusion
So first off, Entry
s are super nice, and neither Java nor C++ has
anything anywhere near as nice. Even when it comes to just querying,
Rust's get
is much better than Java's get
, and a little more ergonomic
than C++'s find
.
But this isn't an accident. This isn't just about Rust's map API having
a nice touch. When we look at the definition of Entry
,
we see things that Java and C++ can't do:
pub enum Entry<'a, K, V>
where
K: 'a,
V: 'a,
{
Vacant(VacantEntry<'a, K, V>),
Occupied(OccupiedEntry<'a, K, V>),
}
First, this is an enum
: There are two options, and in each option,
there's additional information. Of course, Java and C++ can express
a dichotomy between two options, but it's a lot clumsier. Either you'd
have to use a class hierarchy, or std::variant
, or something else. In
Rust, this is as easy as pie, and since it does it the easy way, you can
not only use the various combinator methods in Rust, you can also use
Entry
s with a good old-fashioned match
or if let
to distinguish
between the Vacant
and Occupied
situation.
Second, there's a little lifetime annotation there: 'a
. This is
an indication that while you have an Entry
into a map, Rust won't
let you change it. Now, Java and C++ also have iterators, and you must
not change a map while holding one, but in both those languages, you
have to enforce that constraint yourself.
In Rust, the compiler can enforce it for you, making Entry
s
impossible to use wrong in this way.
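A minimal sketch of this constraint (the conflicting line is commented out, since the borrow checker would reject it):

```rust
use std::collections::BTreeMap;

fn main() {
    let mut map: BTreeMap<String, i32> = BTreeMap::new();
    let entry = map.entry(String::from("foo"));

    // While `entry` lives, it mutably borrows the map, so touching
    // the map here is a compile-time error:
    // map.insert(String::from("bar"), 1); // error[E0499]

    *entry.or_insert(0) += 1; // consumes the entry
    map.insert(String::from("bar"), 1); // fine: the borrow has ended
    assert_eq!(map["foo"], 1);
}
```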
Without both of these features, Entry
would not have been an obvious API
to create. It would've been barely possible. But Rust's feature set encourages
things like Entry
, which is yet another reason to prefer Rust over C++
(and Java): Rust has enum
s (and lifetimes) and uses them to good effect.
Addendum
I wanted to address a few points that people have raised in comments since I posted this.
Some people have pointed out that C++ has insert_or_assign
,
but in spite of the promising name, it just unconditionally sets a key
to be associated with a value, whether or not it previously
was. This is not the same as behaving differently based on
whether a value previously existed, and it is therefore not
relevant to our discussion.
More interestingly, it has been pointed out to me that with
the return value of insert
, you can tell whether the insert
actually inserted anything, and also get an iterator to the entry
that existed before if it didn't. This allows implementing some, but not
all, of the patterns of Entry
without traversing the map twice.
For example, counting:
#include <iostream>
#include <map>
#include <string>
#include <vector>

int main(int argc, char **argv) {
    std::vector<std::string> args{argv, argv + argc};
    std::map<std::string, int> counts;
    for (const auto &arg : args) {
        counts.insert(std::pair{arg, 0}).first->second += 1;
    }
    for (const auto &pair : counts) {
        std::cout << pair.first << ": " << pair.second << std::endl;
    }
    return 0;
}
This works, but is much less clear and ergonomic than the Entry
-based
API. But perhaps more importantly, this functionality is much more
constrained than Entry
, and is equivalent to using Entry
with just
or_insert
, and never using any of the other methods. As another
commentator pointed out, counting is possible with just or_insert
:
*map.entry(key).or_insert(0) += 1
But counting is just one example. C++'s insert
is still deeply
limited. Using C++'s insert
means you have to know a priori what
value you would be inserting. You can't use it to notice that a key is
missing and then go off and do other work to figure out what the value
should be. So you can't do my load_from_file
example.
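My load_from_file example isn't reproduced here, but its shape -- notice that the key is missing, then go compute the value -- is exactly what or_insert_with covers. A sketch, with expensive_load as a hypothetical stand-in for the real file read:

```rust
use std::collections::HashMap;

// Hypothetical stand-in for an expensive load we only want to run
// when the key is actually missing.
fn expensive_load(key: &str) -> String {
    format!("loaded:{}", key)
}

fn get_or_load<'a>(cache: &'a mut HashMap<String, String>, key: &str) -> &'a String {
    cache
        .entry(key.to_string())
        // The closure runs only on the Vacant arm, so the map is
        // traversed once and the load happens lazily.
        .or_insert_with(|| expensive_load(key))
}

fn main() {
    let mut cache = HashMap::new();
    println!("{}", get_or_load(&mut cache, "config")); // loaded:config
}
```

No sentinel value ever enters the map, and if the closure panics, nothing is inserted at all, which sidesteps the exception-safety problem that the sentinel approach has.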
In order to do the load_from_file
example in C++, even with this use of
insert
, you would have to temporarily insert some sentinel value in the
map -- and that goes against how strongly typed languages ought to work,
in addition to breaking the C++ concept of exception safety.
This is, as was pointed out in another comment, exactly what C++ programmers sometimes have to do, to meet performance goals, at the expense of clarity and simplicity, and therefore, especially in C++, at the expense of confidence in safety and correctness.
Safety and Performance
There is a persistent and persnickety little argument that I want to talk specifically about. This argument is really persuasive on its face, and so I think it deserves some attention -- especially since I am guilty of having used this argument myself, many years ago when I still worked at an HFT firm, to claim that C++ had a niche that Rust wasn't ready for. I've also seen it a few times in a row in the wild, and it's gotten me worked up enough that I simply had to write this; as a result, it's a little more emotional than some of the other posts.
In this argument, array indexing stands in for a number of little features. But -- I've seen array indexing cited so often as a canonical example that I feel compelled to address it directly!
The argument goes like this:
In Rust, array accesses are checked. Every time you write
arr[i]
, there is an extra prepended if i >= arr.len() { panic!(..) }
. As you can see, that is more code, and worse, a run-time check. And while the optimizer might eliminate it, or the branch predictor may well predict it right every time, the extra code bloat and possible run-time check, is just unacceptable in [insert field here (I used HFT)], where every nanosecond matters. And until some acceptable solution is found to this, I just don't see Rust making it in [insert field].
When I made this argument, to a group of programming-language academics, the defenders of Rust countered with a number of points, all of which accepted the basic premise:
- Do I really need those extra nanoseconds? Yes.
- Is it really too much of a price to pay for all that extra safety? Yes.
- Do I really distrust the optimizer that much? Yes. If only Rust had a way to do optimizer assertions, a way to statically verify that the panic had been optimized out.
- Would dependent typing on integer values help? Yes. That sounds very promising. I think Rust will get there someday, but for right now we must use C++.
Now that I know more about Rust I'm happy to tell you that I was completely off base. I wasn't off base about the performance considerations, or the unacceptability of even the slightest risk of a run-time check. I was off base about an even more basic premise: that Rust uses checked array indexing, whereas C++ uses unchecked array indexing.
But wait! Isn't that the whole point? Doesn't C++ avoid checking everything, to make sure all abstractions are zero-cost, to be blazing fast? Doesn't Rust, while trying for performance, in the end always concede to the demands of safety?
Well, let's look at the APIs in question. C++ apologists are always
saying to use the modern C++ features from C++11 and later,
rather than the more C-like "old style" C++ features, so on the
C++ side let's take a look at the
documentation
for std::array
, introduced in C++11.
Here we see two indexing methods. The first one, at
, is bounds
checked and will throw an exception if the index is out of bounds,
whereas the second one, operator[]
, is not, and will instead exhibit
undefined behavior of a very difficult-to-debug nature. It looks like C++
actually believes in free choice here, leaving the choice of method up
to the user. Not quite what we supposed, but the important part is that
unchecked indexing is available, so, for now, the argument can still stand.
Now let's look at Rust. Rust arrays and vectors can also be used with
methods from slice,
as can slices, so the slice documentation is the best place to look.
And looking there, we immediately see -- drum roll please -- 4 methods. We
see get
and get_mut
, which are checked, and right underneath them,
in alphabetical order, get_unchecked
and get_unchecked_mut
, which
are not.
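Side by side, the checked and unchecked spellings look like this (a minimal sketch; the unsafe one carries the same in-bounds contract as C++'s operator[], and get_unchecked_mut is just the mutable twin):

```rust
fn main() {
    let v = vec![10, 20, 30];

    // Checked: returns an Option instead of risking undefined behavior.
    assert_eq!(v.get(1), Some(&20));
    assert_eq!(v.get(99), None);

    // Checked, mutable.
    let mut w = v.clone();
    if let Some(x) = w.get_mut(0) {
        *x += 1;
    }
    assert_eq!(w[0], 11);

    // Unchecked: the caller promises the index is in bounds.
    let second = unsafe { *v.get_unchecked(1) };
    assert_eq!(second, 20);
}
```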
To review, where do Rust and C++, these programming languages with their vastly different philosophies, Rust for the cautious, C++ for the fast and bold, stand? In the exact same place. Both programming languages have both checked and unchecked indexing.
Let me say that again. This is the talking point form, what to say if you need something quick to say, if you're ever debating programming languages on a political-style talk show (or at a party or even a job interview):
In both Rust and C++, there is a method for checked array indexing, and a method for unchecked array indexing. The languages actually agree on this issue. They only disagree about which version gets to be spelled with brackets.
The difference is simply in the default, which one gets
that old fashioned arr[index]
syntax. And even that can be
changed.
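Changed, for instance, by wrapping the container in a newtype and implementing Index yourself. This Unchecked wrapper is purely hypothetical, a sketch of giving the brackets the C++ meaning:

```rust
use std::ops::Index;

// Hypothetical newtype: square brackets do *unchecked* indexing.
struct Unchecked<T>(Vec<T>);

impl<T> Index<usize> for Unchecked<T> {
    type Output = T;
    fn index(&self, i: usize) -> &T {
        // SAFETY: none, really. The caller takes on C++-style
        // responsibility for keeping i in bounds.
        unsafe { self.0.get_unchecked(i) }
    }
}

fn main() {
    let v = Unchecked(vec![1, 2, 3]);
    println!("{}", v[2]); // 3
}
```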
Even if the C++ default were superior -- and, as I will argue later,
it is not -- this is surely a minor issue. After all, don't we normally
use our fancy for x in arr
syntax in Rust? This issue is just so small
as to be unlikely to be a deciding factor in what programming language
is better, even if we're in a special application domain where every
nanosecond matters.
The Unsafe Keyword
So that's a wrap, folks. We can all go home, and none of us will ever see this extremely silly argument on the Internet or in person again. It's just a misunderstanding, the person making it was simply misinformed, and all it will take is a link to this blog post -- or the relevant method in the docs -- to set them straight.
But wait! The C++ apologists are still talking! What are they saying? How have they not been completely flummoxed? They're pointing at that method, chanting a word like a slogan at a protest march. I can't quite make it out -- what is it?
Oh. They're chanting unsafe
. And credit where credit is due:
it's very difficult to chant in a monospace font.
Well, that is easy to respond to! The nerve, that C++ programmers would call our unchecked array indexing method unsafe. For one, all unchecked array indexing methods are unsafe: that's what unchecked means. If it were safe, it would be at least statically checked. For another, isn't this the pot calling the kettle black? Isn't C++ all about unsafety, so much so that C++ programmers don't even mark their unsafe code regions because it all is, or their unsafe functions because they all are?
"But isn't that the whole point of Rust?" they cry. "If you have to
use unsafe
to write good Rust, then Rust isn't a safe language
after all! It's a cute effort, but it's failing at its purpose!
Might as well use C++ like a Real Programmer!"
This, my friends, is a straw man. No, the point of Rust and specifically Rust's memory safety features is not to create an entirely safe programming language that can't be circumvented in any circumstance; you must be thinking of Sing#, the programming language for Microsoft's defunct research OS.
Let me be abundantly clear: The point of memory safety, the unsafe keyword, and friends in Rust is not to completely enforce memory safety, to make it impossible for the programmer to do anything they want to with the computer, even if they can't prove to the compiler that it's OK. In fact, the point of memory safety isn't to make it impossible to do anything at all -- it's to make it possible to reason about the program.
The premise of Rust is that the vast majority of code in a systems program doesn't need to be unsafe, and so it might as well be safe. People used to believe that you needed garbage collection for safety, but Rust proved that you could use lifetimes to get safety without that performance cost. Now that we're there, why worry about null pointers? Why not tell the compiler which things can be null, and which things can't, so the compiler can check for you whether you're handling nulls correctly? I programmed C++ professionally for years without such a feature, and chasing null-pointer bugs cost us real debugging time. You'd better believe I would have totally annotated the crap out of that code so the compiler could've caught them ahead of time.
Sometimes, C++ apologists cite valgrind. I've had codebases where
I tried to use valgrind
. Unfortunately, there was so much undefined
behavior, and so many memory leaks, already caked into this project that new
ones were simply impossible to see among all the noise. An army
of junior engineers was at some point required to clean this up
when finally the hierarchy decided that valgrind was something we
might want to be able to use in the future.
And a lot of those undefined behaviors were ticking time bombs.
Certainly, this codebase had its issues. A friend of mine took days to
find a bug where a pointer had a value of 7. I don't mean 7 elements into
some array, nor 7 in units of the relatively wide pointee type, nor some
convenient, testable-for-NULL value. No, none of that: the pointer's value
was exactly 0x7.
I've had memory corruption issues where I pored over every line of code that I wrote, over and over again, finding nothing. Ultimately, I learned that the issue was in framework code -- code written by my boss's boss. The code was untested, written extremely poorly, and had rotted to the point where it didn't work at all. In Rust, I might have had some idea that my code -- which in Rust could all have been "safe" -- couldn't possibly be the source of the problem. Maybe my humble assumption that my code was to blame would have been a little less tenable.
If I wanted a language that was always safe, at the time I knew Java
or Python existed. Some companies even do finance in Java, for exactly
that reason. But sometimes you still need that extra bit of performance.
unsafe
is sometimes necessary.
But given what gains safe Rust has made in predictable performance, it's not as necessary as it used to be. The majority of the code I wrote then could've been written in safe Rust, and not lost a single clock cycle. The parts that needed to be unsafe could have been isolated, delegated to specific sections, wrapped in abstract data types, perhaps entrusted to a specific team.
And even then, I'm sure we would have been debugging memory corruption issues. But we'd know where to look. We'd know where to throw the tests. And we'd have saved programmer-years of time, days if not months of my life.
Now, I'm proud of my C++ skills. There is some part of me that wishes that C++ was better than Rust, that all that time getting better at debugging memory corruption wasn't dedicated to a skill that is becoming obsolescent through better technology. And to be honest, that's part of why I dismissed Rust as a candidate for HFT programming languages.
But it's possible to be proud of a skill that is also becoming obsolete.
And I am trying to replace it with a new skill to be proud of -- writing
Rust as performant as idiomatic C++, or even more performant, while
reaching for the unsafe
keyword rarely and modularly. I think it's truly
possible, for where it's relevant.
Now I must turn to a subset of C++ apologists, who write using "modern C++" which is "very safe now" and who therefore experience no memory corruption issues. To them I say: you are not doing high-performance programming. If you were, you'd have to do some wonky things with pointers to spell the bespoke high-performance constructs you'd need.
There is indeed a safe subset of C++ heavy with modern features. If
you are disciplined and keep your programming in that realm, you can
mostly avoid memory corruption. But this safe subset covers fewer
high-performance features than Rust's does. I've read some of this code and its
idioms: It's full of shared_ptr
s not to share ownership but simply to
avoid types that might be invalidated. It ironically leans on reference
counting more than idiomatic Rust. This is among other, similar problems.
Let me be clear: First off, instead of keeping in your brain which features are "modern" and which are "edgy," why not have a distinction that's well-marked in the code? Second off, if you are writing entirely in this safe subset of C++, you can get much better performance out of the safe subset of Rust. You have no right to complain about Rust's safety trade-offs, as you're using a worse set of trade-offs, where you get no safety promises from the compiler and none of Rust's surprising safe performance.
Rust's safe and "slow" subset is faster than C++'s while still being, obviously, safer. Rust's unsafe subset is better factored and better distinguished. Comparing apples to apples, Rust is a better programming language for extracting performance out of LLVM, because you'll be able to code more often without fear, and with very focused fear when you do feel it.
A tool is even more useful if you can adjust it. The defenders
of C++ talk about choosing trade-offs, but really, Rust offers both
trade-offs. Mark your code as unsafe
and convince yourself of its
safety manually, or rely on programming language features. It's up to
you, on a function-by-function, even block-by-block, basis. In C++,
if you have a problem, every line of code is suspect; you simply
can't opt in to safety, but in Rust, for where you don't need the
performance of unchecked indexing and other unsafe features, you can
relax about the possibility of going bankrupt due to inadvertent memory
reinterpretation --
and how do I wish my NDA permitted me to talk about consequences at my own
previous jobs!
And for where you do need to use unsafe
, you can make sure your
debugging and overthinking efforts are well-directed, for the few places
in a large project you need it.
Unchecked Indices
This has gotten a little far from the original question. Should array indices be checked? Well, let me be clear about two facts that are both true, but in tension with each other:
- Unchecked array indexing is sometimes absolutely necessary
- Unchecked array indexing is an edge-case feature, which you normally don't want.
If unchecked array indexing were unavailable in Rust, that would be a bug.
What is not a bug is making it inconvenient. C++ programmers probably
should be using at
instead of operator[]
more often. But in C++,
what would it gain? There's so many unsafe features, what's the cost
of one more?
But in Rust, where so much code can be written that's completely safe, defaulting to the safe version makes more sense. Lack of safety is a cost too, and Rust makes that cost explicit. Isn't that the goal of C++, making costs explicit?
Let's look at situations where you are indexing memory. First off, most
of them I saw were in old C-style for
-loops, where you loop over an
index rather than using iterators directly with a collection. Both Rust
and C++ have safe versions of for
that loop over collections with
iterators, and those use the same check for the loop as they do for
bounds, so those are easy enough to address. Nevertheless, I think that
a lot of the noise about checked vs. unchecked array accesses comes from
people who use indexing for their for
-loops instead of iterators,
and therefore mistakenly think that array indexing in general is a
far more common operation than it is.
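To make the contrast concrete, here is a sketch of the two loop styles; the functions are my own toy examples, not from any library:

```rust
// C-style indexed loop: every data[i] is a bounds-checked access.
fn sum_indexed(data: &[i32]) -> i32 {
    let mut total = 0;
    for i in 0..data.len() {
        total += data[i];
    }
    total
}

// Iterator loop: the bounds check *is* the loop condition, so there
// is no separate per-element check to fret about.
fn sum_iter(data: &[i32]) -> i32 {
    let mut total = 0;
    for x in data {
        total += x;
    }
    total
}

fn main() {
    let data = [1, 2, 3, 4];
    assert_eq!(sum_indexed(&data), sum_iter(&data));
}
```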
For the remaining situations, most are implementing either gnarly business logic, or a subtle, fast algorithm.
If it's gnarly business logic, in my experience, it's usually at config time -- along with a good third to a half, or even more, of the code in a complicated production system.
What do I mean by config time? A running high-performance system, whether optimized for latency or throughput, has a bunch of data structures organized just so, a lot of threads set up just right to move data between them in the perfect rhythm, and a lot of the work is in arranging them. That work is generally not performance-sensitive, but often has to be in the same programming language as the performance-intensive stuff.
Config-time is, depending on how you look at it, less of a thing or the entire thing in a programming language like Python. Python basically exists to do config-time programming for performance-intensive code put in very comprehensive "libraries" written in C or C++. But in C++, where you have a constructor that runs only once or a few times at first, and other methods related to it, in the same programming language as the money-making do-it part, you have to really adjust programming style between them.
Config-time is obviously when you read the configuration files.
It's where you open the relevant files. It's where you call socket
and bind
and listen
on your listening port. It's where you spin up
your worker threads, and make computations on how many worker threads
there are. It's where you construct your objects and your object pools.
It's where you memory map your log file. It's where you set your process
priorities. It's where you recursively call the constructors and init
functions of every object in your overwrought OOP hierarchy.
There is no need to sacrifice safety for performance at config time -- especially since undefined behavior might lie latent and destabilize the system once it's actually up and running. If you do an unchecked array access at config time, you might put garbage data in an important field, maybe one that determines how much money you're willing to risk that day or how many of a thing to buy. And for what? To save a few nanoseconds before your process has even "gone live"?
So, when do you truly need unchecked array accesses? If it's a subtle
fast algorithm, probably deep in an inner loop, you should probably be
wrapping it in an abstraction anyway. The code that actually executes the
algorithm should be separate from the business logic, so that programmers
trying to maintain the business logic don't accidentally break it. And
that's exactly where it makes the most sense to use unsafe
-- when
implementing a special algorithm. Maybe the proof that the index is
within bounds relies upon some number theory the compiler was never going
to understand without its own proof engine: great! You should probably
be explaining that in a comment in C++ anyway, and so the conventional
comment that goes with the unsafe
block in Rust is a perfect place to
explain it.
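A sketch of that shape, with a trivially checkable invariant standing in for the number theory (sum_every_other is my own toy example, not a real algorithm from anywhere):

```rust
fn sum_every_other(data: &[u64]) -> u64 {
    let mut total = 0;
    let mut i = 0;
    while i < data.len() {
        // SAFETY: the loop condition above guarantees i < data.len(),
        // so this access is in bounds. This comment is where a subtler
        // proof would go.
        total += unsafe { *data.get_unchecked(i) };
        i += 2;
    }
    total
}

fn main() {
    assert_eq!(sum_every_other(&[1, 2, 3, 4, 5]), 9); // 1 + 3 + 5
}
```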
But maybe I'm wrong about all of this. Maybe your experience hasn't
matched mine. Maybe your particular application needs to make unchecked
array accesses a lot, needs them to be unchecked, and needs them littered
all over the codebase. I raise my eyebrows at you, suspect you need more
iterators and perhaps other abstractions, and wonder what problem you're
trying to solve. But even if you're absolutely right, I think it's still
a better idea to write Rust littered with unsafe
every time you index
an array, than to write C++.
Because, as I keep emphasizing, Rust is still a better unsafe programming language than C++. It would be better than C++ even if safety weren't a feature.
Post-Script: Some Perspective for the New Rustacean
I understand where this straw man argument comes from. The word
unsafe
is scary, and advice, especially aimed at people coming
from safe languages like Python and JavaScript, is to avoid unsafe
features while learning. And while I think adding unsafe
to production
code should only be done once you've exhausted safe possibilities -- which
requires full understanding of safe possibilities -- this advice can
feel overbearing for a transitioning C++ programmer, especially when
it is immediately obvious that the safe features are very constrained
and can't literally do everything.
For that good-faith recovering C++ programmer, new to Rust: You're
right. The safe subset isn't enough to do everything you want to
do. And when it doesn't, that doesn't mean it failed. Its goal is to
make unsafe code rare, not non-existent. But it might surprise you
how rarely you truly need unsafe
. And a good resource for you
might be, as it was for me, the excellent Learn Rust the Dangerous
Way by Cliff L. Biffle.
For what it's worth, however, this criticism of Rust in general is often
levelled either in bad faith, or from a misunderstanding of what the
unsafe
keyword is for. For all the philosophical discussion of what
unsafe
truly means -- and how it interacts with the surrounding
module and encapsulation/privacy boundaries -- as well as principled
conventions for using it, please see the
Rustonomicon, the canonical
book on unsafe Rust, the same way the book
is canonical for introducing Rust.
Other criticisms of Rust from an HFT or low-latency point of view
are more relevant. Most specifically, gcc
and icc
are much better
compilers for those use cases -- empirically -- than is LLVM. Also,
the large codebases existing in C++ are often tested and contain
thousands upon thousands of programmer-years of optimizations and
bugfixes, where even small compiler upgrades are scrutinized closely
for performance regressions. Migrating to another programming language
from that starting point would be prohibitively expensive.
None of which is to say that if Rust gradually replaced C++ altogether, eventually such ultra-optimizing compilers and ultra-optimized codebases wouldn't start appearing in Rust. I hope to see that day within my lifetime.
Common Complaints about Rust
Before I get into the specific topics, though, I'd like to clear up a few talking points I've seen in the discourse. Some of these seem a little silly to me, but they're not exaggerations.
Rust fans all want to rewrite everything in Rust immediately as a panacea.
Unfortunately, we have a vocal minority who do! But most Rust developers have a much more moderate perspective. Most are aware that a large project cannot and should not be rewritten lightly.
Rust is a fad
Rust might be popular right now, but it also is already a part of a lot of critical infrastructure, and is currently being put in the Linux kernel.
Rust fans are all young and naive people with little real-life programming experience
Sure, there's fans of all languages who are like that.
But there's also Bryan Cantrill embracing Rust after a lifetime of disliking C++. There's Linus Torvalds allowing it in the kernel, after his very vocal anti-C++ statements.
There's a long tradition of people who like C, but don't like C++. See the Frequently Questioned Answers for a taste. Their criticisms are, in many cases, legitimate. This is unfortunate, because C has such limited capacity for abstraction, and so they're missing out on all the abstractive power that a higher-level language can provide.
For some of these people, Rust addresses the most important criticisms.
Personally, I agreed with many of those criticisms, but was a professional C++ programmer for 5 years, working in positions where the zero-cost abstractions were absolutely necessary. I wasn't in a position to choose programming languages at the company where I was, but I agreed with the choice of C++ over C, in spite of what in my mind were the clear costs of overcomplexity. I enjoy Rust now because it addresses many of those problems.
But you must at least admit that Rust fans are annoying.
Many of the ones that annoy you, annoy me too, if not more.
I also know that I've annoyed C++ fans by advocating for Rust on the C++ subreddit. My post was taken down by moderators as irrelevant to C++, but how could you be more relevant than a thorough critique?
Rust will be in the same place as C++ in 40 years. Rust will end up as convoluted as C++ is now.
This might just be true, but I don't get why it's used as an anti-Rust argument. If there needs to be a new systems language in another 30 or 40 years that reboots Rust like Rust is rebooting C++, I don't see that as a failure. I certainly don't see that as a reason not to use Rust now. And when the new programming language comes around to out-Rust Rust, I'll advocate switching to that too.
That would just mean that programming languages are subject to entropy and obsolescence like everything else. And in that case, C++ will just continue to get worse in the meantime too, so Rust will be better than C++ the entire time. If all programming languages accrue cruft as they age, in what world is that a reason to use the cruftier programming language? Isn't that a reason to use the newest appropriate programming language?
Most Rustaceans are not, despite the stereotype, treating Rust as some apocalyptic, messianic programming language to end all programming languages. The goal isn't to have an eternally good programming language; programming languages are tools. We switch to better tools when it is practical to do so. The question is: What should new projects be written in now? When a rewrite is called for (as it sometimes is), should it include a new programming language now that there is a viable alternative?
I suspect that many making this argument are including an unstated assumption -- that C++'s cruft is actually a sign of its maturity, and fitness for production use. Alternatively, and a little more charitably, they might assume that Rust isn't ready for production use yet, and by the time it is, it will be just as crufty as C++, perhaps converging to the same level of cruft. But while there are a few categories where Rust lags C++, they are mistaken in the big picture. For the vast majority of C++ projects, Rust is already the better option if the project had to be rewritten from scratch (a big "if," but irrelevant to the merits of the programming languages).
But also: Maybe Rust will be able to avoid some of C++'s mistakes; it's certainly trying to.
No programming language is better than another; there's simply different tools for different jobs.
I have a hard time taking this line of argument very seriously, and yet it comes up a lot. Some tools are the right tool for almost no job. Some tools are just worse than other tools. No one uses VHS tapes anymore; there's no job for which they're the right tool.
Programming languages are technology. Some technologies simply dominate others. There are currently still some things that the C++ ecosystem has that Rust doesn't yet: I'm thinking about the GUI library space, and gcc support. Also, C++ has undeniably better interoperability with C, which is relevant.
But those things might change. There is no natural reason why C++ and Rust would be on equal footing, or why Rust wouldn't at some point in the future be better than C++ at literally every single thing besides support for legacy C++ codebases. Some tools are simply better than others. No one's writing new production code in COBOL anymore; it's a bad tool for a new project.
C++ undefined behavior is avoidable if you're actually good at C++/if you just try harder and learn the job skills. You just have to use established best practices and a lot of problems go away.
First off, my experience working at a low-latency C++ shop shows that that's not true. Avoiding undefined behavior in high-performance C++ is extremely hard. It's hard in Rust too, but at least Rust gives you tools to manage this risk explicitly. If you're avoiding memory corruption errors in C++, you've either found a safe subset, or you're coding easy problems, or likely both.
But even if there are use cases where this is true, to me that means that an experienced C++ programmer can be just as good at avoiding undefined behavior as a novice Rust programmer. So what does this mean for a business considering whether to use C++ or Rust? In C++ everything a junior programmer writes requires more scrutiny from senior programmers. Everyone requires more training and more time to do things correctly. It's not that good of a selling point.
Similarly, using best practices makes it sound easy, or at least achievable. But the more complicated and arcane best practices are, and the higher the stakes of following them, the higher the cognitive load on the programmers, and again, the more you need senior programmers to look over everyone else's work or even do parts of the work themselves.
When we've been doing something the hard way for a long time, and it's successful, it's tempting to see other people struggling and to tell them it's not that hard, that they can just up their knowledge and their work ethic and do it the hard way like us. But in the end, everyone benefits if the work is just easier with better tools.
And what's a better tool than a "best practice"? An error message. A lint. A programming language structured in such a way that it doesn't even come up.
Programming language is a matter of personal preference.
For your hobby project, sure, this is true. But there are real differences between programming languages in terms of many things that matter for business purposes.
The existence of
unsafe
defeats the purpose of Rust. You have to use unsafe
, and since the standard library uses it, you're almost certainly using it too. That makes Rust unsafe just like C++ in practice, and so there's no advantage to switching.
I would call this a straw man, but people do call Rust a "safe"
programming language, and some people say you should never have to use
unsafe
(which I disagree with). So this takes some addressing.
First of all, memory safety, while important, is not the only
purpose of Rust. I would switch to Rust even if it didn't have the
unsafe
keyword. There are many other problems about C++, and this
book focuses primarily on the other problems.
Second of all, in every "memory-safe" language, safe abstractions are
built from unsafe foundations. You have to -- assembly language is
unsafe. unsafe
allows those foundations to be written in Rust.
And that is what a memory safe language is, not one that is 100%
memory safe in all situations, but one in which it's possible to
explicitly manage and scope memory safety, and do most regular
tasks in the safe subset.
You can't both have the guard rails that Rust provides and write certain
types of high-performance code at the same time, but the unsafe
keyword allows you to make the decision on whether to have your cake or
eat it on a situation-by-situation basis, rather than giving up one or
the other for the entire programming language.
If you don't use unsafe, and you trust the libraries you import, then you're in a safe language. If you do use unsafe, you are temporarily in a language as flexible as C++, while still having many advantages of Rust -- including safety features, which remain fully in place for most programming constructs even inside unsafe blocks.
Use unsafe code to build more safe components, expanding the safe language, and you only have to worry about safety a small percentage of the time, as opposed to all the time in C++.
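As a minimal sketch of this pattern, consider a safe function that performs a bounds check once and then uses an unchecked access internally (the function name is illustrative; Rust's standard library already provides this kind of accessor). Callers get a fully safe API, and the unsafe block stays small and auditable:

```rust
// A safe abstraction built on an unsafe foundation: callers can never
// trigger undefined behavior, because the check happens before the
// unchecked access.
fn first_or_zero(values: &[i32]) -> i32 {
    if values.is_empty() {
        0
    } else {
        // SAFETY: we just checked the slice is non-empty, so index 0
        // is in bounds.
        unsafe { *values.get_unchecked(0) }
    }
}

fn main() {
    assert_eq!(first_or_zero(&[7, 8, 9]), 7);
    assert_eq!(first_or_zero(&[]), 0);
    println!("safe wrapper over an unsafe foundation: ok");
}
```

The conventional `// SAFETY:` comment documents exactly which invariant makes the block sound, so reviewers only need to audit these small islands of unsafety rather than the whole program.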
Rust is taking the easy way out. Or: You can do C++ well, you just have to work harder at it, so there's no point to Rust.
I do think people regularly underestimate their ability to write safe C++. Other people underestimate how much performance they're giving up in order to stay confident that their C++ is safe.
But even if you have put in the work to be good at writing C++ safely, why does that mean that someone else shouldn't be happy to get the same results with less training and less work, if the technology exists?
Who wouldn't want to take the easy way out? Do you exit your house by climbing through the windows? This phrase only makes sense when there's a downside, in which case the response depends on the alleged downside. In which case, the actual downside is more important than this rhetorical trick.
Because, after all, businesses should use programming languages that make programming easier.
Safety means that Rust is not as high-performance
It's true that some operations in Rust are checked by default which are unchecked by default in C++. Array indexing is the typical example.
However, both checked and unchecked indexing are available in both Rust and C++. The difference is that in Rust, to use the unchecked one, you have to use a named method and an unsafe block. This is easy enough to do in the situations where indexing performance matters.
Most code is not the tight loops in performance-sensitive parts of performance-sensitive code. Most code by volume is configuration and other situations where the check is well worth it to prevent the possibility of memory corruption. Rust does make a less performant default decision than C++, but it is not that hard to override, and then you still get all the other benefits of Rust.
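Concretely, here is what the two defaults look like in Rust, using the standard slice methods get and get_unchecked (the helper function names are illustrative):

```rust
// Checked indexing is the default in Rust; unchecked indexing exists,
// but it is opt-in via a named method inside an `unsafe` block.
fn third_element(v: &[i32]) -> Option<i32> {
    // `get` is the checked accessor: it returns None instead of
    // reading out of bounds. Plain `v[2]` would panic, not corrupt
    // memory.
    v.get(2).copied()
}

fn third_element_unchecked(v: &[i32]) -> i32 {
    // SAFETY: callers must guarantee v.len() > 2. The named method
    // and the `unsafe` block make this easy to spot in code review.
    unsafe { *v.get_unchecked(2) }
}

fn main() {
    let v = [10, 20, 30];
    assert_eq!(third_element(&v), Some(30));
    assert_eq!(third_element(&[1]), None);
    assert_eq!(third_element_unchecked(&v), 30);
    println!("checked vs unchecked indexing: ok");
}
```

C++ mirrors this with the defaults flipped: `v[i]` on a `std::vector` is unchecked, and the checked form is the named method `v.at(i)`.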
All programming languages have foot-guns.
Some have more than others. In some you run across them more frequently than others. And in some, they come with a safety.
Modern C++ fixes the problems with C++.
Modern C++ has all the bad features of pre-modern C++. It has to, to be compatible with it.
In my experience, it's not enough to have good features. To make promises about memory safety, or even to have a sane programming ecosystem, it's also important to not have bad features. And redundant features of varying quality are often the worst of both worlds.
Response to Dr. Stroustrup's Memory Safety Comments
The NSA recently published a Cybersecurity Information Sheet about the importance of memory safety, where they recommended moving from memory-unsafe programming languages (like C and C++) to memory-safe ones (like Rust). Dr. Bjarne Stroustrup, the original creator of C++, has made some waves with his response.
To be honest, I was disappointed. As a current die-hard Rustacean and former die-hard C++ programmer, I have thought (and blogged) quite a bit about the topic of Rust vs C++. Unfortunately, I feel that in spite of the exhortation in his title to "think seriously about safety," Dr. Stroustrup was not in fact thinking seriously himself. Instead of engaging conceptually with the article, he seems to have reflexively thrown together some talking points -- some of them very stale -- not realizing that they mostly are not even relevant to the NSA's Cybersecurity Information Sheet, let alone a thoughtful rebuttal of it.
Fortunately, he does eventually discuss his own ideas of how to make C++ memory safe -- in the future. If these ideas are implemented well, it will make C++ a safe programming language as the NSA's Cybersecurity Information Sheet has defined it. But given that they are currently just proposals in an early stage, it's unfair of him to expect the NSA to mention them when advising people on what programming language to use. C++ has been an unsafe language for a long time. Maybe someday that will change, but we'll believe it when we actually see it.
But before I discuss that, I'd like to rebut the talking points he uses earlier in his response, and explain my disappointment with them, because I think they unfairly frame the debate, shield C++ from legitimate and important criticism, slander memory-safe programming languages, and downplay memory safety as a concept, even though it's very important.
Multiple Types of Safety?
One of the most interesting and conceptually relevant points that Dr. Stroustrup harps on is that memory safety is not the only type of safety:
Also, as described, “safe” is limited to memory safety, leaving out on the order of a dozen other ways that a language could (and will) be used to violate some form of safety and security.
This might technically be true -- it's not entirely clear what other forms of "safety" he's talking about -- but it's misleading. Memory unsafety is not just one of a dozen equally important forms of "unsafety." Rather, memory unsafety is by far the biggest source of security vulnerabilities and instability in memory-unsafe programming languages -- with estimates as high as 70 percent of vulnerabilities in some contexts.
A 70% decrease in security vulnerabilities is worth committing significant resources towards. Memory safety on its own is worth writing a Cybersecurity Information Sheet about, and it is the area where C++ has the most serious deficits. Given that, this feels like a car manufacturer whose cars do not provide air bags responding to a government advisory not to buy the C++ cars by saying "What about other types of safety? By talking just about air bags, the government is clearly not thinking seriously about safety." Sure, there's other types of safety features besides air bags (or memory safety), but air bags are still important!
So, Dr. Stroustrup, what about memory safety in C++? Shouldn't C++ have memory safety? Are you saying it's not important, especially when all of these other programming languages have it?
Tellingly, he doesn't go into detail about other types of safety. That's because C++ doesn't really have the advantage in any of them. For example, Rust also has a lot of mechanisms for thread safety and type safety, intimately connected with its memory safety mechanisms, and baked into the design of Rust in a way that would be next to impossible to retrofit into another programming language.
And, when you read later on about the "safety profiles" in the C++ Core Guidelines that he makes such a big deal about, most of the focus there is also about memory safety.
Petty Irrelevancies
Let's look at some of the other points he makes.
That specifically and explicitly excludes C and C++ as unsafe.
C++ does not enforce memory safety as a feature of the programming language. This may change in the future (as Dr. Stroustrup discusses), but is the current state of things. Dr. Stroustrup tries to downplay this, but is not convincing.
As is far too common, it lumps C and C++ into the single category C/C++, ignoring 30+ years of progress.
Writing "C/C++" to mean "C and C++" is considered a faux pas among C++ programmers, and among C programmers as well, because it is seen as asserting that these two programming languages are near-identical when there are in fact major differences between them. By pointing out that the NSA does this, Dr. Stroustrup is trying to make them look like they don't know what they're talking about, just because they used a "/" character instead of the word "and."
He's reading too much into the orthography and the NSA's failure to use insider shibboleths of the programming languages they're trying to criticize. Outside of the "C" and "C++" communities, "C/C++" is a fairly common way to refer to the two related programming languages.
And that's the most relevant thing here: C and C++ are indeed related programming languages, and they have a lot in common: They are both compiled programming languages with a focus on performance, and they are (very relevantly) both not particularly focused on guaranteeing memory safety. C and C++ have a substantial common subset, with many memory unsafe features that are popular with programmers, perhaps even more popular because they work similarly in both programming languages. For the purposes of this document, it's often the features that C and C++ have in common that are the problematic ones, so it makes sense for the NSA to lump them together.
While there might be 30+ years of divergence between C and C++, none of C++'s so-called "progress" involved removing memory-unsafe C features from C++, many of which are still in common use, and many of which still make memory safety in C++ near intractable. Sure, new features have been added to C++ that (in some but by no means all cases) do not make it as easy to corrupt memory, but the bad old features are not in any real way being phased out: They are not guarded by any special opt-in syntax, nor in many cases do they result in warnings. Given that, the combined set of features is only as strong as its weakest link.
Unfortunately, much C++ use is also stuck in the distant past, ignoring improvements, including ways of dramatically improving safety.
This is a common C++ talking point, but it doesn't help Dr. Stroustrup's position as much as he thinks it does.
He's trying to talk up how much C++ has improved, especially in the last 11 years -- and it has indeed improved. New ways of writing C++, emphasizing relatively new features, can indeed result in more reliable C++ code with less memory corruption.
But unfortunately, this talking point just serves to remind us that these old memory-unsafe features are still in common use. When someone says their project is written in Rust, we can guess that it likely uses only the safe features (including using standard library functions that use unsafe internally -- that truly doesn't count as unsafe), or maybe uses the unsafe features when absolutely necessary. But when someone says their project is written in C++, by Dr. Stroustrup's own admission, there's a high likelihood that it uses old features "stuck in the distant past, ignoring ... ways of dramatically improving safety." This is also a reason to avoid C++.
However, I would also contest his claim about these new features. Memory safety isn't just an absence of memory corruption, but a reliable method for ensuring the absence of memory corruption. "Using new features" isn't good enough. Even if using the new features in preference to the old ones were a guarantee of memory safety -- which it isn't, they're less memory corrupting but not truly memory safe -- the presence of the old ones would still cause problems. You would need some mechanism to ensure that the new features were only used safely, and that the old features were not used, and no such mechanism exists, at least not in the programming language itself. Someone who remembers the old features can always still slip up and use one by accident.
Static Analysis: Not Good Enough
Dr. Stroustrup points out that he's been working very hard on improving memory safety in C++, for a very long time:
After all, I have worked for decades to make it possible to write better, safer, and more efficient C++. In particular, the work on the C++ Core Guidelines specifically aims at delivering statically guaranteed type-safe and resource-safe C++ for people who need that without disrupting code bases that can manage without such strong guarantees or introducing additional tool chains.
Unfortunately, it's not done. The key word here is, of course, "aims." The next sentences admit that this feature is not in fact available:
For example, the Microsoft Visual Studio analyzer and its memory-safety profile deliver much of the CG support today and any good static analyzer (e.g., Clang tidy, that has some CG support) could be made to completely deliver those guarantees....
For memory safety, "much of" is not really good enough, and "could be made" is practically worthless. Fundamentally, the point is that memory safety in C++ is a project being actively worked on, at best close to existing. Meanwhile, Rust (and Swift, C#, Java, and others) already implements memory safety.
It's worse than that, though. What Dr. Stroustrup is trying to downplay is that this involves using static analyzers, considered separate from the programming language, something the NSA's original article also discusses. Theoretically, if a static analyzer could be used to guarantee memory safety, that could be just as reliable as a programming language that does it. An engineering team could have a policy that all code must pass this static analysis before being put into production.
But unfortunately, human nature is more fickle than that. If it's not built into the programming language, it's going to get skipped. If a vendor says their software is written in C++, or if an engineer takes a job in C++, how will they know that these static analyzers will in fact be used? A programming language that takes memory safety seriously doesn't provide it as an optional add-on that most people will simply ignore.
But All The C++ Code!
The end of the last quote provides a common talking point in Rust vs C++ arguments:
[Static analyzers] could be made to completely deliver those guarantees at a fraction of the cost of a change to a variety of novel “safe” languages.
Besides the laughable condescension of calling Java (which first appeared in 1995), C# (first appeared in 2000), and Ruby (first appeared in 1995) "novel," this is a jab at a common trope that (some immature) Rust programmers go around demanding that people rewrite their projects in Rust (please don't do this!), and an attack on the idea that all code can be written in safe programming languages, given the large body of existing work in unsafe programming languages.
This is a bit of a straw man in this context. The NSA article that Stroustrup is responding to addresses that switching existing codebases might be expensive, even prohibitively so, saying:
It is not trivial to shift a mature software development infrastructure from one computer language to another. Skilled programmers need to be trained in a new language and there is an efficiency hit when using a new language. Programmers must endure a learning curve and work their way through any “newbie” mistakes. While another approach is to hire programmers skilled in a memory safe language, they too will have their own learning curve for understanding the existing code base and the domain in which the software will function.
It then follows this up immediately with an explanation of how tools like static analyzers can be used as a back-up plan for improving memory safety in memory unsafe programming languages -- exactly what Dr. Stroustrup discusses. He's criticizing this NSA document, implying it is not thinking "seriously," while fundamentally making a point that they already made for him.
Of course, this is a terrible endorsement of C++. It's far from ideal to have to use add-on tools to work around a language's flaws. Coming from Dr. Stroustrup, it reads more like a brag that his programming language has locked everyone in than a defense of why C++ is good. Or else, it's an admission that other programming languages should be used for new projects, and that C++'s fate is now to gradually fade like the elves from Middle Earth.
But he's also overstating his case. As I mentioned before, safe programming languages have existed for a long time. Many programming projects that in the early 90's would have been done in C or C++ have in fact been done in safe programming languages instead, and according to the NSA's recommendation, that was a good idea. As computers have gotten faster and programming language technology has improved, there have been fewer and fewer reasons to settle for languages like C or C++ that don't have memory safety as a feature.
When I was a professional C++ programmer as early as 2013, some people -- even some programmers -- already thought that C++ was a legacy programming language like COBOL or Fortran. And outside of narrow niches like systems programming (e.g. web browsers, operating systems, and lower-level libraries), video games, or high performance programming, it kind of has become one. The former application niches of C++ have been taken over by Java and C#, or more recently by Go. If you have an application program written in C++, chances are that it's a relatively old codebase, or written at a shop that has reasons to write a lot of C++ (such as a high-frequency trading firm).
Now, even C++'s systems niche is under threat from Rust, a powerful memory-safe programming language that avoids many of C++'s problems. Even the niches where C++ isn't at all "legacy" now have a viable, memory-safe alternative without a lot of C++'s technical debt. Rust is even allowed in the Linux kernel, a project that had previously accepted only C, and whose chief maintainer has always explicitly hated C++.
A Memory-Safe C++
Fortunately, after all of these ill-thought-out, tired talking points, Dr. Stroustrup subtly changes his perspective. After his distractions, after bashing memory-safe programming languages as "novel," bragging about how C++ is too entrenched to be removable, pretending memory safety is just one of many equally important safety issues, and promising optional add-on tools that will eventually be standardized, he finally begins to tackle the question of how C++ could be made memory safe, in an opt-in fashion:
There is not just one definition of “safety”, and we can achieve a variety of kinds of safety through a combination of programming styles, support libraries, and enforcement through static analysis. P2410r0 gives a brief summary of the approach. I envision compiler options and code annotations for requesting rules to be enforced. The most obvious would be to request guaranteed full type-and-resource safety. P2687R0 is a start on how the standard can support this, R1 will be more specific. Naturally, comments and suggestions are most welcome.
...
For example, in application domains where performance is the main concern, the P2687R0 approach lets you apply the safety guarantees only where required and use your favorite tuning techniques where needed. Partial adoption of some of the rules (e.g., rules for range checking and initialization) is likely to be important. Gradual adoption of safety rules and adoption of differing safety rules will be important. If for no other reason than the billions of lines of C++ code will not magically disappear, and even “safe” code (in any language) will have to call traditional C or C++ code or be called by traditional code that does not offer specific safety guarantees.
This is a lot closer to what the NSA document actually specifies for memory safe programming languages than he gives the document credit for. For example, the document already provides for opting out of memory safety via annotation, paired with an observation that that will focus scrutiny on the code that opts out.
Dr. Stroustrup did not need to criticize the document for not thinking "seriously" to reach this conclusion; he could simply have acknowledged that C++ is not a memory-safe programming language yet, but that based on his work, it might soon become one. Maybe the next version of the NSA document will endorse using C++, but only if it's C++ZZ -- where ZZ is some future version of the C++ standard.
I'm glad comments and suggestions are welcome, however, because I have a huge one.
Opt-in for memory safety is unacceptable, and is almost as bad as having a separate static analysis tool to enforce safety. Opt-out is fine -- Rust has a way to opt out of memory safety with the unsafe keyword, and this concept is discussed and defended in the NSA's original document. But the default should be to enforce memory safety unless otherwise specified. For C++, this means that if these safety features are added in C++ZZ, --std=c++ZZ should cause unsafe constructs to be rejected -- and the C++ standard should require that these constructs be rejected for an implementation to be a conforming implementation of C++ZZ. Perhaps (but only perhaps) other command-line arguments could be added to override this constraint on a file-by-file basis. Ideally, a new compiler command (e.g. g++ZZ) should be created for each implementation that defaults to this stricter behavior.
Parts of the codebase that use legacy features should have to have at least a file-level annotation marking that file as a legacy file -- and then this annotation could gradually be moved to the function level. As a side benefit, this could also be used to phase out and deprecate weird points of C++ syntax, similar to the Rust edition system: Anyone using, for example, 0 literals to mean nullptr would have to declare some sort of legacy annotation in their file or in their build system.
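For comparison, Rust already ships a crate-level version of this kind of default-deny switch: the real forbid(unsafe_code) attribute turns any use of unsafe anywhere in the crate into a hard compile error, which is the flavor of enforcement being proposed here for a hypothetical C++ZZ mode:

```rust
// With this crate-level attribute, any `unsafe` block or function in
// the crate is a compile error, not a warning -- and unlike `deny`,
// `forbid` cannot be overridden further down in the crate.
#![forbid(unsafe_code)]

fn status() -> &'static str {
    // Uncommenting the next line makes the crate fail to compile:
    // let zero: i32 = unsafe { std::mem::zeroed() };
    "no unsafe code in this crate"
}

fn main() {
    println!("{}", status());
}
```

A build system can apply this policy per crate, which is roughly the file-level legacy annotation suggested above, just with the safe behavior as the enforced default.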
Only with this sort of opt-out memory-safety system would I consider C++ a memory safe programming language. I'd be very happy to see a memory-safe C++. I earnestly hope Dr. Stroustrup is successful in his endeavors. I'm not holding my breath, though, and in the meantime, I will continue to use other programming languages, that are already memory-safe, for my new projects, as will the majority of programmers.
In the meantime, it is unfair for Dr. Stroustrup to call safe programming languages novelties or to pretend that C++ isn't already far behind the times on this. This was already an important criticism of C++ decades ago, when Java first came out in the 90's and was referred to as a "managed programming language." This was discussed in detail in my classes when I was a college student in the late aughts. To read Dr. Stroustrup's writing, C++ is being criticized by "novel" upstarts when it is well on its way to getting the feature, but in actuality, the time to act was 1996.