Rust: A New Attempt at C++’s Main Goal
I know I set the goal for myself of doing less polemics and more education, but here I return for another Rust vs C++ post. I did say I doubted I would be able to get fully away from polemics, however, and I genuinely think this post will help contextualize the general Rust vs. C++ debate and contribute to the conversation. Besides, most of the outlining and thinking for this post – which is the majority of the work of writing – was already done when I set that goal. It also serves as a bit of conceptual glue, structuring and contextualizing many of my existing posts. So please bear with me as I say more on the topic of Rust and C++.
Rust is a polarizing programming language, because of how radical it is. It has gone the furthest in introducing features from functional programming languages into the mainstream world, and in ignoring long-held programming language design principles from the realm of object-oriented programming. Its fans can be very enthusiastic, sometimes off-puttingly so, stereotypically demanding that all software be rewritten in Rust even when completely unfeasible – a stereotype that is mostly untrue, but whose existence and occasional true examples show the intensity of the debate. But a lot of Rust’s criticism comes specifically from C++ programmers, and correspondingly a lot of Rustaceans’ criticism of other programming languages is directed specifically at C++, including mine. Even the creator of C++, while not mentioning it by name, entered the fray (and along with other Rustaceans, I responded).
There’s a good reason for this particular rivalry. While usable in other domains, Rust is strongest where C++ has hitherto been unopposed: as a high-level systems programming language. Many of Rust’s greatest strengths are directly based off of ideas originated in C++. And Rust has, in many ways, the same goals that C++ has. It can be argued – and in this post I shall argue – that Rust has the exact same overall goal that C++ does, albeit with a different interpretation of how that goal is best accomplished.
Zero-Cost Abstractions
C++ has an explicit goal of providing zero-cost abstractions.
This is a bit of a confusing term of art and has the potential to be misleading, but it comes attached with explanations that clarify it some. It is also referred to as the “zero-overhead principle,” which Dr. Bjarne Stroustrup, father of C++, describes (see pg. 4) as containing two components:
- What you don’t use, you don’t pay for (and Dr. Stroustrup means “paying” in the sense of performance costs, e.g. in higher latency, slower throughput, or higher memory usage)
- What you do use, you couldn’t hand code any better
There is also an executive summary of the concept at CppReference.com.
I, however, prefer the terminology of “zero-cost abstraction,” confusing as it can be, because it embodies a hidden third principle, that is unstated among those other two, and against which those other two principles are balanced. The word “abstraction” is the key, and the third principle is:
- You can still get the abstractive and expressive power you expect from a modern programming language.
This third principle is necessary to distinguish higher-level “zero cost” languages like C++ and Rust from lower-level languages like C.
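As a minimal sketch of what these three principles look like together, here is the same computation written once with iterator adapters (the abstraction) and once as a hand-written loop. The function names are hypothetical, invented for this illustration. The example only demonstrates that the two forms produce the same result; whether they compile to comparable machine code depends on the compiler and optimization level, and is something to check in the generated assembly rather than take on faith.

```rust
/// Sum of the squares of the even numbers, written with iterator adapters.
fn sum_even_squares_iter(values: &[u32]) -> u32 {
    values
        .iter()
        .filter(|&&v| v % 2 == 0)
        .map(|&v| v * v)
        .sum()
}

/// The same computation as an explicit, hand-written loop.
fn sum_even_squares_loop(values: &[u32]) -> u32 {
    let mut total = 0;
    for &v in values {
        if v % 2 == 0 {
            total += v * v;
        }
    }
    total
}

fn main() {
    let values = [1, 2, 3, 4, 5, 6];
    // Both forms must agree; the abstraction changes the notation, not the answer.
    assert_eq!(sum_even_squares_iter(&values), sum_even_squares_loop(&values));
    println!("{}", sum_even_squares_iter(&values));
}
```

The second principle is the claim that the iterator version should cost no more than the loop; the third is the claim that you get to write the iterator version at all.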
To fully explain why I include this third principle, and to delve into the history of the concept in general, I want to talk more about C.
C: The Portable Assembly
C has often been described as a “portable assembly language.” Unlike other high level programming languages before it (“high level” at the time meaning anything higher level than raw assembly language), it exposed users directly to gnarly machine-language abstractions like pointers, and to common assembly-language capabilities like shifting and bitwise operators.
The goal was to give the programmer something minimally distinct from assembly language, where the programmer had almost as much control over the computer as an assembly language programmer without sacrificing portability. Few higher-level features have been added, even now: there is no built-in string type, and only a limited array type that exposes the underlying concept of pointers the instant you poke at it. Structures are little more than a way of calculating offsets, and memory management is done by explicitly invoking memory management routines.
C’s preference, in general, was to only add onto assembly those features absolutely necessary for portability, and not to impose any other structure on the programmer – or, said another way, not to provide any other structure to the programmer.
This was far from an iron-clad rule. And there are definitely exceptions: C, built into the programming language, prefers null-terminated strings (also known as “C strings”) to arrangements that use specific lengths, a substantial constraint on the programmer beyond assembly language and probably a mistake overall.
More deeply, and probably less avoidably at the time, C assumes a traditional call structure. Many techniques that can be used to implement closures, co-routines, or other more radical alternatives to a call stack are difficult to impossible to do with standard C – while generally being possible in any assembly language.
But, with these exceptions, C generally does tend to only provide one overarching abstraction, portability, and when it does, it has the same zero-cost goals that C++ has, to only make the user pay for the abstractions they actually use, and to provide abstractions as efficiently as the equivalent hand-coded assembly.
Put another way, C++’s zero-overhead principle, as Dr. Stroustrup defines it, is more or less inherited from C. Where C++ differs from C is in the “abstraction” part of providing “zero-cost abstractions.” Everything you can do in C++ you can do in (potentially tedious and repetitive and error-prone) C, but C++ provides more abstractions, beyond just what is necessary for portability.
C++: A More Abstracted C
This gives us a framework for understanding the entire goal of C++, and I would argue, of Rust. Once we understand that C++ is trying to keep the zero-cost principle of C, where abstractions do not come with a performance penalty (and where “zero” is a reference to the difference between the performance cost and a manual assembly-language implementation), but with the expressive and abstractive power of a higher-level programming language, everything else about C++ makes sense.
C++ was originally christened “C with Classes,” and it tried to add Object-Oriented Programming to C. All the mechanisms of OOP could be portably added to C directly by an application or library developer with judicious use of function pointers and structure nesting (and glib is a famous example of a library that does exactly that), but C++ built this abstraction into the programming language itself.
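To make the "function pointers and structures" technique concrete, here is a rough sketch written in Rust rather than C, so that it can sit next to the language-provided equivalent. The `Shape` struct with its `area` function-pointer slot is a hypothetical hand-rolled "class" invented for this illustration (it is not how glib is actually laid out), and the `HasArea` trait stands in for the built-in mechanism.

```rust
/// A hand-rolled "class": data plus a slot holding a function pointer.
struct Shape {
    /// Hypothetical "virtual method" slot, filled in by each constructor.
    area: fn(&Shape) -> f64,
    width: f64,
    height: f64,
}

fn rect_area(shape: &Shape) -> f64 {
    shape.width * shape.height
}

fn triangle_area(shape: &Shape) -> f64 {
    0.5 * shape.width * shape.height
}

fn new_rect(width: f64, height: f64) -> Shape {
    Shape { area: rect_area, width, height }
}

fn new_triangle(width: f64, height: f64) -> Shape {
    Shape { area: triangle_area, width, height }
}

/// Manual dynamic dispatch: look up the pointer, then call through it.
fn call_area(shape: &Shape) -> f64 {
    (shape.area)(shape)
}

/// The language-provided version of the same idea.
trait HasArea {
    fn area(&self) -> f64;
}

struct Rect {
    width: f64,
    height: f64,
}

impl HasArea for Rect {
    fn area(&self) -> f64 {
        self.width * self.height
    }
}

fn main() {
    let manual = new_rect(2.0, 3.0);
    let built_in: Box<dyn HasArea> = Box::new(Rect { width: 2.0, height: 3.0 });
    // Both dispatch through a function pointer; only one required writing it by hand.
    assert_eq!(call_area(&manual), built_in.area());
    println!("{}", call_area(&new_triangle(2.0, 3.0)));
}
```

The hand-rolled version is what a library author would maintain by convention; the built-in version moves that bookkeeping into the compiler.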
Objective-C also did this (and according to Wikipedia it “first appeared” one year sooner in 1984), but Objective-C has always felt like two programming languages glued together. In Objective-C, the object-oriented features do not inherit the zero-overhead principle from C – nor do they look like C at all. They look instead like a Smalltalk dialect, where switching between C and this odd Smalltalk dialect was permitted on an expression-by-expression basis using an odd mix of square brackets and @-signs.
In C++, the added abstractions, including OOP, take on more of a resemblance to C, and importantly, continue to try to retain C’s advantages in systems programming by making the new features zero-overhead.
During much of the history of C++, OOP was considered to be the most important abstraction that a programming language could offer. But once it was added, it expanded the scope of C++ abstractions. Nowadays, C++ is considered multi-paradigm, and provides not just OOP, but a wide array of abstractions.
Nowadays, C++ tries to keep up with other programming languages in what features it offers, to the extent that it can while being limited by the zero-cost principle. This is in sharp contrast to C, which continues to try to define existing features better and make them more rigorous within the existing feature scope. The only features C++ rejects out of hand are those that do not jibe with zero-cost abstraction, showing that in actuality C++’s defining trait is to have the three-pronged concept of zero-cost abstraction that I introduced above, two prongs about “zero cost” and one about “abstraction”:
- What you don’t use, you don’t pay for
- What you do use, you couldn’t hand code any better
- We give you the power of abstraction expected for a programming language of the day
This is why garbage-collection is not offered in C++ (though it is still possible to implement manually) – it cannot be offered in a zero-cost way. However, C++’s alternative to garbage collection, namely RAII, became more effective as new features like move semantics and std::unique_ptr were added, to the extent that in modern C++, it would be unimaginable not to have those features, and they have become essential to C++’s memory management model.
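For readers who have not seen RAII in action, here is a minimal sketch of the idea in Rust, where it is expressed through the `Drop` trait. The `Resource` type and the shared log are hypothetical stand-ins for something like a file handle or a lock; the point is only that cleanup runs deterministically at the end of the scope, with no garbage collector involved.

```rust
use std::cell::RefCell;

/// A stand-in for a real resource. It records its own release in a shared log.
struct Resource<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Resource<'_> {
    fn drop(&mut self) {
        // Runs automatically when the value goes out of scope.
        self.log.borrow_mut().push(format!("release {}", self.name));
    }
}

/// Acquires two resources, does some work, and returns the order of events.
fn run() -> Vec<String> {
    let log: RefCell<Vec<String>> = RefCell::new(Vec::new());
    {
        let _a = Resource { name: "a", log: &log };
        let _b = Resource { name: "b", log: &log };
        log.borrow_mut().push("work".to_string());
    } // `_b` is released first, then `_a`: reverse order of declaration.
    log.into_inner()
}

fn main() {
    for event in run() {
        println!("{event}");
    }
}
```

Nothing in `run` calls a cleanup routine explicitly, and the release still happens at a known point in the program.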
These three goals explain why C++ keeps accruing new features, whereas C maintains the features it has. They explain why C++ had to add templates – as a zero-cost alternative to OOP, or a zero-cost way of implementing collections. They explain why C++ had to add move semantics – because without it, RAII is a worse abstraction than GC.
Rust: A C++ Redo
Rust simply does a better job at achieving these goals, because Rust gets to start from scratch, with the modern concept of what’s expected in a high-level programming language, rather than working forwards through time. And, in doing so, it avoids a lot of the mistakes that C++ made, and can design a language that includes all of the modern features together.
A full set of OOP features is no longer ideologically required, so Rust
doesn’t offer them. Instead, safety has become
a sine qua non, so Rust offers that (with an opt-out provision).
One might argue that safety violates the zero-cost abstraction because of bounds checking, but that’s simply not true as defined. You only pay for bounds checks if you’re actually using the feature of safety – unchecked unsafe accesses are in fact available just an unsafe keyword away – and the feature of safety is implemented as efficiently as one would by hand (by inserting bounds checks into array accesses).
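A small sketch of both sides of that trade: the first function uses ordinary indexing and so gets a bounds check on each access (unless the optimizer proves it unnecessary), while the second does one check up front by hand and then opts out with `unsafe`. The function names are hypothetical; `get_unchecked` is the standard slice method for unchecked access.

```rust
/// Sums the first `n` elements using ordinary, bounds-checked indexing.
/// Panics if `n` exceeds the slice length.
fn sum_prefix_checked(values: &[u32], n: usize) -> u32 {
    let mut total = 0;
    for i in 0..n {
        total += values[i];
    }
    total
}

/// Sums the first `n` elements with a single up-front check,
/// then skips the per-access checks.
fn sum_prefix_unchecked(values: &[u32], n: usize) -> u32 {
    assert!(n <= values.len(), "prefix longer than slice");
    let mut total = 0;
    for i in 0..n {
        // SAFETY: `i < n` and `n <= values.len()` was asserted above.
        total += unsafe { *values.get_unchecked(i) };
    }
    total
}

fn main() {
    let values = [1, 2, 3, 4];
    assert_eq!(sum_prefix_checked(&values, 3), sum_prefix_unchecked(&values, 3));
    // The safe, non-panicking form of an out-of-range access returns `None`.
    assert_eq!(values.get(10), None);
    println!("{}", sum_prefix_unchecked(&values, 3));
}
```

In practice the idiomatic fix is usually an iterator or a slice like `&values[..n]`, which lets the compiler drop the per-element checks without any `unsafe` at all; the unchecked version is shown only to illustrate the opt-out.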
Similarly, C++ has learned that move semantics turn out to be essential in an RAII/value-semantics model to avoid spurious copy-and-deletes and/or indirections for e.g. storing std::strings in a std::vector that might be resized. Before move semantics, C++ often forced violations of the zero-cost abstraction principle by providing abstractions that would do extraneous copies or required extra indirections to use effectively, which is not what an assembly language programmer would ever write. However, since C++ move semantics were bolted on after the fact, it does them in a deeply confusing way, whereas Rust gets to reset and design itself for destructive moves from the get-go.
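Here is a small sketch of the string-in-a-resizable-vector case in Rust. The helper name is hypothetical. It moves a `String` into a `Vec`, forces the vector to grow many times, and then checks that the string's heap buffer is still at the same address, which is to say that only the small (pointer, length, capacity) header was ever relocated and the character data was never copied.

```rust
/// Moves `s` into a vector, forces the vector to reallocate,
/// and reports whether the string's heap buffer stayed where it was.
fn move_keeps_buffer(s: String) -> bool {
    let before = s.as_ptr();

    let mut names = Vec::new();
    names.push(s); // `s` is moved here; using `s` afterwards is a compile error.

    // Grow the vector well past its initial capacity so its storage is reallocated.
    for i in 0..100 {
        names.push(format!("filler {}", i));
    }

    // The header was moved (twice or more), but the text itself never was.
    names[0].as_ptr() == before
}

fn main() {
    println!("{}", move_keeps_buffer(String::from("hello")));
}
```

Because the move is destructive, there is no moved-from `s` left behind to destroy or to accidentally use.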
A Note on “the RAII Model”
In my RAII post I referred to C++’s alternative to garbage collection, centered on RAII, as the “RAII model,” and wrote that std::unique_ptr and move semantics were essential to this model. A Reddit comment later explained that I must be confused, because RAII pre-dates those features.
They had misunderstood me, and I stand by my statements, but I think it is worth some clarification. By “RAII model,” I mean RAII and other features which, when combined, provide an alternative to garbage collection. And the RAII model before C++11 did indeed lack features essential to competing with garbage collection. It was simply a worse model then, and much harder to use correctly in a complicated codebase.
In a similar way, I would say that in Rust, borrow checking and destructive moves are essential to the RAII model, because without them, the model is a much worse competitor to garbage collection. And yes, that does imply that C++’s concept of RAII is fundamentally deficient by not being paired with borrow checking, just like pre-C++11 RAII was fundamentally deficient by not being paired with move semantics and std::unique_ptr.
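To illustrate what borrow checking adds on top of RAII, here is a minimal sketch built around the classic hazard of holding a reference into a vector while growing it. The function name is hypothetical. The commented-out lines show the shape of code that the compiler rejects; the live code is the version it accepts.

```rust
/// Appends a copy of the first element to the vector and returns that element.
/// Panics if the vector is empty.
fn first_then_push(values: &mut Vec<i32>) -> i32 {
    // Rejected by the borrow checker, because `push` may reallocate the
    // storage that `first` points into:
    //
    //     let first = &values[0];
    //     values.push(*first); // error[E0502]: cannot borrow `*values` as mutable
    //     return *first;       // because it is also borrowed as immutable
    //
    // Accepted: copy the value out, so no borrow is alive during the push.
    let first = values[0];
    values.push(first);
    first
}

fn main() {
    let mut values = vec![7, 8];
    let first = first_then_push(&mut values);
    println!("{first} {values:?}");
}
```

The analogous C++ code, holding a reference to an element of a std::vector across a push_back, compiles without complaint and leaves a dangling reference if the vector reallocates.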
The alternative to garbage collection that C++ and Rust have built has been a work in progress through most of its history. Rust had to be a new programming language rather than an evolution for a number of reasons, but fixing C++’s lack of borrow checking and weird move semantics were some of the most important such reasons.
Backwards-Compatibility
Of course, C++ does have goals that Rust drops – and in doing so, it can do better at this core goal. The biggest such goal is perhaps also a trivial example: C++ has the goal of being source-compatible with earlier versions of C++, and even to some extent with C. This makes sense, as backwards-compatibility between versions is sort of a fundamental expectation of any programming language, certainly one that tries to provide a modern set of abstractions, but it does restrain C++’s development.
While Rust tries to be backwards compatible with itself, dropping compatibility with C++ has allowed it to get out of a lot of C++’s accumulated cruft of complexity, much of which is inherited from C times.
This accomplishes a lot on its own. C++’s syntax has gotten so complex over the years that many in the C++ community are doing their own resets of the syntax, including Herb Sutter’s cppfront and Google’s Carbon. Even if starting from scratch to accomplish C++’s goals was the only thing Rust did, it would still result in a much better programming language, more ergonomic and with fewer pitfalls.
Some criticize Rust by saying that in another 30 or 50 years, Rust will end up as convoluted as C++ is now. This criticism has confused me, because it seems possible, even likely, that this is true, but that doesn’t strike me as a reason to not (gradually and responsibly) switch from C++ to Rust (especially for new projects or for when rewrites are particularly called for). If this is true, that just means programming languages are subject to entropy and obsolescence like everything else. And in that case, C++ will just continue to get worse, Rust will also continue to get worse, and Rust will be better than C++ the entire time. If all programming languages accrue cruft as they age, in what world is that a reason to use the cruftier programming language?
Most Rustaceans are not, despite the stereotype, treating Rust as some apocalyptic, messianic programming language to end all programming languages. I wouldn’t be surprised if 20 or 30 years from now, a new programming language will emerge, accomplishing the same goals from a fresh start. And when that happens, I will probably advocate in favor of this new programming language just like I now advocate in favor of Rust.
The goal isn’t to have an eternally good programming language; the goal is to have tools now. What should new projects be written in now? When a rewrite is called for (as it sometimes is), should it include a new programming language now that there is a viable alternative?
I suspect that many making this argument are including an unstated assumption – that C++’s cruft is actually a sign of its maturity, and fitness for production use. Alternatively, and a little more charitably, they might assume that Rust isn’t ready for production use yet, and by the time it is, it will be just as crufty as C++, perhaps converging to the same level of cruft. But while there are a few categories where Rust lags C++, they are mistaken in the big picture. For the vast majority of C++ projects, Rust is already the better option if the project had to be rewritten from scratch (a big “if,” but irrelevant to the merits of the programming languages).
Rust Deficits
Rust has a few downsides compared to C++.
Interfacing with C is an important goal for reasons besides backwards-compatibility. On many platforms, C serves as a lowest-common-denominator programming language, and its ABI serves as an inter-language protocol. C++ does provide smoother interfacing with this protocol than Rust does.
Relatedly, C++ generally has a relatively stable ABI on a given platform for a given compiler vendor. This allows dynamic libraries to be used as plugins with minimal glue code, something that in Rust normally requires awkwardly working through a C ABI interface. Personally, I think machine-language plugins as dynamically loaded libraries are mostly a relic of past software distribution models, and haven’t seen many situations where they make sense, but I could think of a few edge cases.
In both of these cases, Rust is clumsier, but not completely incapable. Rust still can speak the protocol that is the C ABI, just not as natively and smoothly-integrated as C++.
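As a rough sketch of what "speaking the C ABI" looks like from the Rust side: a struct is given a C-compatible layout with `#[repr(C)]`, and a function is given the C calling convention with `extern "C"`. The `Point` type and `point_norm_squared` function are hypothetical names invented for this illustration. Here the function is only called from Rust through a C-ABI function pointer; actually exporting it from a library so that C code can link against it would additionally need an unmangled symbol name (the `no_mangle` attribute) and a matching C declaration.

```rust
/// Laid out the way a C compiler would lay out `struct { double x; double y; }`.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}

/// Uses the platform's C calling convention instead of Rust's unspecified one.
pub extern "C" fn point_norm_squared(p: Point) -> f64 {
    p.x * p.x + p.y * p.y
}

fn main() {
    // A C-ABI function pointer, the kind of value a C library would accept as a callback.
    let callback: extern "C" fn(Point) -> f64 = point_norm_squared;
    println!("{}", callback(Point { x: 3.0, y: 4.0 }));
}
```

None of this is hard, but every type and function that crosses the boundary has to be annotated this way, and richer Rust types (enums with data, trait objects, generics) have no C representation at all, which is the clumsiness relative to C++.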
Other downsides of Rust have to do with network effects and Rust adoption. There is only one Rust compiler, while there are multiple C++ compilers that work together through a standards process. GCC is currently in the process of getting Rust support, and we’ll see how well that works out for Rust.
Similarly, there are a lot of libraries that exist in C++ that don’t yet exist in Rust or have Rust bindings. Though that’s true of any pair of programming languages, it is a specific reason some developers might still want to write new projects in C++ rather than in Rust.
Finally, while I still think Rust would be a better programming language than C++ even if unsafe code were allowed everywhere, I think Rust could do more to make its rules clearer in the unsafe realm. The fact that the latest research on Rust’s memory models seems so deeply difficult to square with how async code often works, as in this bug report, makes me nervous.
I’m sure there are other ways in which Rust is behind C++, and the devil is as always in the details. I’m sure I’ll find out about some of them as soon as I post this post.
Conclusion
These were all topics I’ve discussed in other blog posts, but I hope this brings some perspective on how I think about the programming languages in general, and provides a conceptual framework for thinking about some of my other posts. I was a fan of C++ because of its goals, and I’m now a fan of Rust because I think Rust pulls them off better. When I was skeptical of Rust, it was because I did not think Rust would pull them off better, but that was due to a misunderstanding.
Next Steps
I am considering using (a revised version of) this post as an introduction, and then trying to bring all of my Rust vs C++ content into an mdbook so it could be more of a garden. It would have a title like “Rust: A Better C++ Than C++” and be licensed under some CC non-commercial license, and it would accept MRs from other people as a community resource for consolidating resources on this particular issue. Then, if I had further ideas I could put them in there. What do people think of that idea?
I realize now that I write this that the repo where I already have the bones of this idea is actually already public. I think I’m going to restart from scratch with just a reorganization of existing blog posts, and save the more ambitious ideas in those notes files for later. What do people think?