# C++ Papercuts
UPDATE: Wow, this post has gotten popular! I’ve written a new post that covers more papercuts along with concrete suggestions for how C++ could improve, if you are interested. And if you want to read more about C++’s deeper-than-papercut issues, I particularly recommend my post on its move semantics. Thank you for reading!
My day job is once again a C++ role, and so in this blog post I find myself once more focusing on the downsides of C++.
Overall, I have found returning to active C++ dev to be exactly what
I expected: I still have the skills, and can still be effective in
it, but now that I have worked in a more modern programming language
with less legacy cruft, the downsides of C++ sting more. There are so
many features I miss from Rust, not only the obvious safety features,
or even primarily those, but also features that C++ could easily add,
like first-class support for sum types (called `enum`s in Rust), or
tuples. (Clarification for C++ fans: `std::tuple` and `std::variant`
are not first-class support, and if you’re used to first-class
support, you know how unacceptably clunky they are.)
In this blog post, I will focus on the minor problems of C++ that have affected me the most: the little usability papercuts, the petty inconveniences that just waste time. Instead of comparing them to Rust or other programming languages, I will focus on why they don’t make sense from a pure C++ point of view, with reference to C++ alone. I know better than to hope that this will make die-hard C++ fans accept my criticism, but perhaps it will be relatable to C++ programmers who don’t have Rust experience.
Before I start getting into the papercuts, though, I want to address one of the primary defenses I’ve seen of C++, one that I’ve found particularly baffling. It goes something like this:
C++ is a great programming language. The complaints are just from people who aren’t up to it. If they were better programmers, they’d appreciate the C++ way of doing things, and they wouldn’t need their hand held. Languages like Rust are not helpful for such true professionals.
Obviously, the phrasing is a bit of a parody, but I’ve seen this sort of attitude many times. The most charitable reading is that C++’s difficulty is a sign of its power, and the natural cost of using a powerful programming language. In many cases, however, it reads to me as a form of elitism: a general idea that making things easy for weaker programmers is pointless, and that good programmers don’t benefit from making things easier.
As someone who has programmed C++ professionally for a majority of my career, and who has taught (company-internal) classes in advanced C++, this is nonsense to me. I do know how to navigate the many papercuts and foot-guns of C++, and am happy to do so when working on a C++ codebase. But experienced as I am, they still slow me down and distract me, taking focus away from the actual problems I’m trying to solve, and resulting in less maintainable code.
And as for the upside, I see very little – any way in which C++ is more performant or more appropriate than Rust comes down to platform support, legacy codebases, optimizations only available in specific compilers that happen to not support Rust, or other concerns irrelevant to the actual design of the programming language.
While I am proud of my C++ skills, I am not too proud to appreciate that better technology can render them partially obsolete. I am not too proud to appreciate having features that make it easier. In most cases, it’s not a matter of the programming language doing more work for me, but of C++ creating unnecessary extra make-work, often due to decisions that made sense when they were made, but have long since stopped making sense – don’t get me started on header files!
But I also want my programming language to be beginner-friendly. I am always going to work with other programmers with a variety of skill-sets, and I would rather not have to clean up my colleagues' mistakes – or mistakes of earlier, more foolish versions of myself. If making a programming language more beginner-friendly sacrifices power, then I agree that some programming languages should not do it. But many, even most of C++’s beginner-unfriendly (and expert-annoying) features do not in fact make the language more powerful.
So, without further ado, here are the biggest papercuts I’ve noticed in the past month of returning to C++ development.
## `const` is not the default
It is very easy to forget to mark a parameter `const` when it can be.
You can just forget to type the keyword. This is especially true for
`this`, which is an implicit parameter: you never type the `this`
parameter out explicitly, and therefore it won’t sit there looking
funny without the appropriate modifiers.
If C++ had the opposite default, where every value, reference, and pointer
was `const` unless explicitly declared `mutable`, then we’d be much more
likely to have every parameter declared correctly based on whether the
function needs to mutate it or not. If someone includes a `mutable` keyword,
it would be because they know they need it. If they need it and forget it,
a compiler error would remind them.
Now, you might not think this is important, because you could just skip
`const` and have functions with capabilities they don’t need –
but sometimes you have to take things by `const` in C++. If you take a
parameter by non-`const` reference, callers can only pass lvalues to
your function. But if you take a parameter by `const` reference,
callers can pass lvalues or rvalues. So some functions, in order to
be usable in natural ways, must take their parameters by `const` reference.

Once you have a `const` reference, you can only (easily) call functions
with it that accept `const` references, and so if any of those functions
forgot to declare the parameter `const`, you have to include a `const_cast`
– or go change the function to correctly accept `const`.
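To make the lvalue/rvalue asymmetry concrete, here is a minimal sketch; the helper functions `length_mut` and `length_const` are hypothetical, invented purely for illustration:

```cpp
#include <cstddef>
#include <string>

// Non-const reference: callers can only pass lvalues.
std::size_t length_mut(std::string &s) { return s.size(); }

// Const reference: callers can pass lvalues or rvalues (temporaries).
std::size_t length_const(const std::string &s) { return s.size(); }

std::size_t demo() {
    std::string name = "papercut";
    std::size_t a = length_const(name);              // lvalue: fine
    std::size_t b = length_const(std::string("x"));  // rvalue: also fine
    std::size_t c = length_mut(name);                // lvalue only
    // length_mut(std::string("x"));  // error: cannot bind an rvalue
    //                                // to std::string&
    // And given `const std::string &cr = name;`, calling length_mut(cr)
    // would require a const_cast, even though length_mut never mutates s.
    return a + b + c;
}
```

So the function that forgot its `const` poisons every `const`-correct caller upstream of it.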
Lest you think this is just a sloppy newbie error, note that
many functions in the standard library had to be updated to take
`const_iterator` instead of, or in addition to, `iterator` when it was
correctly discovered that they made sense with a `const_iterator`:
functions like `erase`. It turns out that for functions like `erase`,
the collection is what has to be mutable, not the iterator – a fact
that the maintainers of the C++ standard library simply got wrong at first.
## Obligatory Copying
In C++, being copyable is the default, privileged way
for an object to behave. If you don’t want your object to be copyable,
but all its fields are copyable, you have to mark the copy constructor
and copy assignment operator as `= delete`. The default is for the compiler
to write code for you – code that can be incorrect.
If you do make your class move-only, however, beware: there are situations where you can’t use it. In C++11, there was no ergonomic way to capture a variable into a lambda by move – which is usually how I want to capture variables into a closure. This was “fixed” in C++14: when you want what should have been the default from the beginning, you can now use the extremely clunky init-capture syntax.
However, even then, good luck using the lambda. If you
want to put it in a `std::function`, you’re still out
of luck to this day. `std::function` requires the object
it manages to be copyable, and will fail to compile if your
closure object is move-only. This is finally being addressed in C++23 with
`std::move_only_function` – but in the meantime, I have been forced to write classes with a copy
constructor that throws some sort of run-time logic exception. And even
in C++23, copyable functions remain the default, assumed situation.
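Here is a small sketch of the situation, assuming C++14 or later; the commented-out line is the one that `std::function` rejects:

```cpp
#include <functional>
#include <memory>
#include <utility>

int demo() {
    auto p = std::make_unique<int>(41);
    // C++14 init-capture: the only ergonomic way to move a variable
    // into a closure. The resulting closure is itself move-only.
    auto add_one = [q = std::move(p)] { return *q + 1; };

    // std::function<int()> f = std::move(add_one);  // error: std::function
    // requires a copyable target, and this closure's copy constructor
    // is deleted because unique_ptr is move-only.
    return add_one();
}
```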
This is strange, because most complicated objects, especially closures,
are never, and should never be, copied. Generally, copying a complicated
data structure is a mistake – a missing `&`, or a missing `std::move`.
But it is a mistake that carries no warning with it, and no visible
sign in the code that a complex, allocation-heavy action is being
undertaken. This is an early lesson for new C++ devs – don’t pass
non-primitive types by value – but it’s possible for even advanced devs
to mess up from time to time, and once the mistake is in the codebase,
it’s easy to miss.
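The invisibility of the mistake can be demonstrated with a hypothetical copy-counting wrapper:

```cpp
#include <cstddef>
#include <vector>

// A hypothetical wrapper that counts copies, to make the silent copy visible.
struct Tracked {
    static int copies;
    std::vector<int> data;
    Tracked() : data(1000) {}
    Tracked(const Tracked &other) : data(other.data) { ++copies; }
};
int Tracked::copies = 0;

// The only difference is one missing `&`, invisible at every call site.
std::size_t by_value(Tracked t) { return t.data.size(); }        // full copy
std::size_t by_cref(const Tracked &t) { return t.data.size(); }  // no copy

int demo() {
    Tracked t;
    by_value(t);  // silently copies 1000 ints
    by_cref(t);   // copies nothing
    return Tracked::copies;
}
```

Both calls look identical; only the signatures, usually in another file, reveal that one of them allocates.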
## By-Reference Parameter Papercuts
It is unergonomic to return multiple values by tuple in C++. It can be
done, but the calls to `std::tie` and `std::make_tuple` are long-winded
and distracting, not to mention that you’ll be writing unidiomatic code,
which is always bad for the people reading and debugging it.
Side note: Someone brought up structured bindings in a comment, as if they fixed the issue. Structured bindings are a great example of the half-way fixes that proponents of modern C++ love to cite. They help some, but if you think they make returning by tuple ergonomic, you’re mistaken. You still need to write either `std::pair` or `std::make_tuple` in the function’s return statement, or `std::tuple` in the function’s return type. This isn’t the worst, but it’s still not as light-weight as full first-class tuple support, and it hasn’t been enough to convince people to stop using out parameters, which are my real complaint. And even at that, it’s not that out parameters (or in-out parameters) are bad in themselves, but that they’re bad in C++, because there is no good way to express them.
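To make the structured-bindings point concrete, here is a sketch assuming C++17; the function `split_ext` is hypothetical:

```cpp
#include <cstddef>
#include <string>
#include <tuple>

// Split a filename at its last dot. Even with C++17, std::tuple must
// still be spelled out in the return type.
std::tuple<std::string, std::size_t> split_ext(const std::string &path) {
    auto dot = path.rfind('.');  // assumes the path contains a '.'
    return {path.substr(0, dot), path.size() - dot};
}

std::size_t demo() {
    // Structured bindings help at the call site...
    auto [stem, ext_len] = split_ext("notes.txt");
    // ...but they are destructuring sugar, not first-class tuple values.
    return stem.size() + ext_len;  // 5 + 4
}
```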
So what do we do instead? The clunkiness of tuples leads people to
use out parameters instead. To use an out parameter, you take
a parameter by non-`const` reference, meaning the function is supposed
to modify the parameter.
The problem is that this is only marked in the function signature. If a function takes a parameter by reference, the parameter looks exactly like a by-value parameter at the call site:
```cpp
// Return false on failure. Modify size with the actual message size,
// decreasing it if the buffer contains more than one message.
bool got_message(const char *mesg, size_t &size);

size_t size = buff.size();
got_message(buff.data(), size);
buff.resize(size);
```
If you’re reading the calling code quickly, it might look like the
`resize` call is redundant, but it is not. `size` is being modified by
`got_message`, and the only way to know that it is being modified is to
look at the function signature, which is usually in another file.
Some people prefer out parameters and in-out parameters to be passed by pointer for this very reason:
```cpp
bool got_message(const char *mesg, size_t *size);

size_t size = buff.size();
got_message(buff.data(), &size);
buff.resize(size);
```
This is great – or it would be, if pointers weren’t nullable. What does
a `nullptr` argument mean in this context? Is it going to trigger
undefined behavior? What if you forward a pointer received from your own
caller, which might itself be null? People often forget to document what
functions do with a null pointer.
This can be addressed with a non-nullable smart pointer, but very few programmers actually do this in practice. When something isn’t the default, it tends not to be used everywhere it would be appropriate. The sustainable answer is changing the default, not heroic attempts to fight human nature.
Obligatory side-gripe: at least in non-owning situations like this,
it is possible to write such a smart pointer. However, if you want to
write the obvious companion, a non-nullable owning smart pointer, a
companion version of `std::unique_ptr`, then it cannot be done in a
useful way, because such a pointer could not then be moveable.
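For the non-owning case, a minimal sketch looks like this; the name `NonNull` and its interface are hypothetical, nothing like it is in the standard library:

```cpp
#include <cstddef>
#include <cstdlib>

// A non-owning, non-nullable pointer: the invariant is checked once,
// at construction, instead of being re-documented at every call site.
template <typename T>
class NonNull {
public:
    explicit NonNull(T *p) : ptr_(p) {
        if (p == nullptr) std::abort();  // reject null up front
    }
    T &operator*() const { return *ptr_; }
    T *operator->() const { return ptr_; }
private:
    T *ptr_;  // never null after construction
};

// Callees no longer need to document what a null `size` would mean.
bool got_message_size(NonNull<std::size_t> size) {
    *size = 42;
    return true;
}
```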
## Method Implementations Can Contradict
In C++, every time you write a class, especially a lower-level one, you have a responsibility to make decisions about certain methods with special semantic importance in the programming language:
- Constructor (copy): `X(const X&)`
- Constructor (move): `X(X&&)`
- Assignment (copy): `operator=(const X&)`
- Assignment (move): `operator=(X&&)`
- Destructor: `~X()`
For many classes, the default implementations are enough, and if possible you should rely on them. Whether or not this is possible depends on whether naively copying all of the fields is a sensible way to copy the entire object, which is surprisingly easy to forget to consider.
But if you need a custom implementation of one of these, you are on the hook to write all of them. This is known as the “rule of 5.” You have to write all of them, even though the correct behavior of the two assignment operators can be completely determined by the appropriate constructor combined with the destructor. The compiler could make default implementations of the assignment operators that refer to those other functions, and therefore would always be correct, but it does not. Implementing them correctly is tricky, requiring techniques like either explicitly protecting against self-assignment, or swapping with a by-value parameter. In any case, they are boilerplate, and yet another thing that can go wrong in a programming language that has many such things.
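The swap-with-a-by-value-parameter technique mentioned above is the copy-and-swap idiom; here is a minimal sketch with a hypothetical `Buffer` class, where one by-value assignment operator serves for both copy and move assignment:

```cpp
#include <cstring>
#include <utility>

class Buffer {
public:
    explicit Buffer(const char *s) : data_(new char[std::strlen(s) + 1]) {
        std::strcpy(data_, s);
    }
    Buffer(const Buffer &other) : Buffer(other.data_) {}
    Buffer(Buffer &&other) noexcept : data_(other.data_) {
        other.data_ = nullptr;
    }
    // By value: the copy or move has already happened in the caller,
    // so this is self-assignment-safe with no explicit check.
    Buffer &operator=(Buffer other) noexcept {
        std::swap(data_, other.data_);
        return *this;  // the old buffer is freed by other's destructor
    }
    ~Buffer() { delete[] data_; }
    const char *c_str() const { return data_; }
private:
    char *data_;
};
```

Even with this idiom, the operator is pure boilerplate that the constructors and destructor already determine, which is exactly the complaint.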
Side note: One commenter did not understand what I meant. It is true that many classes can use `= default` for all these methods. However, if you customize the copy constructor or move constructor, you must then also customize the corresponding assignment operator to match, even though the default implementation could have been correct, had the language been defined more intelligently. I thought this was clear from citing the rule of 5, which essentially says this. The full rule is explained on cppreference. If you customize the copy or move constructor, the corresponding `= default` assignment operator will be wrong. Be careful! Note how the example code there does not use `= default` for the assignment operators, even though those operators contain no interesting logic.
## “Modern” C++
After seeing comments on Hacker News, I felt compelled to add this section. Every time someone complains about anything in C++, someone will mention a newer version of C++ that fixes it. These “fixes” are usually not that good, and only feel like fixes if you’re used to everything being kind of clunky.
Here’s why:
- The default way is still the old, bad way. For example, capturing lambdas by move should be the default, and `std::move_only_function`, coming in C++23, should have been the default `std::function`.
- For that reason, and because compilers don’t warn on the old, bad way, even new coders keep doing things the bad way.
Of course, I understand that this is important for backwards-compatibility. But that is the entire problem: C++ has accumulated too many bad decisions. Why was copying ever the default for passing collections as parameters, let alone for lambda capture? I know the historical reasons, but that doesn’t mean a modern programming language should work that way.
Even C++11 couldn’t clean up the fact that raw pointers and
C-style arrays get nice syntax, while smart pointers and `std::array`
look terrible. Even C++11 couldn’t hide the fact that it was retrofitting
moves onto a language designed without them.
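A small side-by-side sketch of that syntax asymmetry:

```cpp
#include <array>
#include <memory>

int demo() {
    // The legacy forms get the terse spellings...
    int raw[3] = {1, 2, 3};
    int *p = &raw[2];
    // ...while the safer modern equivalents are visibly heavier.
    std::array<int, 3> arr = {1, 2, 3};
    std::unique_ptr<int> up = std::make_unique<int>(3);
    return raw[1] + *p + arr[0] + *up;  // 2 + 3 + 1 + 3
}
```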
## Conclusion
Unfortunately, I am all too well aware of why these decisions were made, and it is exactly one reason: Compatibility with legacy code. C++ has no editions system, no way to deprecate core language features. If a new edition of C++ was made, it would cease to be C++ – though I support the efforts of people to transition C++ to new syntax and clean some of this stuff up.
However, if you ignore backwards-compatibility and the large existing codebases, none of these papercuts make the programming language more powerful or better, just harder to use. I’ve seen good-faith arguments in favor of human-maintained header files, surprising as that is to me, but I challenge my readers to tell me what is beneficial about C++’s design choices in these matters.
You might find these things trivial, but they all slow programmers down while simultaneously annoying them. If you are experienced enough, your subconscious might be adept at navigating them, but imagine what it could do if it didn’t have to. And how adept are you at spotting these mistakes in a code review of your junior colleagues’ work? If you are a rigorous reviewer, how much extra time does it take? How quickly can you find these issues when a bug arises?
We’d be more effective, more efficient, and happier if these issues were resolved. Programming would be both more enjoyable and faster. And the downside? Only a loss of continuity with history. While I can see the value in that, it is a very limited value, with very limited scope.