The Haskeller’s Hungarian Notation
When I was first learning to program, a long time ago, it was in
BASIC, and you had to annotate your variable names to indicate what
type each one held. foo would be a number, whereas foo$ would be a
string. This meant that there could only be as many types of information
as there were symbols to put after your variable, but that was okay for
the sort of programming BASIC was used for. These were called sigils,
and they helped you keep straight in your head what was going on,
and made it easier for the computer too. Any aggregates had to be
explicitly declared.
Later on, I learned Perl, which had a similar system, but with a twist.
A variable named $foo could contain a number or a string — or even
some sort of object or reference — but it could only contain one
of them. It was a “scalar.” @foo would contain many scalars with
indices in an array, and %foo would contain many with string or other
keys in a hash map. Perl kept track, dynamically, of the
actual types of the scalars, and could easily have done the same
for the aggregate types, but the language instead chose to enforce
a mechanism that reminded the programmer whether a single
value or some sort of aggregate was being discussed.
In Haskell terms, BASIC had you use sigils for data types, but Perl
had you use sigils for functors. And not to make people too upset by
comparing Haskell and Perl, but Haskellers regularly do the same today,
voluntarily annotating variable names with the functors by convention.
For example, dmdMenuItems might translate, in a Reflex codebase, to
Dynamic of Maybe of Dynamic of list of DomElement.
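To make the prefix reading concrete, here is a small sketch. It does not use Reflex itself: Dynamic and DomElement below are simplified stand-ins I am assuming for illustration (the real Reflex Dynamic takes an extra timeline parameter and changes over time), and menuItemCount is a hypothetical helper.

```haskell
module Main where

-- Hypothetical stand-ins so the example compiles without Reflex.
-- Reflex's real Dynamic has an extra timeline parameter (Dynamic t a)
-- and changes over time; this one just wraps a single value.
newtype Dynamic a = Dynamic a

newtype DomElement = DomElement String

-- Read the prefix left to right: d, m, d =
-- Dynamic of Maybe of Dynamic of (list of DomElement).
dmdMenuItems :: Dynamic (Maybe (Dynamic [DomElement]))
dmdMenuItems = Dynamic (Just (Dynamic [DomElement "File", DomElement "Edit"]))

-- Peeling off one layer drops one letter from the prefix.
mdMenuItems :: Maybe (Dynamic [DomElement])
mdMenuItems = case dmdMenuItems of
  Dynamic inner -> inner

-- Hypothetical helper: count the items, or 0 if there is no menu.
menuItemCount :: Maybe (Dynamic [DomElement]) -> Int
menuItemCount Nothing = 0
menuItemCount (Just (Dynamic items)) = length items

main :: IO ()
main = print (menuItemCount mdMenuItems)
```

Each letter in the prefix corresponds to one layer in the type signature, so the name shrinks as the layers come off.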
The usage originally struck me as quite strange, and I didn’t like it.
I remember thinking the original Hungarian notation was redundant:
int iFoo; literally says int right before it. And besides, wasn’t the
point of a type system to not need extra mnemonics, because the compiler
will stop you from messing things up?
At my previous job, we used prefixes like m_ and g_ in C++ to
indicate scope (member variable/field and global, respectively), and it
similarly took me a while to adapt. In those situations, it turned out to
help because the sigils told you where to look for more information. If
there wasn’t an m_, you looked in the same function, but otherwise you
had to immediately go to the class declaration. But that wasn’t the
only advantage. What scope something was in was important in how you
treated the variable, in many subtle ways that would be bad to confuse,
and which the compiler in C++ wouldn’t really help you with.
Similarly, in Haskell, indicating what functor something is in tells you
something important: what do you have to do to get a regular
value out of it? Do you need to provide a default value (Maybe)?
Can you only pass it to versions of functions adapted for it (Dynamic)?
Or do you just keep the functor around while transforming the
values inside, using (<$>), (<$$>), or (<$$$>) depending on
how many functors are stacked up? And while the compiler will
help us with this, it’s convenient to see it all the
time, and the type of an individual variable is sometimes
inferred and never immediately visible at every use.
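Here is a rough sketch of those options using only base. Since Reflex is not available here, a list stands in as the second functor and the l prefix for it is my own assumption. The (<$$>) and (<$$$>) operators are not part of base, so I define them locally; packages that export the same names may define them differently.

```haskell
module Main where

import Data.Maybe (fromMaybe)

-- Locally defined helpers; these operators are not in base, and
-- packages that export the same names may define them differently.
infixl 4 <$$>, <$$$>

(<$$>) :: (Functor f, Functor g) => (a -> b) -> f (g a) -> f (g b)
(<$$>) = fmap . fmap

(<$$$>) :: (Functor f, Functor g, Functor h)
        => (a -> b) -> f (g (h a)) -> f (g (h b))
(<$$$>) = fmap . fmap . fmap

mCount :: Maybe Int
mCount = Just 3

lmCounts :: [Maybe Int]
lmCounts = [Just 1, Nothing, Just 3]

mlmCounts :: Maybe [Maybe Int]
mlmCounts = Just lmCounts

-- Maybe: leave the functor by supplying a default value.
count :: Int
count = fromMaybe 0 mCount

-- One layer: stay inside the functor with (<$>).
mNext :: Maybe Int
mNext = (+ 1) <$> mCount

-- Two layers: stay inside both with (<$$>).
lmNext :: [Maybe Int]
lmNext = (+ 1) <$$> lmCounts

-- Three layers: stay inside all of them with (<$$$>).
mlmNext :: Maybe [Maybe Int]
mlmNext = (+ 1) <$$$> mlmCounts

main :: IO ()
main = do
  print count
  print mNext
  print lmNext
  print mlmNext
```

The number of prefix letters on the variable tells you how many dollar signs the operator needs.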
And when we do write the pure function or the lambda or the fromMaybe
or the dyn_ $ ffor ..., what do we name the new variable? Many times
we have many variables with the exact same semantic role, the only
difference being what functors they’ve been wrapped with. We want to say
ffor dSelectedId $ \selectedId -> ... or
fmap (\number -> number + 1) eNumber or
let fish = fromMaybe defaultFish mFish. The alternative is, what,
judicious use of ' for the different but analogous variables? The
difference between these variables, intuitively, is how wrapped up
in functors they are, and that should also be the difference in
their names.
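A minimal sketch of that naming pattern, again using only base. The Fish type, defaultFish value, and describeCatch and increment functions are hypothetical, and Maybe stands in for the Reflex Event in the fmap example, so the prefix there is m rather than e.

```haskell
module Main where

import Data.Maybe (fromMaybe)

-- Hypothetical type for illustration.
newtype Fish = Fish String
  deriving (Eq, Show)

defaultFish :: Fish
defaultFish = Fish "goldfish"

-- The m prefix marks the Maybe-wrapped version; the bare name is
-- reserved for the unwrapped value with the same semantic role.
describeCatch :: Maybe Fish -> String
describeCatch mFish =
  let fish = fromMaybe defaultFish mFish
      Fish name = fish
  in "caught a " ++ name

-- Same idea with fmap: mNumber outside the lambda, number inside.
increment :: Maybe Int -> Maybe Int
increment mNumber = fmap (\number -> number + 1) mNumber

main :: IO ()
main = do
  putStrLn (describeCatch (Just (Fish "trout")))
  putStrLn (describeCatch Nothing)
  print (increment (Just 41))
```

Without the prefix, the wrapped and unwrapped versions would be competing for the same name.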
And I’ve decided this is a good thing. Conventionalized terseness is the least
problematic type of terseness. Single-letter abbreviations are
great if they communicate information efficiently and everyone
agrees on what they mean. I’ve seen dyn and may used as prefixes as well,
and I prefer d and m, as they are easier to stack up without
getting too unwieldy. Besides, dyn is already used in function names, and
may is also a verb (does mayFish mean something that’s a Maybe Fish
or a boolean about whether you are permitted to fish?).
And so, in spite of my initial skepticism, I’ve come to like this naming convention, and I recommend it to all of you as well.