The Haskeller’s Hungarian Notation
When I was first learning to program, a long time ago, it was in
BASIC, and you had to annotate your variable names to indicate what
type something is. foo would be a number, whereas foo$ would be a
string. This meant that there could only be as many types of information
as there were symbols to put after your variable, but that was okay for
the sort of programming BASIC was used for. These were called sigils,
and they helped you keep straight in your head what was going on,
and made it easier for the computer too. Any aggregates had to be
explicitly declared.
Later on, I learned Perl, which had a similar system, but with a twist.
A variable named $foo could contain a number or a string (or even
some sort of object or reference), but it could only contain one
of them. It was a “scalar.” @foo would contain many scalars with
indices in an array, and %foo would contain many with string or other
keys in a hash map. The computer kept track, dynamically, of the
practical types of the scalars, and could easily do the same
for the aggregate types, but chose to instead enforce a mechanism
where the programmer would be reminded of whether it was a single
value or some sort of aggregate that was being discussed.
In Haskell terms, BASIC had you use sigils for data types, but Perl
had you use sigils for functors. And not to make people too upset by
comparing Haskell and Perl, but Haskellers regularly do the same today,
voluntarily annotating variable names with the functors by convention.
For example, dmdMenuItems might translate, in a Reflex codebase, to
Dynamic of Maybe of Dynamic of list of DomElement.
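To make the decoding concrete, here is a minimal sketch that builds that type up one layer at a time, with one prefix letter per layer. Dynamic below is a stand-in newtype so the snippet compiles without Reflex (the real type is Dynamic t a and is not a plain wrapper), and DomElement and the menu labels are hypothetical placeholders.

```haskell
-- Stand-in for Reflex's Dynamic, only so this compiles on its own.
-- Assumption: the real type is `Dynamic t a` and cannot be built like this.
newtype Dynamic a = Dynamic a

-- Hypothetical element type, named after the example in the text.
newtype DomElement = DomElement String

-- No prefix: a plain list of elements.
menuItems :: [DomElement]
menuItems = [DomElement "File", DomElement "Edit"]

-- d: wrapped in a Dynamic.
dMenuItems :: Dynamic [DomElement]
dMenuItems = Dynamic menuItems

-- md: a Maybe around that Dynamic.
mdMenuItems :: Maybe (Dynamic [DomElement])
mdMenuItems = Just dMenuItems

-- dmd: a Dynamic around the Maybe around the Dynamic.
dmdMenuItems :: Dynamic (Maybe (Dynamic [DomElement]))
dmdMenuItems = Dynamic mdMenuItems
```

Read left to right, the prefix lists the wrappers from the outermost one inward, which is the same order they appear in the type signature.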
The usage originally struck me as quite strange, and I didn’t like it.
I remember thinking the original Hungarian notation was redundant:
int iFoo; literally says int right before it. And besides, wasn’t the
point of a type system to not need extra mnemonics, because the compiler
will stop you from messing things up?
At my previous job, we used prefixes like m_ and g_ in C++ to
indicate scope (member variable/field and global, respectively), and it
similarly took me a while to adapt. In those situations, it turned out to
help because the sigils told you where to look for more information. If
there wasn’t an m_, you looked in the same function, but otherwise you
had to immediately go to the class declaration. But that wasn’t the
only advantage. What scope something was in was important in how you
treated the variable, in many subtle ways that would be bad to confuse,
and which the compiler in C++ wouldn’t really help you with.
Similarly, in Haskell, indicating what functor something is in tells you
something important: What kinds of things can you do to get a regular
value out of it? Do you need to provide a default value (Maybe), or
only provide it to versions of functions adapted for it (Dynamic),
or perhaps just keep the functor around while transforming the
values inside ((<$>), (<$$>), or (<$$$>), depending on how many
functors there are)? And while the compiler will
help us with this, it’s something it’s convenient to see all the
time, and the type of each individual variable is sometimes
inferred and in any case not visible at every use site.
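A small sketch of those options, using only Maybe so that it runs against base. The (<$$>) operator is defined locally here, on the assumption that the stacked operators mean fmap composed with itself; a codebase that uses them will define or import its own. The names mCount and mmCount are made up for the illustration.

```haskell
import Data.Maybe (fromMaybe)

-- Assumption: (<$$>) maps under two functor layers. Not part of base.
infixl 4 <$$>
(<$$>) :: (Functor f, Functor g) => (a -> b) -> f (g a) -> f (g b)
(<$$>) = fmap . fmap

-- m: one Maybe layer.
mCount :: Maybe Int
mCount = Just 3

-- Get a regular value out by supplying a default.
count :: Int
count = fromMaybe 0 mCount

-- Or keep the functor and transform the value inside it.
mDoubled :: Maybe Int
mDoubled = (* 2) <$> mCount

-- mm: two Maybe layers, so the two-layer operator is the one to reach for.
mmCount :: Maybe (Maybe Int)
mmCount = Just (Just 3)

mmDoubled :: Maybe (Maybe Int)
mmDoubled = (* 2) <$$> mmCount
```

The number of prefix letters on the argument matches the number of dollar signs in the operator, which is the kind of thing that is handy to see without asking the compiler.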
And when we do write the pure function or the lambda or the fromMaybe
or the dyn_ $ ffor ..., what do we name the new variable? Many times
we have many variables with the exact same semantic role, the only
difference being what functors they’ve been wrapped with. We want to say
ffor dSelectedId $ \selectedId -> ... or
fmap (\number -> number + 1) eNumber or
let fish = fromMaybe defaultFish mFish. The alternative is, what,
judicious use of ' for the different but analogous variables? The
difference between these variables, intuitively, is how wrapped up
in functors they are, and that should also be the difference in
their names.
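For comparison, here is a rough sketch of the same function written both ways, once with prefixes and once with primes. It uses Maybe only, so it needs nothing beyond base; the Fish type, defaultFish value, and function names are hypothetical, and bumpNumber takes a Maybe where the text's example takes an Event.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical domain type for the illustration.
newtype Fish = Fish String

fishName :: Fish -> String
fishName (Fish name) = name

defaultFish :: Fish
defaultFish = Fish "goldfish"

-- Prefix style: mFish is the wrapped value, fish is the unwrapped one.
describeFish :: Maybe Fish -> String
describeFish mFish =
  let fish = fromMaybe defaultFish mFish
  in "a " ++ fishName fish

-- Prime style: the names say the two are related, but not which is wrapped.
describeFishPrimed :: Maybe Fish -> String
describeFishPrimed fish =
  let fish' = fromMaybe defaultFish fish
  in "a " ++ fishName fish'

-- The same pairing inside a lambda: mNumber outside, number inside.
bumpNumber :: Maybe Int -> Maybe Int
bumpNumber mNumber = fmap (\number -> number + 1) mNumber
```

Both versions compile to the same thing; the only difference is whether a reader can tell from a name alone how many layers are still in the way.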
And I’ve decided this is a good thing. Conventionalized terseness is the least
problematic type of terseness. Single-letter abbreviations are
great if they communicate information efficiently and everyone
agrees on what they mean. I’ve seen dyn and may as well,
and I prefer d and m, as they are easier to stack up without
getting too unwieldy, and besides, dyn is used for functions and
may is also a verb (does mayFish mean something that’s a Maybe Fish
or a boolean about whether you are permitted to fish?).
And so, in spite of my initial skepticism, I’ve come to like this naming convention, and I recommend it to all of you as well.