What Bits Mean: Meta-Data and Static Typing
This is part of my new series on what the 0’s and 1’s in computers mean, how computers use them to store various kinds of information, and why all of this works the way it does.
When I was a boy, my schoolmates, knowing that I was interested in computers, would sometimes ask me if I could read binary. They imagined I would see some binary, and be able to read it out loud like they could read letters, perhaps some binary that looked like this:
01000111
01001001
01000110
01010100
I’m not sure how I handled this situation as a boy – I’m sure it was plenty awkward and convoluted because my memory of it is blanked out. But I have a question in response now, and I offer it to you, my reader: Do you know how to read letters?
Perhaps, if you do, you can tell me what this sequence of letters means. I will tell you that I saw it written on a mysterious bottle of mysterious liquid:
GIFT
Now, perhaps you are very confident you know. But perhaps you want to ask a follow-up question. Because that sequence of letters can mean “present, item that has been given to you, free distribution of a good” – if we are assuming it is an English word. If we instead assume it is a German word, well, then it means “poison.” Very different. (And perhaps in either case we shouldn’t drink mysterious liquids, even out of mysterious bottles that are only hypothetical – perhaps especially out of ones that are only hypothetical.)
So yes, letters are symbols, but they only have meaning in the context of a language to interpret them. The same series of symbols can mean two different words in two different languages.
Similarly, the binary I listed above could have different interpretations,
depending on what type the data has. If interpreted as text with
an ASCII character encoding, it says “GIFT” (with no indication,
of course, whether that means poison or present). If interpreted as a
32-bit unsigned integer in little endian (increasing addresses from the
top of the screen to the bottom), it is 1413892423.
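Both readings can be checked in a few lines of Python. This is only a minimal sketch: the variable names are made up for illustration, and the big-endian reading is included purely for contrast.

```python
# The four bytes from the top of the post, written as strings of 0's and 1's.
bits = ["01000111", "01001001", "01000110", "01010100"]

# Turn each 8-character string into one byte value.
raw = bytes(int(b, 2) for b in bits)

# Interpretation 1: ASCII text.
as_text = raw.decode("ascii")

# Interpretation 2: 32-bit unsigned integer, little endian
# (the first byte is the least significant one).
as_little_endian = int.from_bytes(raw, "little")

# Interpretation 3: same integer type, but big endian.
as_big_endian = int.from_bytes(raw, "big")

print(as_text)
print(as_little_endian)
print(as_big_endian)
```

The bytes never change between the three lines of output; only the rule used to read them does.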
Now, like with language in most situations (especially ones that don’t
involve mysterious bottles), we can use context clues to guess that
it is more likely that I, Jimmy Hartzell, the author of (or at least
the poster of) those bits, chose them to represent the word GIFT
rather than the number 1413892423, a number with no relevance to the
price of tea in China.
But computers can’t use context clues, certainly not in a probabilistic,
critical-thinking based way. Or at least, traditionally they can’t! And
they certainly can’t at the speed and reliability needed to do their
normal day-to-day work. Computers need determinism! They need mechanisms
guaranteed to tell them whether those bits written above, those 1’s and
0’s, were ASCII text spelling GIFT or a (32-bit unsigned little endian
integer) number, specifically 1413892423, or some other interpretation,
like an 8 pixel by 4 pixel black and white image, or perhaps just garbage
that just happened to be in unallocated memory, ready to be overwritten
by something more useful.
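The image reading is easy to try as well. The sketch below assumes one byte per row and that a 1 means a black pixel; both are conventions picked for this illustration, not anything the bits themselves say.

```python
# The same four bytes, one per row of an 8-by-4 picture.
bits = ["01000111", "01001001", "01000110", "01010100"]

def render(rows):
    """Draw each 1 as '#' and each 0 as '.', one row of pixels per byte."""
    return ["".join("#" if bit == "1" else "." for bit in row) for row in rows]

picture = render(bits)
for line in picture:
    print(line)
```

The result is not a recognizable picture, which is itself a context clue that these bits were probably not meant as an image.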
Now, there are myriad ways that computers accomplish this. It differs by computer platform and operating system and programming language. But some of the simpler ones are familiar to any computer user.
One way of figuring out what interpretation to use for bits is meta-data – bits that are interpreted to mean things about how to interpret other bits. You may have heard the term meta-data before, and you certainly know some examples.
Meta-data is like the labels on a form. Here is an example form without labels:
Jimmy
Hartzell
Male
Pennsylvania
United States of America
If you see this form, you can probably guess that it provides my given name, surname, gender, state of residence, and country of citizenship.
But some people are named Virginia, and some people live in the state of Virginia, so there’s always room for confusion! And from Dune I’ve learned that at least some fictional people have the word “Idaho” attached to them, and it’s not a state but a surname. For these and other reasons, in practice, bureaucracies (which like computers have an allergy for confusion and a need for objective, consistent processes that they will follow against any and all opposing forces of common sense) use labels on their forms:
Given Name: Jimmy
Surname: Hartzell
Gender: Male
State of Residence: Pennsylvania
Nationality: United States of America
Even so, these labels are only useful if you know how to read the language. Even meta-data has to have some interpretative lens. Additionally, oftentimes, a bureaucratic form becomes invalid (and gets rejected by the authorities) if you start moving fields around, or start adding your own fields. If I renewed my driver’s license, but decided to draw up my own form, it would be rejected, even if my meta-data were abundantly clear and it had all the data they wanted:
Favorite Color: Blue
Musical Instruments: Piano, recorder, trombone, vocals
Given name: Jimmy
Nationality: United States of America
Surname: Hartzell
State of Residence: Pennsylvania
State of Mind: Happy
Depending on what kind of computer system you’re dealing with, the computer might or might not mind adding additional fields – also depending on how fields are defined and how the meta-data is structured and what the format is for combining the meta-data with the data. It’s all quite complicated.
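To make the difference concrete, here is a rough sketch of two hypothetical form readers: one that rejects any unexpected field, and one that ignores extras. The set of required labels and the function names are assumptions made up for this example.

```python
# Hypothetical list of labels the "authorities" expect.
REQUIRED_FIELDS = {"Given Name", "Surname", "State of Residence", "Nationality"}

def parse_form(text):
    """Split 'Label: value' lines into a dictionary keyed by label."""
    fields = {}
    for line in text.strip().splitlines():
        label, _, value = line.partition(":")
        fields[label.strip()] = value.strip()
    return fields

def accept_strict(fields):
    """Accept only forms with exactly the expected labels."""
    return set(fields) == REQUIRED_FIELDS

def accept_lenient(fields):
    """Accept any form that has at least the expected labels."""
    return REQUIRED_FIELDS <= set(fields)

homemade = parse_form("""
Favorite Color: Blue
Given Name: Jimmy
Nationality: United States of America
Surname: Hartzell
State of Residence: Pennsylvania
State of Mind: Happy
""")

print(accept_strict(homemade))   # the extra fields get the form rejected
print(accept_lenient(homemade))  # the extra fields are simply ignored
```

Because the labels travel with the values, neither reader cares about the order of the fields; they differ only in how they treat labels they did not ask for.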
One common type of meta-data is file extensions. A file with a name ending
in .docx is a Word Document, and when you (in this scenario perhaps
you are a Microsoft Windows™ user) double-click on it in Windows’s
file management program (is it still called Windows Explorer?), the
program Microsoft Word™ will load to open it. If you rename any
old file to end in .docx, Windows will still try to open it in Word,
and then Word will yell at you that it can’t open it. (Oddly
enough, if you rename a real Word document to end in .zip instead, it
will unzip just fine – Word documents are also zip files.)
How’s Windows know to open Word? Why’s it do it even if it’s not a valid
Word document? It’s the extension. But not only the extension! It has
configuration in the registry (at least it did at one time – do they
still use a registry?) that associates the extension .docx with Word.
Hopefully, that was the intention of the person who created the file,
and you would imagine it was; otherwise they wouldn’t have named it
that.
But even this convention depends on the context of the registry, not to mention the whole NTFS filesystem that Windows is probably using to tell which parts of the hard drive correspond to which named files in which folders.
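The lookup itself can be pictured as a small table from extensions to programs. The following is only a toy model with a made-up table and function name; it is not how Windows actually stores or consults its associations.

```python
import os.path

# Hypothetical association table; a real one lives in operating system configuration.
ASSOCIATIONS = {
    ".docx": "Microsoft Word",
    ".zip": "archive tool",
    ".txt": "text editor",
}

def program_for(filename):
    """Pick a program from the file name alone, never looking at the contents."""
    _, extension = os.path.splitext(filename)
    return ASSOCIATIONS.get(extension.lower())

print(program_for("letter.docx"))
print(program_for("letter.zip"))
print(program_for("letter"))
```

Note that `program_for` never opens the file, which is why renaming a file changes which program gets launched without changing a single bit of its contents.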
You could also imagine a system where there were no file extensions and no file metadata. If you wanted to open a Word document, you would have to open Word first, and with it select a file to open. It would then try to open it as a Word document, and either you’d get something sensical or not depending on whether you were right about what program to use for that file. The onus would then be on the user for what program to use to open what file.
Perhaps the user could use their own metadata system, and have a Word
document that they remember is a Word document, in which they write
which program to use to open which file. Or perhaps the user can try
different programs until they find one that makes sense. Perhaps
the user can use specialized but ultimately fallible tools like file
to see if there are any (relatively rigorous and consistent) clues as
to the file type. Or the user may simply remember inside their own memory.
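One kind of clue such a tool can look for is a well-known byte sequence at the start of the file. Here is a heavily simplified sketch of that idea; the short signature list and the function name are chosen for illustration, and real tools check far more than a prefix.

```python
# A few well-known file signatures ("magic numbers") and what they usually indicate.
SIGNATURES = [
    (b"%PDF-", "PDF document"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"GIF8", "GIF image"),
    (b"PK\x03\x04", "zip archive (possibly a .docx)"),
]

def guess_type(data):
    """Guess a file type from its first bytes; return None if nothing matches."""
    for prefix, description in SIGNATURES:
        if data.startswith(prefix):
            return description
    return None

print(guess_type(b"PK\x03\x04 rest of a Word document"))
print(guess_type(b"GIF89a rest of an image"))
print(guess_type(b"GIFT"))
```

The guess is fallible in both directions: the bytes GIFT match nothing in the list, and a plain text file that happens to begin with %PDF- would be reported as a PDF.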
All of this is complicated, but that’s the world we live in. Symbols don’t have intrinsic meaning, and there is no inherent right language or right way to speak any language. There is no one way to read binary, and it is even more complicated than this essay implies, or than you might ever have guessed.
This extends into programming languages. In Python, variables
have no type. You can use the same variable foo and put
a number like 33 or text like "GIFT" into it. If you try
to do an operation that doesn’t make sense, you get an error
when you reach that operation, but not beforehand:
import random
if random.randint(0, 1) == 0:
    foo = "Hi"
else:
    foo = 33
print(foo)
print(foo + 1)
Half the time, this prints 33 and then 34. The other half, it prints
“Hi” and then outputs an error message. Python is using meta-data to
keep track of whether foo is a number or a string. That meta-data is
in a format that makes sense to the Python interpreter, and allows the
Python interpreter to inspect foo to see what type it is. If foo + 1
makes sense given that type, it does it. If it doesn’t, it displays
an error on the spot.
This prevents it from misinterpreting data. The text “GIFT” will never be misinterpreted as the number 1413892423, because it won’t have the right meta-data. Any Python code that works on numbers will instead show an error message if the wrong meta-data is present.
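That run-time meta-data can be inspected directly. In this small sketch, the helper name `try_add_one` is invented for the example; `type` and `TypeError` are standard Python.

```python
def try_add_one(value):
    """Return (type name, result); result is None when + 1 is not defined."""
    kind = type(value).__name__  # the run-time meta-data Python keeps with the value
    try:
        return kind, value + 1
    except TypeError:
        # Python checked the meta-data and refused, rather than reinterpreting bits.
        return kind, None

print(try_add_one(33))
print(try_add_one("GIFT"))
```

The string case fails with a `TypeError` instead of quietly producing 1413892424, which is exactly the protection described above.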
What about a language like Rust? Rust also keeps track of types, but it does so without using meta-data like this. Rust takes your Rust program, and converts it into machine code that runs directly on your computer, a process known as compilation. That machine code is a series of instructions that are guaranteed to respect type safety (as long as you either don’t use unsafe Rust features or else only use them according to the strict rules Rust requires), so that if you write data interpreted as a number, the data is also read as a number.
Once the program is running, it doesn’t use meta-data to accomplish this. Instead, it is more like the user who knows to open Microsoft Word before opening a Word document. The instructions know to do operations on the right values. If they load a memory address to do math on it, it is because that memory address is known to the Rust compiler to be the type of data that math can be used for.
In this way, Rust is like a clever programmer who only writes correct
code. If they store an integer in address 0xffffd9718c6c, and they
load that value later, the programmer will remember in their brain that
they should expect it to be stored as an integer. The resulting program
works because the programmer wrote it in such a way that it would work,
even though this information isn’t written down anywhere, because
it uses addresses consistently.
The same is true of programs compiled by the Rust compiler. Once the compiler is done, it is not written down anywhere what type a variable has. At a computer level, the program is just written in such a way as to use data consistently.
This is more efficient, as Rust programs don’t need to take up extra memory for the meta-data. However, it does mean that the Python program we wrote above won’t work in Rust. We can’t even compile a program that tries to set a variable to two values of different types. There’s nowhere to write down the type information.
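The memory cost of carrying meta-data around can be glimpsed from Python itself. This sketch is specific to the CPython interpreter, and the exact size it reports varies by version and platform, so treat the number as indicative only.

```python
import struct
import sys

# Just the four bytes of a 32-bit unsigned integer, with no type information attached.
bare = struct.pack("<I", 33)

# The size CPython reports for the integer object 33, which also carries
# bookkeeping such as a pointer to its type.
boxed_size = sys.getsizeof(33)

print(len(bare), boxed_size)
```

The gap between the two numbers is roughly the price of letting the interpreter ask any value what type it is.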
Let’s try to write an equivalent Rust program and see what happens.
use rand::Rng;

fn main() {
    let mut rng = rand::thread_rng();
    let test = rng.gen();
    let foo;
    if test {
        foo = 33;
    } else {
        foo = "Hi";
    }
    println!("{foo}");
}
In this case, you get an error:
   Compiling TypePun v0.1.0 (/home/jim/hobby/TypePun)
error[E0308]: mismatched types
  --> src/main.rs:10:15
   |
6  |     let foo;
   |         --- expected due to the type of this binding
...
10 |         foo = "Hi";
   |               ^^^^ expected integer, found `&str`
That makes sense, because Rust is keeping track of what
type foo is supposed to be, so it can use it consistently.
It can’t vary from run to run of the program, because that
information isn’t written down anywhere. The value of foo
can vary, of course – it wouldn’t be a good variable if
it couldn’t – but the type, the interpretation
of foo’s bits, cannot.
Of course, Rust can do everything Python can. In this case, you could tell Rust yourself to use a new type that uses meta-data to keep track of what type an inner value is. You can even do the math on it if it’s a number.
It gets complicated fast, since you have to define a
new type, here StringOrInt, that indicates how to
not only interpret the data in the value, but also
the meta-data of what type of value it is. That outer
type, however, is not stored in the resulting program
as meta-meta-data.
use rand::Rng;

fn main() {
    let mut rng = rand::thread_rng();
    let test = rng.gen();
    enum StringOrInt {
        String(String),
        Int(u32),
    }
    let foo;
    if test {
        foo = StringOrInt::Int(33);
    } else {
        foo = StringOrInt::String("Hi".to_string());
    }
    match foo {
        StringOrInt::Int(foo) => {
            println!("{foo}");
            println!("{}", foo + 1);
        },
        StringOrInt::String(foo) => {
            println!("{foo}");
        },
    }
}
If you were to write a Python interpreter in Rust, you would have to do
something like this for every variable, where you create a type that can
contain multiple inner types. This is only one example of a technique
that does this, where we created an enum type, but there are others,
like “trait objects.” They all work according to similar principles:
Rust needs to know explicitly that you want meta-data to keep track of
additional information, and what style of meta-data you want.
Note that, in Rust, it still knows at compile-time whether + is
an appropriate operation.
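The general idea of "a tag that says how to read the rest" can be sketched at the byte level. The layout below (one tag byte followed by the payload) and all of the names are invented for illustration; it is not a description of how the Rust compiler actually lays out an enum.

```python
import struct

# Hypothetical tag values: the meta-data that says how to read the payload.
TAG_INT = 0
TAG_STRING = 1

def encode(value):
    """Prefix the payload bytes with a one-byte tag describing their type."""
    if isinstance(value, int):
        return bytes([TAG_INT]) + struct.pack("<I", value)
    if isinstance(value, str):
        return bytes([TAG_STRING]) + value.encode("utf-8")
    raise TypeError("only int and str are supported in this sketch")

def decode(blob):
    """Read the tag first, then interpret the payload accordingly."""
    tag, payload = blob[0], blob[1:]
    if tag == TAG_INT:
        return struct.unpack("<I", payload)[0]
    if tag == TAG_STRING:
        return payload.decode("utf-8")
    raise ValueError("unknown tag")

print(encode("GIFT"))
print(encode(1413892423))
print(decode(encode("GIFT")), decode(encode(1413892423)))
```

The two encodings share the same four payload bytes and differ only in the tag, so the tag is the only thing standing between the word and the number.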
I mentioned something about safety earlier. You can get Rust to
violate its rules with unsafe. This results in undefined behavior
in general, and so the results you get with unsafe are not
guaranteed to be consistent. However, we can use this to demonstrate
what happens if Rust were to get its type information wrong.
fn main() {
    let foo = "GIFT";
    let foo_ptr: *const str = &*foo;
    // Safety: This just is unsafe.
    let foo_number = unsafe { *(foo_ptr as *const u32) };
    println!("{foo_number}");
}
The key here is this line:
let foo_number = unsafe { *(foo_ptr as *const u32) };
This means something like this:
Rust, I know you’re keeping track of what types go with what memory addresses. I know foo_ptr is a memory address of text (*const str means pointer to str, and str means text). But I want you to pretend it’s a pointer to an unsigned 32-bit integer (which is little endian on most machines, including the author’s MacBook M1 which has an ARM64 processor), and read it according to that interpretation instead, letting me do operations appropriate to that interpretation.
And it prints, of course, on my machine:
1413892423
If we’d done println!("{foo}"), we would’ve gotten:
GIFT
The same data is passed to println!, but what it actually does is based
on the type of the data. Again, this type is not tracked explicitly in
the outputted machine code. Rust just makes sure that the machine code
is appropriate for types that make sense.
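Python has a rough, safe analogue of this kind of reinterpretation in `memoryview.cast`, which re-reads an existing buffer under a different element type without copying it. The sketch assumes the C `unsigned int` behind format `"I"` is four bytes wide, which holds on mainstream platforms, and the result depends on the machine’s byte order.

```python
import sys

text_bytes = b"GIFT"

# Reinterpret the same four bytes as one native-endian unsigned 32-bit integer.
# cast("I") changes how the buffer is read; it does not copy or convert the bytes.
as_number = memoryview(text_bytes).cast("I")[0]

print(sys.byteorder, as_number)
```

On a little-endian machine this prints 1413892423, matching the Rust output above; a big-endian machine would read the same bytes as a different number.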
This mechanism that Rust uses is called static typing, where instead of using meta-data like Python does, Rust creates a program that does the right thing, or else rejects a program that does something nonsensical or incoherent (or else fails to reject it because you tell Rust you know what you’re doing is unsafe).
Static typing has many uses. It is primarily used to make sure that you
only do operations that make sense for the type you have. Some
operations do different things to different types – + means
one hardware operation for an integer like u32, and something
else for a floating point like f32, and static typing also
keeps track of that. You can create new operations like that –
they are called polymorphic.
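To see why + cannot be a single hardware operation for both, it helps to look at how differently the two types spell the same value. This sketch uses Python’s `struct` module to show the 32 bits of the integer 1 next to the 32 bits of the float 1.0; the helper name `bits_of` is invented for the example.

```python
import struct

def bits_of(packed):
    """Show packed bytes as a 32-character string, most significant bit first."""
    return format(int.from_bytes(packed, "big"), "032b")

int_one = bits_of(struct.pack(">I", 1))      # 1 as a 32-bit unsigned integer
float_one = bits_of(struct.pack(">f", 1.0))  # 1.0 as an IEEE 754 single-precision float

print(int_one)
print(float_one)
```

The integer has a single 1 in the last position, while the float spreads its value across sign, exponent, and fraction fields, so adding them requires different circuitry and the compiler has to know which one to emit.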
Static typing is also used to reject programs where that is not possible,
where you write according to one binary format and read from another,
unless you use unsafe to override these checks. The resulting programs
would otherwise be incoherent and nonsensical, which could lead to memory
corruption, especially if the optimizer is involved, which assumes you’re
following the rules when it modifies the program to make it faster.
Static typing can also be used by creating custom types. These custom
types might mean specific things in a certain context, to distinguish
bits in more detailed ways than the built-in types do. Are three
f64 values a color (red, green, and blue) or a coordinate in a
three-dimensional grid (X, Y, and Z)? Two types can be created
to distinguish them, beyond what the built-in type of f64 already
does:
struct ThreeDimensionalCoordinate {
    x: f64,
    y: f64,
    z: f64,
}

struct Color {
    red: f64,
    green: f64,
    blue: f64,
}

fn draw(coord: ThreeDimensionalCoordinate, color: Color) {
    // ...
}
Now, if there are 3 f64 values in a row, we can use Rust’s
static typing system not to just track that they’re all f64 values,
but whether they together represent a color or a coordinate. Otherwise,
a user might accidentally mix them up calling the draw function,
and the program might do something illogical.
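For contrast, here is a sketch of the same two types in Python, where the distinction is enforced with run-time meta-data instead. The class and field names mirror the Rust example; the `isinstance` checks and the returned string are assumptions added so the example does something observable.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ThreeDimensionalCoordinate:
    x: float
    y: float
    z: float

@dataclass(frozen=True)
class Color:
    red: float
    green: float
    blue: float

def draw(coord, color):
    """Refuse swapped arguments, using the run-time type attached to each value."""
    if not isinstance(coord, ThreeDimensionalCoordinate):
        raise TypeError("coord must be a ThreeDimensionalCoordinate")
    if not isinstance(color, Color):
        raise TypeError("color must be a Color")
    return f"drawing at ({coord.x}, {coord.y}, {coord.z})"

point = ThreeDimensionalCoordinate(1.0, 2.0, 3.0)
red = Color(1.0, 0.0, 0.0)
print(draw(point, red))
```

Calling `draw(red, point)` raises a `TypeError`, but only when that line actually runs, whereas the Rust version refuses to compile in the first place.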
So, static typing prevents incoherent code. It does it before you get a chance to run it, making it easier to catch bugs. And it makes it so you need less meta-data at run time (though some programming languages leverage both static typing and run-time type meta-data).