Defaulting to Zero

The activity of programming is sometimes divided into high-level system architecture and low-level implementation — the idea being that an experienced programmer can create a high-level design of a system and then hand it over to a more junior programmer for coding. Thus, we get two separate worlds, one of design specifications and UML diagrams and one of actual programming.

I’ve always found this divide to be artificial. To me, programming is all design. Even in low-level implementation we are constantly picking approaches, finding solutions, and choosing to do things one way or another, based on performance constraints, readability, extensibility, or whatever other goals we are trying to achieve. Fully designing a system is the same thing as writing the code for it. It’s architecture all the way down.

Programming is a special kind of activity: it lives entirely in a virtual world. We build virtual artifacts with tools that are themselves virtual and can be extended. In “real world engineering” there are real physical things to be done, such as hammering in two billion rivets. In the virtual world, any such trivial, repeatable task can be automated with a single line of code. Thus, it is the blessing and the curse of programmers to always work in the realm of the non-trivial. A blessing, because it keeps our minds occupied and our jobs secure. A curse, because there is always a risk of failure or not delivering on time. Note that non-trivial does not necessarily mean hard; it might just as well mean messy, not well specified, or intractable.

I often find that when I work on the low-level implementation of something, I discover ideas that I can bring back and use to inform the high-level design: to make it easier to work with, more performant, more orthogonal, etc. Thus, the flow goes back and forth between high-level and low-level, instead of in just one direction. If I’m stuck in some part of the high-level design, starting to work on the implementation is often the best way to get unstuck.

One such idea, which is really simple but tends to lead to better and simpler code, is defaulting to zero: always use 0 as the default or nil value. For example, sometimes you see code like this:

// Returns the index of the bone with the specified name, or
// 0xffffffffU if the bone couldn't be found.
uint32_t find_bone(const char *name);

Anyone calling this function must now write something like:

uint32_t bi = find_bone("ulna");
if (bi != 0xfffffffU) {
    float f = bone_length(bi);
    ...
}

This is not only hard to read and error prone (can you spot the error in the code above?), but also brittle. For example, if you later need to widen the bone index to a uint64_t, every test against the magic number has to be rewritten. There are ways around that, for example, you could introduce:

enum {NO_BONE = 0xffffffffU};

But a better solution, in my opinion, is to just let 0 mean “no bone”. Now you don’t need to test against a magic number or enum and can instead just write:

if (bi) 

And a default initialized bone variable will now have the meaning “no bone”, which is a lot better than having it mean “whatever bone happens to be at index 0, or an invalid value in case there are zero bones in the model”.

In structs, using zero as the default value for members means that you can default initialize the entire struct with just:

struct bone ulna = {0};

And in modern C you can use the designated initializer construct to assign just a few specific fields of the struct to non-default values:

struct bone ulna = {
    .name = "ulna",
    .length = 1.0f,
};

The fields not specified in the designated initializer will be zero-initialized.
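To make the zero-fill behavior concrete, here is a minimal sketch. The layout of struct bone is an assumption made for illustration (the fields parent and num_children are hypothetical), since the real struct is not shown here.

```c
#include <stdint.h>

// Hypothetical bone layout, assumed for this example only.
struct bone {
    const char *name;
    float length;
    uint32_t parent;        // 0 means "no parent"
    uint32_t num_children;
};

// Sets only two fields by name. Every field that is not named in the
// initializer (parent, num_children) is zero-initialized.
struct bone make_ulna(void)
{
    struct bone ulna = {
        .name = "ulna",
        .length = 1.0f,
    };
    return ulna;
}
```

With 0 meaning “no parent”, the bone returned by make_ulna() is already in a sensible state without any further setup.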

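By the way, as a neat trick in modern C, you can use designated initializers in function calls too, by way of compound literals. (This does not carry over to C++ as written, unfortunately: standard C++ has no compound literals, and the designated initializers added in C++20 must follow declaration order.) So if you have a function that needs a gazillion different parameters, you can make it take a struct pointer instead, and the caller can use designated initialization to pass a subset of the parameters in arbitrary order: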

draw_circle(&(struct circle){.x=10.0f, .y=10.0f, .color = {.red=255},
     .radius=5.0f});

This is both more readable than having tons of parameters and easier to write for the caller.
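For reference, a self-contained version of this pattern might look like the sketch below. The struct layouts, the line_width field, and the last_drawn variable are all assumptions made for illustration; a real draw_circle would render something instead of just recording its argument.

```c
#include <stdint.h>

// Hypothetical parameter structs, assumed for this example only.
struct color {
    uint8_t red, green, blue, alpha;
};

struct circle {
    float x, y;
    float radius;
    struct color color;
    float line_width;
};

// Stand-in for real rendering: remembers the last circle that was passed in.
static struct circle last_drawn;

void draw_circle(const struct circle *c)
{
    last_drawn = *c;
}

// The caller names only the parameters it cares about, in any order.
// Everything else (green, blue, alpha, line_width) is zero.
void draw_red_circle(void)
{
    draw_circle(&(struct circle){.x = 10.0f, .y = 10.0f,
        .color = {.red = 255}, .radius = 5.0f});
}
```

Taking the struct by const pointer keeps the call cheap, and the compound literal lives until the end of the enclosing block, so the pointer is valid for the duration of the call.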

In enums, just as with the bones, I find it useful to reserve the 0 slot for a “nil”-value:

enum controller_type {
    CONTROLLER_TYPE_NONE,
    CONTROLLER_TYPE_MOUSE,
    CONTROLLER_TYPE_KEYBOARD,
};

Having a “nil” value for an enum can be useful in a lot of situations. For example, if the user asks for the type of a non-existing controller we can return CONTROLLER_TYPE_NONE instead of having to return a separate error value. Also, default initialized values of the enum type get a well-defined meaning rather than silently taking on whatever constant happens to come first in the enum (which would be CONTROLLER_TYPE_MOUSE if the example above had no NONE entry). All in all, it leads to fewer “special cases” and clearer, more straightforward code flow.
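A lookup function built on this idea could be sketched as follows. The controllers table, MAX_CONTROLLERS, and get_controller_type are hypothetical names introduced for the example.

```c
#include <stdint.h>

enum controller_type {
    CONTROLLER_TYPE_NONE,
    CONTROLLER_TYPE_MOUSE,
    CONTROLLER_TYPE_KEYBOARD,
};

#define MAX_CONTROLLERS 4

// Hypothetical controller table. Slots that are not listed are
// zero-initialized, which makes them CONTROLLER_TYPE_NONE.
static enum controller_type controllers[MAX_CONTROLLERS] = {
    [0] = CONTROLLER_TYPE_MOUSE,
    [1] = CONTROLLER_TYPE_KEYBOARD,
};

// Returns the type of the controller at index, or CONTROLLER_TYPE_NONE
// if there is no such controller. No separate error code is needed.
enum controller_type get_controller_type(uint32_t index)
{
    if (index >= MAX_CONTROLLERS)
        return CONTROLLER_TYPE_NONE;
    return controllers[index];
}
```

Because CONTROLLER_TYPE_NONE is 0, a caller can test for a missing controller with a plain if (!get_controller_type(i)).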

Another thing that I find very useful to avoid special cases is to “reserve” the zero-slot in arrays. For example, consider our system for keeping track of bones. To reserve the zero-slot we would do something like:

struct bone bones[MAX_BONES];
uint32_t num_bones;

// At initialization:
bones[num_bones++] = (struct bone){0};

Let’s break this down. bones[num_bones++] = is just the C way of doing std::vector::push_back(): we add an element at the end of the array and increase the count. And (struct bone){0} is the C way of creating a default (zero-initialized) item. So we create a default bone and put it at slot 0 in the array.

Here’s another way of doing the same thing, this time using a dynamic array instead of a statically sized one:

struct bone *bones = &(struct bone){0};
uint32_t num_bones = 1;
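Note that a pointer to a compound literal cannot be passed to realloc(), so an array that actually needs to grow has to start out on the heap. One possible sketch is shown below; struct bone_array and its functions are hypothetical helpers, not part of the original code.

```c
#include <stdint.h>
#include <stdlib.h>

// Hypothetical bone layout, assumed for this example only.
struct bone {
    const char *name;
    float length;
};

// Hypothetical growable array with slot 0 reserved for the dummy bone.
struct bone_array {
    struct bone *bones;
    uint32_t num_bones;
    uint32_t capacity;
};

// Allocates the array and places a zero-initialized dummy bone in slot 0.
// Returns 1 on success and 0 if the allocation failed.
int bone_array_init(struct bone_array *a)
{
    a->bones = malloc(sizeof(struct bone));
    a->num_bones = 0;
    a->capacity = 0;
    if (!a->bones)
        return 0;
    a->capacity = 1;
    a->bones[a->num_bones++] = (struct bone){0};
    return 1;
}

// Appends a bone and returns its index, which is never 0 on success.
// Returns 0 ("no bone") if the array is uninitialized or cannot grow.
uint32_t bone_array_push(struct bone_array *a, struct bone b)
{
    if (!a->bones)
        return 0;
    if (a->num_bones == a->capacity) {
        uint32_t new_capacity = a->capacity * 2;
        struct bone *grown = realloc(a->bones, new_capacity * sizeof(struct bone));
        if (!grown)
            return 0;
        a->bones = grown;
        a->capacity = new_capacity;
    }
    a->bones[a->num_bones] = b;
    return a->num_bones++;
}
```

A nice side effect is that the failure value of bone_array_push() is the same 0 that already means “no bone”, so callers do not need a separate error path.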

This means that we can use the 0 index to mean “no bone”, because it is occupied by this dummy bone. All the actual bones will have non-zero indices. But in addition, we can avoid treating this non-existing bone as a special case, because we actually have some default data in this slot. So calling bone_length(0) for example is perfectly legal — it will just return the default bone length of 0.0f.

This lets us get rid of a lot of special cases and if-branches in the code. For example, consider our code from the beginning:

uint32_t bi = find_bone("ulna");
if (bi != 0xfffffffU) {
    float f = bone_length(bi);
    ...
}

With this change, we can rewrite this as just:

float f = bone_length(find_bone("ulna"));

It will give us the length of the ulna if it exists, and 0.0f otherwise.
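Putting the pieces together, the whole scheme could be sketched like this. The struct layout, MAX_BONES value, init_bones, add_bone, and the bodies of find_bone and bone_length are assumptions for illustration, since only their intended behavior is described above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_BONES 64

// Hypothetical bone layout, assumed for this example only.
struct bone {
    const char *name;
    float length;
};

static struct bone bones[MAX_BONES];
static uint32_t num_bones;

// Reserves slot 0 for the zero-initialized dummy bone.
void init_bones(void)
{
    num_bones = 0;
    bones[num_bones++] = (struct bone){0};
}

// Appends a real bone and returns its (non-zero) index.
uint32_t add_bone(const char *name, float length)
{
    assert(num_bones > 0 && num_bones < MAX_BONES);
    bones[num_bones] = (struct bone){.name = name, .length = length};
    return num_bones++;
}

// Returns the index of the named bone, or 0 if there is no such bone.
// The search starts at 1 because slot 0 holds the dummy bone.
uint32_t find_bone(const char *name)
{
    for (uint32_t i = 1; i < num_bones; ++i) {
        if (strcmp(bones[i].name, name) == 0)
            return i;
    }
    return 0;
}

// Valid for every index, including 0: the dummy bone has length 0.0f.
// Out-of-range indices are mapped to the dummy bone as well.
float bone_length(uint32_t bi)
{
    return bones[bi < num_bones ? bi : 0].length;
}
```

After init_bones() and add_bone("ulna", 1.5f), the call bone_length(find_bone("ulna")) returns 1.5f, while bone_length(find_bone("femur")) returns 0.0f without any branch at the call site.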

This may seem like a very minor improvement to the code, but I don’t think it should be discounted. Small improvements applied all over the codebase will build up to a significant impact. For us programmers, complexity is the biggest enemy, and we should always strive to find tools that simplify our code.

by Niklas Gray