Enumerating values

Sometimes when programming you have a type which can take on only a small, finite number of values. For example, you might have noticed that in C (before C23), you need to #include <stdbool.h> to have access to the bool type and the true and false values.

However, suppose you wanted to define this type yourself. The syntax for this would look like:

typedef enum bool {
    false,
    true,
} bool_t;

In other words, we’re enumerating the values that bool can take on: false and true. Here, true and false are called bool’s “variants.”

It turns out that “enumerate” is exactly the right word: each of these constants is assigned an integer value, starting from 0 and counting up by 1. So this definition gives false = 0 and true = 1. Note that this makes the definition of an enum order-dependent: enum { true, false } would be an incorrect definition for bool, because it would make true equal to 0.
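To see the numbering and the order dependence concretely, here is a minimal sketch. The names my_bool, swapped_bool, and check_enum_values are made up for this illustration, chosen so that they do not clash with the real true and false from stdbool.h.

```c
#include <assert.h>

// Hypothetical enum: variants are numbered in declaration order.
typedef enum my_bool {
    MY_FALSE, // 0, because it comes first
    MY_TRUE   // 1, the previous value plus 1
} my_bool_t;

// The same two variants in the opposite order get swapped values.
typedef enum swapped_bool {
    SWAPPED_TRUE, // 0
    SWAPPED_FALSE // 1
} swapped_bool_t;

// Checks the values claimed in the comments above.
void check_enum_values(void) {
    assert(MY_FALSE == 0);
    assert(MY_TRUE == 1);
    assert(SWAPPED_TRUE == 0); // "true" is 0 here, which is wrong for a boolean
    assert(SWAPPED_FALSE == 1);
}
```

Calling check_enum_values() passes silently, which confirms that only the order of the variants decides their values.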

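An important thing to keep in mind with enum types, however, is that they are effectively just integers: the variants are int constants, and a variable of the enum type will happily hold other integer values too. Notably,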

bool_t x = 3;
printf("%d\n", x);

would print 3, even though 3 is not one of the enumerated values. (Note that this is not true of the real bool type in stdbool.h, which is defined using the _Bool keyword. The real bool can only hold 0 and 1; assigning it any nonzero value stores 1.)
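The difference between the two can be sketched side by side. The enum my_bool and the two helper functions below are hypothetical names used only for this comparison.

```c
#include <stdbool.h>

// Hypothetical hand-rolled boolean, named so it does not clash with <stdbool.h>.
typedef enum my_bool {
    MY_FALSE,
    MY_TRUE
} my_bool_t;

// Stores 3 in the enum and returns what was actually kept.
int stored_in_enum(void) {
    my_bool_t x = 3; // accepted: the enum is just an integer underneath
    return x;        // still 3
}

// Stores 3 in the real bool and returns what was actually kept.
int stored_in_bool(void) {
    bool b = 3; // any nonzero value is converted to 1
    return b;   // 1
}
```

Here stored_in_enum() returns 3 while stored_in_bool() returns 1.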

An enum, then, is effectively a bundle: an integer type under a new name plus a set of named constants for it. Even so, this bundling is good for communicating intent.

If you were implementing a chess engine, you might want to enumerate the chess pieces:

typedef enum piece_type {
    PIECE_NONE,
    PIECE_PAWN,
    PIECE_KNIGHT,
    PIECE_BISHOP,
    PIECE_ROOK,
    PIECE_QUEEN,
    PIECE_KING
} piece_type_t;

(By convention, enum variants, like constants, are written in SCREAMING_SNAKE_CASE. You generally also want a common prefix to show that they belong to the same collection and to prevent name collisions, since enum variants share one namespace with ordinary identifiers such as variables and functions.)

And perhaps you also want a nice way to refer to the colors:

typedef enum color {
    COLOR_WHITE,
    COLOR_BLACK
} color_t;

Assigning values

In chess, pieces are commonly assigned numerical “material values,” which are used to evaluate who has the advantage. To help with this, enums have a couple more tricks up their sleeve. Since they’re just integers, you can do arithmetic on them!

Furthermore, their variants can be given explicit values:

typedef enum piece_type {
    PIECE_NONE, // the first value is 0 if not given an explicit number
    PIECE_PAWN, // subsequent values increase by 1, so this is 1
    PIECE_KNIGHT = 3,
    PIECE_BISHOP = 3, // two variants are allowed to share a value
    PIECE_ROOK = 5,
    PIECE_QUEEN = 9,
    PIECE_KING, // similarly, this is 10 since it follows 9
} piece_type_t;

typedef enum color {
    COLOR_WHITE = 1,
    COLOR_BLACK = -1
} color_t;

If we represented the board as:

typedef struct piece {
    piece_type_t type;
    color_t color;
} piece_t;

piece_t board[64];

We could then tally up the total material as:

int32_t tally_material(piece_t board[64]) {
    int32_t tally = 0;
    for (size_t i = 0; i < 64; i++) {
        tally += board[i].type * board[i].color;
    }
    return tally;
}

In the above code snippet, we simply use type and color as integers.
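As a worked example of the tally, here is a minimal self-contained sketch that repeats the definitions above and evaluates a nearly empty board. The helper example_tally and the position it sets up are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum piece_type {
    PIECE_NONE,
    PIECE_PAWN,
    PIECE_KNIGHT = 3,
    PIECE_BISHOP = 3,
    PIECE_ROOK = 5,
    PIECE_QUEEN = 9,
    PIECE_KING
} piece_type_t;

typedef enum color {
    COLOR_WHITE = 1,
    COLOR_BLACK = -1
} color_t;

typedef struct piece {
    piece_type_t type;
    color_t color;
} piece_t;

// Same function as above: white material counts up, black counts down.
int32_t tally_material(piece_t board[64]) {
    int32_t tally = 0;
    for (size_t i = 0; i < 64; i++) {
        tally += board[i].type * board[i].color;
    }
    return tally;
}

// Hypothetical position: a white queen against a black rook and a black pawn.
int32_t example_tally(void) {
    piece_t board[64];
    for (size_t i = 0; i < 64; i++) {
        board[i] = (piece_t){PIECE_NONE, COLOR_WHITE}; // empty squares add 0
    }
    board[0] = (piece_t){PIECE_QUEEN, COLOR_WHITE};
    board[8] = (piece_t){PIECE_PAWN, COLOR_BLACK};
    board[63] = (piece_t){PIECE_ROOK, COLOR_BLACK};
    return tally_material(board); // 9 - 1 - 5 = 3, so white is ahead by 3
}
```

A positive result means white has more material, and a negative one means black does.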

Working with enums

Internally, we represent our pieces using integers, because computers are very efficient at working with integers. Humans, on the other hand, are much better at working with text. Suppose that, for debugging, we want to be able to print the name and color of a piece.

We might want to define functions which take in a color or piece type and give a string representation. It’s easy enough to write this for the color:

const char *color_name(color_t color) {
    if (color == COLOR_WHITE) {
        return "white";
    } else {
        return "black";
    }
}

However, if we start writing this for the type, we end up with something like:

const char *piece_type_name(piece_type_t type) {
    if (type == PIECE_NONE) {
        return "none";
    } else if (type == PIECE_PAWN) {
        return "pawn";
    } else if (type == PIECE_KNIGHT) {
        return "knight";
    } // ...and so on for each remaining piece type
}

This would work, but it’s verbose and clunky, and it feels like there should be a better way.

Enter, switch

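A switch-case statement is, in many ways, complementary to an enum. Using the original piece_type definition, in which every variant has a distinct value, we could write our function like so. (With the material values from the previous section, PIECE_KNIGHT and PIECE_BISHOP are both 3, and a switch with two case labels of the same value does not compile.)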

const char *piece_type_name(piece_type_t type) {
    switch (type) {
        case PIECE_NONE: {
            return "none";
        }
        case PIECE_PAWN: {
            return "pawn";
        }
        case PIECE_KNIGHT: {
            return "knight";
        }
        case PIECE_BISHOP: {
            return "bishop";
        }
        case PIECE_ROOK: {
            return "rook";
        }
        case PIECE_QUEEN: {
            return "queen";
        }
        case PIECE_KING: {
            return "king";
        }
        default: {
            assert(false && "Invalid piece type!\n");
        }
    }
}
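Putting the two name functions together gives the debugging output we were after. The following is a minimal sketch, assuming the original piece_type and color definitions (without explicit values). The helper describe_piece and its buffer-based interface are hypothetical choices made for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

typedef enum piece_type {
    PIECE_NONE,
    PIECE_PAWN,
    PIECE_KNIGHT,
    PIECE_BISHOP,
    PIECE_ROOK,
    PIECE_QUEEN,
    PIECE_KING
} piece_type_t;

typedef enum color {
    COLOR_WHITE,
    COLOR_BLACK
} color_t;

typedef struct piece {
    piece_type_t type;
    color_t color;
} piece_t;

const char *color_name(color_t color) {
    switch (color) {
        case COLOR_WHITE: return "white";
        case COLOR_BLACK: return "black";
        default: assert(false && "Invalid color!");
    }
    return "invalid"; // only reached when assertions are disabled
}

const char *piece_type_name(piece_type_t type) {
    switch (type) {
        case PIECE_NONE: return "none";
        case PIECE_PAWN: return "pawn";
        case PIECE_KNIGHT: return "knight";
        case PIECE_BISHOP: return "bishop";
        case PIECE_ROOK: return "rook";
        case PIECE_QUEEN: return "queen";
        case PIECE_KING: return "king";
        default: assert(false && "Invalid piece type!");
    }
    return "invalid"; // only reached when assertions are disabled
}

// Hypothetical helper: writes a description such as "white pawn" into buf
// and returns the length of the description.
int describe_piece(piece_t piece, char *buf, size_t size) {
    return snprintf(buf, size, "%s %s",
                    color_name(piece.color), piece_type_name(piece.type));
}
```

For a piece with type PIECE_KNIGHT and color COLOR_BLACK, describe_piece fills the buffer with "black knight".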

Switch statements take an integer-like value (char, int, an enum, any (u?)int[N]_t, etc.) as their argument (in this case, the argument is type). Inside the switch you write some number of “cases” using case [CASE]: where [CASE] is an integer-like constant. Case values must be compile-time constants: literals, enum variants, or expressions built from them. They cannot be variables, and no two cases in the same switch may have the same value. You cannot use a switch statement on strings, but you can use one on a single character, and cases can be character literals like case 'a':.

Finally, there is an optional default case which handles any value that doesn’t match one of the cases; think of it like an else branch in an if-else chain. When you are exhaustively matching on every variant of an enum, it is good practice to add a default case which crashes (e.g., with assert(false), since this will give you a line number if it’s ever hit) and potentially prints the value it received. This will help you debug in the event you mess something up.
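As a small illustration of character cases and a default, here is a sketch that goes the other way and turns a piece letter into a piece type. The function piece_type_from_char and its letter scheme (p, n, b, r, q, k) are assumptions made for this example. Because the input is arbitrary text rather than an enum we control, the default here returns PIECE_NONE instead of crashing.

```c
typedef enum piece_type {
    PIECE_NONE,
    PIECE_PAWN,
    PIECE_KNIGHT,
    PIECE_BISHOP,
    PIECE_ROOK,
    PIECE_QUEEN,
    PIECE_KING
} piece_type_t;

// Hypothetical helper: maps a lowercase piece letter to its piece type.
piece_type_t piece_type_from_char(char c) {
    switch (c) {
        case 'p': {
            return PIECE_PAWN;
        }
        case 'n': { // 'k' is taken by the king, so the knight uses 'n'
            return PIECE_KNIGHT;
        }
        case 'b': {
            return PIECE_BISHOP;
        }
        case 'r': {
            return PIECE_ROOK;
        }
        case 'q': {
            return PIECE_QUEEN;
        }
        case 'k': {
            return PIECE_KING;
        }
        default: { // any other character is not a piece
            return PIECE_NONE;
        }
    }
}
```

With this sketch, piece_type_from_char('q') gives PIECE_QUEEN and piece_type_from_char('x') gives PIECE_NONE.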

There is one VERY IMPORTANT PITFALL to keep in mind when using switch statements: when switch jumps to a case, the code just continues to run from there! In other words, the code:

void print_it_wrong(uint32_t x) {
    switch (x) {
        case 0: {
            printf("%d\n", 0);
        }
        case 1: {
            printf("%d\n", 1);
        }
        case 2: {
            printf("%d\n", 2);
        }
        case 3: {
            printf("%d\n", 3);
        }
        default: {
            printf("%s\n", "I don't know numbers that big!");
        }
    }
}

print_it_wrong(1);

will output:

1
2
3
I don't know numbers that big!

This behavior is called “fallthrough”. It can be useful, but usually it is undesirable. To avoid it, every case should be terminated by a break or return statement.

The correct way to write the function above is:

void print_it(uint32_t x) {
    switch (x) {
        case 0: {
            printf("%d\n", 0);
            break;
        }
        case 1: {
            printf("%d\n", 1);
            break;
        }
        case 2: {
            printf("%d\n", 2);
            break;
        }
        case 3: {
            printf("%d\n", 3);
            break;
        }
        default: {
            printf("%s\n", "I don't know numbers that big!");
        }
    }
}

Written this way, print_it(1) just outputs 1.

It’s worth noting that case labels do not strictly need to be followed by {}. Using braces helps avoid some surprises related to local variables and generally makes your code more readable, but there are situations where it makes sense to leave them out. The most common reason to omit them is intentional fallthrough:

void print_it_lazy(uint32_t x) {
    switch (x) {
        case 0:
        case 1:
        case 2: {
            printf("%s\n", "It's less than 3!");
            break;
        }
        case 3: {
            printf("%d\n", 3);
            break;
        }
        default: {
            printf("%s\n", "I don't know numbers that big!");
        }
    }
}

Here, since we didn’t put break after the 0 or 1 cases, each of them will fall through to the 2 case.

You may also choose to omit curly braces when your statements don’t involve any variable declarations.

void print_it_also_fine(uint32_t x) {
    switch (x) {
        case 0:
        case 1:
        case 2:
            printf("%s\n", "It's less than 3!");
            break;
        case 3:
            printf("%d\n", 3);
            break;
        default:
            printf("%s\n", "I don't know numbers that big!");
    }
}

However, you should be cautious. Consider the following code:

void broken(uint32_t x) {
    switch (x) {
        case 0:
        case 1:
        case 2:
            int y = x;
            printf("It's less than 3! It's %d.\n", y);
            break;
        case 3:
            printf("%d\n", 3);
            break;
        default:
            printf("%s\n", "I don't know numbers that big!");
    }
}

If you tried to compile it, you would get an error on the int y = x line (clang, for example, reports error: expected expression). This is because, before C23, a case label must be followed by a statement, and in C a declaration is not a statement. Wrapping the body of the case in curly braces solves this issue.
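For reference, here is a sketch of the same function with the declaration wrapped in curly braces. The name fixed is made up for this example. The braces form a block, which is a statement, so the label is once again followed by a statement.

```c
#include <stdint.h>
#include <stdio.h>

void fixed(uint32_t x) {
    switch (x) {
        case 0:
        case 1:
        case 2: {
            // The declaration now lives inside a block, so it compiles.
            int y = x;
            printf("It's less than 3! It's %d.\n", y);
            break;
        }
        case 3:
            printf("%d\n", 3);
            break;
        default:
            printf("%s\n", "I don't know numbers that big!");
    }
}
```

Calling fixed(1) prints "It's less than 3! It's 1." as intended.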

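Finally, GCC and Clang offer a shorthand for a sequence of consecutive cases. It is a compiler extension rather than part of standard C (as of C23), so other compilers may reject it: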

void print_it_range(uint32_t x) {
    switch (x) {
        case 0 ... 2:
            printf("%s\n", "It's less than 3!");
            break;
        case 3:
            printf("%d\n", 3);
            break;
        default:
            printf("%s\n", "I don't know numbers that big!");
    }
}

The syntax case n ... m: is equivalent to

case n:
case n + 1:
...
case m - 1:
case m:

It can also be used for characters (e.g., case 'a' ... 'z': would match any lowercase letter) or for enums. With enums, make sure that the variants in the range have consecutive numerical values; appearing consecutively in the definition is not enough if any of them were assigned explicit values.
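As a sketch of ranges over characters, the function below sorts a character into a few broad classes. The enum char_class and the function classify_char are hypothetical names for this example. It relies on the case range syntax being available in your compiler (GCC and Clang support it) and assumes an ASCII-compatible character set, where the letters and digits are each contiguous.

```c
// Hypothetical classification used only for this example.
typedef enum char_class {
    CHAR_LOWER,
    CHAR_UPPER,
    CHAR_DIGIT,
    CHAR_OTHER
} char_class_t;

// Uses case ranges; keep the spaces around the ... so it parses correctly.
char_class_t classify_char(char c) {
    switch (c) {
        case 'a' ... 'z':
            return CHAR_LOWER;
        case 'A' ... 'Z':
            return CHAR_UPPER;
        case '0' ... '9':
            return CHAR_DIGIT;
        default:
            return CHAR_OTHER;
    }
}
```

Each case returns immediately, so no break statements are needed here.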

Recap

An enum defines a new name for an integer type and a collection of named constants. Enums are useful when you want a type representing something which takes on a concrete list of values. You can explicitly assign values to enum variants or omit them, in which case a variant is 0 if it is the first one, or the previous variant plus 1 otherwise. Enum types should be named like types, in snake case and, for the purpose of this class, suffixed with _t when they are typedef’d. Enum variants should be named in SCREAMING_SNAKE_CASE like constants and share a common prefix. The syntax is:

typedef enum the_name {
    VARIANT_1, // = 0
    VARIANT_2 = 5,
    VARIANT_3 // = 6
} the_name_t;

A switch-case statement allows you to select a case based on the value of the switch’s argument (which must have an integer type), jumping to that code. Control flow then continues directly down from the case jumped into, falling through to the cases below it. To avoid this, as you usually should, place a break or return statement at the bottom of each case. Optionally, the switch statement has a default case which handles everything not matching a case, and it is good practice to use it to catch unexpected values. switch syntax is fairly convoluted, but the example below demonstrates most of the things you can do (the case 0 ... 2 range is a GCC and Clang extension):

void example(uint32_t x) {
    switch (x) {
        case 0 ... 2:
        case 3: {
            int y = x;
            printf("It's less than 4! It's %d.\n", y);
            break;
        }
        case 4:
            printf("%d\n", 4);
            break;
        default:
            printf("%s\n", "I don't know numbers that big!");
    }
}

It is good practice to use curly braces unless you have a reason not to, such as making an intentional fallthrough clearer (especially for empty cases like the one in the example above).