Long Division

Integer types with at least 64 bits have been a part of the C standard for a while now (they were added in C99, and were a standard extension in many 32-bit compilers before that). But have you ever wondered what exactly happens when you use them?

Consider the following function (substitute long long with __int64 if you are using an older version of Visual C++):

let’s first have a look at what the VC 64-bit compiler gives us:

Pretty much what you would expect, a little setup and an idiv instruction to perform the division. Now let’s try the VC 32-bit compiler:

A little setup and .. a call?

The thing is, 32-bit x86 only has instructions for mixed mode 32/64 bit operations — with one assembly instruction you can multiply two 32-bit integers to get a 64-bit result, or you can divide a 64-bit integer by a 32-bit as long as the quotient fits in 32 bits. But to divide two 64-bit integers, you have to emulate the operation in software.

That is where the _alldiv function comes in. It is a support function in the C runtime library that performs a 64-bit divide. In fact there is a whole family of such functions to do multiply, divide and shifts on 64-bit numbers (addition and subtraction are inlined since they only require two assembly instructions).

Other compilers do likewise, for instance GCC calls __divdi3 to perform 64-bit division.

It is important to be aware of this, because using long long instead of int means that arithmetic operations you probably expect to be single instructions, all of a sudden become library calls who’s execution time can depend on the arguments.

As an example, I timed a simple implementation of the extended Euclidean algorithm with 32- and 64-bit integers. Compiling with the VC 32-bit compiler, the 64-bit version was roughly 3 times slower (depending on the CPU).

2 thoughts on “Long Division”

  1. Thank you, I know the updates here are rather infrequent, but don’t worry I will post whenever I find something I think is interesting enough.

    Btw, I found a quote by Jan Gray that nicely underlines my comment about the programmer expecting arithmetic operations to map to single instructions:

    It was easier in the good old days. Good C programmers knew. Each operator and operation in C, be it assignment, integer or floating-point math, dereference, or function call, mapped more or less one-to-one to a single primitive machine operation. True, sometimes several machine instructions were required to put the right operands in the right registers, and sometimes a single instruction could capture several C operations (famously *dest++ = *src++;), but you could usually write (or read) a line of C code and know where the time was going. For both code and data, the C compiler was WYWIWYG—”what you write is what you get”. (The exception was, and is, function calls. If you don’t know what the function costs, you don’t know diddly.)

Leave a Reply