INT_MIN

While adding a few header files to WCRT (a small C runtime library for Visual C++), I stumbled upon something that caught my interest.

INT_MIN in <limits.h> is a macro that expands to the minimum value for an object of type int. In the 32-bit C compilers I have installed at the moment, it is defined as:

#define INT_MIN     (-2147483647 - 1)

So what exactly is wrong with the integer constant -2147483648?

Well, firstly it is not an integer constant. Let’s see what the standard says:

“An integer constant begins with a digit, but has no period or exponent part. It may have a prefix that specifies its base and a suffix that specifies its type.”

You will notice there is no mention of a sign. So -2147483648 is in fact a constant expression, consisting of the unary minus operator, and the integer constant 2147483648.

This still does not explain why that expression is not used directly in the macro. To see that, we have to revisit the rules for the type of integer constants.

The type of an unsuffixed decimal integer constant is the first of these in which its value can be represented:

C89 : int, long int, unsigned long int
C99 : int, long int, long long int
C++ : int, long int, long long int

The problem is that 2147483648 cannot be represented in a signed 32-bit integer, so when int and long int are both 32 bits wide it becomes either an unsigned long int (C89) or a long long int (C99 and C++).

So we have to resort to a little trickery and compute -2147483648 as (-2147483647 - 1). Both constants fit nicely into a 32-bit signed integer, so INT_MIN gets the right type and value.

If you happen to look up INT_MIN in the standard you will see:

minimum value for an object of type int

INT_MIN                 -32767

Which raises the question: why isn’t it (-32767 - 1)?

Pretty much any computer available today uses two’s complement to represent signed numbers, but this hasn’t always been the case.

Since C was designed to work efficiently on a variety of architectures, the standard’s limits allow for using other representations as well.

I will end this post with a little (not quite standard conformant) example. Try compiling it with your favorite C compiler, and let us know if something puzzles you.

#include <stdio.h>
#include <limits.h>
#include <float.h>
 
int main(void)
{
	if (-2147483648 > 0)     printf("positive\n");
	if (-2147483647 - 1 < 0) printf("negative\n");
	if (INT_MIN == -INT_MIN) printf("equal\n");
	if (FLT_MIN > 0)         printf("floating\n");
 
	return 0;
}
Posted in Programming

System Up Time

A while ago I set out to write a little tool that would show the time a system had been running since the last reboot. It seemed like something that should be fairly easy to do, but as it turns out, it isn’t entirely straightforward.

The first thing that came to my mind was the GetTickCount function. It returns the number of milliseconds since the system was started, which fits nicely. There is one problem, of course: the value returned is a DWORD, a 32-bit unsigned value, and that limits the time it can handle to about 50 days.

Microsoft realized this as well, and added GetTickCount64, which returns a 64-bit value instead. Unfortunately it is only available on Windows Vista and later.

Looking more closely at the documentation for GetTickCount we see:

“To obtain the time elapsed since the computer was started, retrieve the System Up Time counter in the performance data in the registry key HKEY_PERFORMANCE_DATA. The value returned is an 8-byte value. For more information, see Performance Counters.”

The performance counters are a nice idea. Basically they provide a homogeneous interface to a multitude of counters that give information about how well the operating system or an application, service, or driver is performing.

The recommended way to access the data is through the PDH interface. You access the counters by specifying a counter path: a string that describes the computer, object, counter, and instance you are interested in.

Looking through the list of counters by objects for the ‘System’ object, we find the ‘System Up Time’ counter, which is exactly what we need.

So I wrote a test application to open a query, add the counter “\System\System Up Time”, collect the data, and display it. It failed. Apparently my computer did not have a ‘System Up Time’ counter.

Reading over the documentation for the PDH interface again did not help, but after some searching I ended up at this help article:

“Performance Data Helper (PDH) APIs use object and counter names that are in the localized language. Therefore, applications that use PDH APIs should always use the localized string for the object or counter name specification.”

Since I was running a Danish version of Windows, that explained the problem!

Following the steps outlined there, I found that the ‘System’ object has index 2, and the ‘System Up Time’ counter has index 674. With these indices in hand, you can then call the PdhLookupPerfNameByIndex function to get the localized names. Using the localized path “\System\Computerperiode uden afbrydelser” gave the desired result.

The choice to make the paths use localized names makes it somewhat more involved to use these functions. Also, this should have been described much more clearly in the PDH documentation, since it is quite possible for a developer using English Windows to read over the documentation like I did, and use a hardcoded path for a counter. This will work nicely while testing, and then fail if someone with a localized Windows uses it.

As an example, let’s take a look at the PsInfo tool. It is written by Mark Russinovich, one of the people behind Sysinternals.com, a site that specializes in advanced system utilities and technical information. He is also a coauthor of the Windows Internals book, describing the inner workings of Windows operating systems.

Running PsInfo on my system I get:

 PsInfo v1.75 - Local and remote system information viewer
 Copyright (C) 2001-2007 Mark Russinovich
 Sysinternals - www.sysinternals.com

 System information for \\removed:
 Uptime:                    Error reading uptime
 ...

Could it be? Let’s have a little peek inside PsInfo.exe:

perform_query:
    push    esi
    lea     ecx, [esp+834h+szCounterPath]
    push    offset aSUT ; "\\\\%s\\System\\System Up Time"
    push    ecx
    call    sprintf?
    mov     ecx, [esp+83Ch+hQuery]
    add     esp, 0Ch
    lea     edx, [esp+830h+phCounter]
    push    edx
    push    0
    lea     eax, [esp+838h+szCounterPath]
    push    eax
    push    ecx
    call    PdhAddCounterW

Indeed, a hardcoded path to the counter using the English names.

Microsoft must have realized it could be a problem as well, because I found the function PdhAddEnglishCounter, added in Vista, which made me smile.

Posted in Programming

Padding Trouble

When Intel expanded the 8086 architecture to 32-bit in 1985, they extended the 16-bit registers present to 32-bit registers. ax became eax, but it was still possible to use the low 16 bits of eax as ax just like before. Their choice was that performing operations on the low 16 bits did not change the high 16 bits of the register.

AMD expanded the 32-bit architecture to 64-bit in 2003. This was again a superset of the original, making it backwards compatible. They extended the 32-bit registers to 64-bit, and eax became rax. Again it was possible to perform operations on the low 32 bits, but doing so clears the high 32 bits of the register.

“Operations that output to a 32-bit subregister are automatically zero-extended to the entire 64-bit register. Operations that output to 8-bit or 16-bit subregisters are not zero-extended (this is compatible x86 behavior).”

Now both choices work as far as backwards compatibility goes, and as long as we as programmers are aware of what happens, neither is a problem.

When building the aPLib compression library, I use Visual C++ to generate assembly listings, which I then perform some changes on with a perl script, before assembling the object files. While working on the recently released 64-bit version, I ran into a problem — the debug build of the library worked fine, but the release build did not.

Bugs like this are often caused by some improper memory usage, so I spent a day trying to track down the problem without much luck. Somehow the contents of a register were corrupted.

Looking through the code in HIEW I finally found the cause: a seemingly random instruction that wrote to the 32-bit part of a register, thereby clearing the high 32 bits. Then it dawned on me.

Visual C++ emits padding macros into assembly listings to align code and improve performance. These macros, npad, are defined in a file called listing.inc which resides in the Visual C++ include folder. But there is no 64-bit version of this file!

Let’s have a look then:

;; LISTING.INC
 
;; non destructive nops
npad macro size
if size eq 1
  nop
else
 if size eq 2
   mov edi, edi
 else
   ...

And there we have it. An instruction like mov edi, edi is safe to use as padding in 32-bit code, because moving the register to itself has no effect. But if you insert it in 64-bit code, it all of a sudden has an effect — the high 32 bits of rdi are cleared.

I have reported the problem to Microsoft and they say it will be addressed in a future release.

Posted in Bugs

Still Hard to C

The original Hard to C blog started life about 4 years ago on Blogger. It was meant as a place where I could post about the experiences I had while coding on my projects (mainly in C/C++).

Unfortunately it didn’t get updated due to a number of circumstances, although I have had plenty of moments over the past years where it could have been useful for letting off steam about some compiler bug or C/C++ standard issue hitting me by surprise.

Recently I have been updating some code, which inevitably led to some new material, so I wanted to revive Hard to C.

I talked with Mouser and Gothi[c] at DonationCoder.com, who graciously agreed to host the blog at this new location. I would like to take the opportunity to express my gratitude for their help.

I added the posts from the original blog here, and I hope some new ones will start to appear soon (when I’m done fiddling with WordPress that is hehe).

Posted in Uncategorized

What Else Could Go Wrong?

JrDebugLogger is a very nice debug logging library. Much of its functionality is implemented through macros to allow it to be selectively left out when compiling. Along the way the author has had some interesting problems to solve, and this post is about one of them.

Assume we use the following macro:

#define DEBUGOUT if (debug_on) debug_stream

to allow us to perform debug logging with a stream-like interface. We can then do:

DEBUGOUT << "hello";

which expands to:

if (debug_on) debug_stream << "hello";

Now if the compiler knows that debug_on is false, it can leave out all code related to the debug logging, since it knows it will never be called. If it does not know the value at compile time, the resulting code will contain a very fast check around the call, allowing debug logging to be turned on and off dynamically with little performance overhead.

There is, however, an insidious bug lurking in the corner, waiting to jump at the user. Can you spot the problem? Think about it for a minute or two before reading on.

Consider this use:

if (i > limit)
    DEBUGOUT << "i too big";
else
    do_computation(i);

it expands to:

if (i > limit)
    if (debug_on) debug_stream << "i too big";
else
    do_computation(i);

This is valid C++, and compiled without warnings on the three compilers I tried. But who does that else belong to?

Let’s see what the standard says:

“An else is associated with the lexically nearest preceding if that is allowed by the syntax.”

This is from the C99 standard (6.8.4.1p3), which has the clearest wording; statements to the same effect are present in the C++ standards.

So the above is equivalent to:

if (i > limit)
{
    if (debug_on)
        debug_stream << "i too big";
    else
        do_computation(i);
}

which was of course not the intention.

So how can we solve this without giving up the nice properties of the if? The simple solution is to give the if in the macro its own else:

#define DEBUGOUT if (!debug_on) ; else debug_stream

We now get the expansion:

if (i > limit)
    if (!debug_on) ; else debug_stream << "i too big";
else
    do_computation(i);

and the compiler will correctly associate the user’s else with the user’s if. So it is equivalent to:

if (i > limit)
{
    if (!debug_on)
        ;
    else
        debug_stream << "i too big";
} else {
    do_computation(i);
}

Thanks to Jesse for the nice topic.

Posted in Programming

Ternary Trickery

There is a good article about BOOST_FOREACH by Eric Niebler on The C++ Source.

It contains some interesting trickery involving the ternary conditional operator (?:).

Posted in Programming

Loophole in Visual C++, Part 2

Here is a slightly more elaborate example:

#include <stdio.h>
#include <limits.h>
 
unsigned int ratio(unsigned int x, unsigned int y)
{
    if (x <= UINT_MAX / 100) x *= 100; else y /= 100;
 
    if (y == 0) y = 1;
 
    return x / y;
}
 
int main(void)
{
    unsigned int count;
 
    for (count = 0x3fffffff; count != 0; ++count)
    {
        /* do something */
 
        /* show progress */
        printf("\r%u%% done", ratio(count, UINT_MAX));
    }
 
    return 0;
}

This program goes through the entire range of the unsigned int type, performing some action for each. It shows the progress by calling a function to compute the ratio of count to the maximum possible value. Again, count is incremented in each step, and hence will reach the value zero at some point.

The program works as expected on the compilers I tried, except for cl.exe from VC7 and VC71 with the /O2 switch, which stop at 25%. In case you wondered about the starting point of 0x3fffffff, that’s the reason: no need to watch your machine chew its way through all integers up to 25%.

Looking at the code generated for the loop:

$L873:
        ...
        inc    esi
        add    edi, 100                ; 00000064H
        jne    short $L873

We see that it fails because the two instructions before the conditional jump have been reversed. Again it looks like the optimizer fails to recognize the importance of the increment to the loop.

Posted in Bugs

Additional Trouble

2 plus 2 is 4, but does that generalize?

What is your immediate reaction to this little program?

#include <stdio.h>
 
int main(void)
{
    if (20000 + 20000 == 40000) printf("HardToC");
 
    return 0;
}

If it was something along the lines of ‘depends’ then you’re either a reader of the standard, or you’ve just been around C/C++ for too long like me.

The type of an unsuffixed decimal integer constant is the first type from a list in which its value can be represented:

C89 - int, long int, unsigned long int
C99 - int, long int, long long int
C++ - int, long int

Now, the problem with the little program above is that if the int type is 16-bit, then 20000 + 20000 results in an overflow because the maximum value of a 16-bit int is 32767. We are guaranteed that computations involving unsigned operands cannot overflow, but there is no such guarantee for signed operands. So the addition may leave us in the land of undefined behaviour.

I compiled the above example with three DOS 16-bit compilers: Borland, Open Watcom and Digital Mars. None of the programs gave any output when run. Borland warned about the overflow, Open Watcom warned at -w2, Digital Mars did not warn.

What happens is that in the x86 two’s complement representation, 20000 + 20000 overflows and becomes -25536, which is not equal to 40000.

Writing portable, standard compliant C/C++ is not always easy ... and it can be Hard to C the problems.

Posted in Programming

Loophole in Visual C++, Part 1

Let’s start this post by recalling what the gosp^H^H^H^Hstandard has to say about unsigned arithmetic:

“A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting unsigned integer type.”

This is from the C89 draft (3.1.2.5p5); statements to the same effect are present in the C99 standard (6.2.5p9) and the C++ standards.

Now consider the following program:

#include <stdio.h>
 
int main(void)
{
    unsigned int count = 0;
 
    do {
        printf("%u\n", count);
        count += 1;
    } while (count != 0);
 
    return 0;
}

Since count starts at zero and is incremented each time through the loop, the standard tells us it will wrap to zero when it reaches a result that cannot be represented by an unsigned int, making the program terminate. Compiling the program with various compilers gives the expected stream of increasing numbers.

However, if you compile it with cl.exe from Visual C++ using the /O2 switch (maximize speed) you get a somewhat surprising result: a single zero and the program exits. This goes for VC6, VC7 and VC71.

If you initialize count to one instead, the program works fine. So it looks like the optimizer fails to recognize the addition as changing the value of count, and thus optimizes away the loop.

I have not tested the various VC8 betas, so if you have any of them installed, feel free to try it out and post your results (just remember to compile from the command-line using cl.exe and /O2).

Posted in Bugs

Herb Sutter on Visual C++

Channel 9 has a great interview with Herb Sutter (part 1, part 2).

I think he has some very sound arguments about programming and programmers, which are as interesting as the information about the future of Visual C++.

Posted in Programming