Category Archives: Programming

Time Difference

This post is based on a discussion about Progress Bars of Life, where I was foolish enough to claim that printing a text string representing the difference between two times could not be that hard in C. It is not hard, but turned out not to be entirely trivial either.

Pocket Watch

The problem we will consider is; given two tm structs, compute the difference in time between them, in such a way that we can easily format a string that gives a textual representation of it. We want years, months, days, hours, minutes, seconds.

The first idea you might get is to use difftime() to get the difference in seconds between the two times, and then compute the quantities we want by simple arithmetic. So,

You often see something like this in timing code — it works great for showing elapsed time in seconds, minutes, even hours. Do you see any problems with this approach?
Continue reading

Long Division, Part 3

I ended part two of this series with an open question:

And of course I can’t help but wonder if the CLR is compiled with Visual C++, so doing arithmetic on 64-bit numbers in C# and other .NET languages ends up at the same runtime functions?

I don’t have a lot of experience in debugging the CLR myself, so I asked Brian Rasmussen if he might be interested in taking a look at it. He was kind enough to take the time to point me in the right direction.

A little digging showed that the CLR does in fact call some of these functions from the C runtime, but with a twist.
Continue reading

Move to Parent

This post is inspired by a discussion regarding a coding snack done by lanux128 on DonationCoder. A user requested a tool to add a context menu entry that would copy selected files and folders to the parent folder.

Moving folders is one of those tasks that appear trivial. But as always, the devil is in the details.

To keep it simple, we will assume regular files and folders in a filesystem where paths are unique. The problem we will look at is: Given absolute paths Src and Dst (where Dst is not equal to, or a subfolder of Src), move the contents of Src to Dst.

The first solution that comes to mind is to move files and folders recursively. In pseudocode it would be something along the lines of:

This works — well, almost.
Continue reading

Long Division, Part 2

In part one I talked about the support functions in the C standard libraries of various x86 32-bit compilers that perform arithmetic operations when you use 64-bit integers in your code.

While updating WCRT to work with the latest Visual C++ compilers, I was writing my own implementations of these functions, and naturally I tested them against the versions supplied in the VC CRT to verify they worked.

To my surprise, I found the GCD test I wrote for Long Division ran faster when compiled with WCRT.

This naturally piqued my curiosity.
Continue reading

Long Division

Integer types with at least 64 bits have been a part of the C standard for a while now (they were added in C99, and were a standard extension in many 32-bit compilers before that). But have you ever wondered what exactly happens when you use them?

Consider the following function (substitute long long with __int64 if you are using an older version of Visual C++):

let’s first have a look at what the VC 64-bit compiler gives us:

Pretty much what you would expect, a little setup and an idiv instruction to perform the division. Now let’s try the VC 32-bit compiler:

A little setup and .. a call?
Continue reading

INT_MIN

While adding a few header files to WCRT (a small C runtime library for Visual C++), I stumbled upon something that caught my interest.

INT_MIN in <limits.h> is a macro that expands to the minimum value for an object of type int. In the 32-bit C compilers I have installed at the moment, it is defined as:

So what exactly is wrong with the integer constant -2147483648 ?

Well, firstly it is not an integer constant. Let’s see what the standard says:

“An integer constant begins with a digit, but has no period or exponent part. It may have a prefix that specifies its base and a suffix that specifies its type.”

You will notice there is no mention of a sign. So -2147483648 is in fact a constant expression, consisting of the unary minus operator, and the integer constant 2147483648.

This still does not explain why that expression is not used directly in the macro. To see that, we have to revisit the rules for the type of integer constants.

The type of an unsuffixed integer constant is the first of these in which its value can be represented:

C89 : int, long int, unsigned long int
C99 : int, long int, long long int
C++ : int, long int, long long int

The problem is that 2147483648 cannot be represented in a signed 32-bit integer, so it becomes either an unsigned long int or a long long int.

So we have to resort to a little trickery, and compute -2147483648 as (-2147483647 – 1), which all fit nicely into 32-bit signed integers, and INT_MIN gets the right type and value.

If you happen to look up INT_MIN in the standard you will see:

minimum value for an object of type int

Which brings up the question why isn’t it (-32767 – 1)?

Pretty much any computer available today uses two’s complement to represent signed numbers, but this hasn’t always been the case.

Since C was designed to work efficiently on a variety of architectures, the standard’s limits allow for using other representations as well.

I will end this post with a little (not quite standard conformant) example. Try compiling it with your favorite C compiler, and let us know if something puzzles you.

System Up Time

A while ago I set out to write a little tool that would show the time a system had been running since the last reboot. It seemed like something that should be fairly easy to do, but as it turns out, it isn’t entirely straightforward.

The first thing that came to my mind was the GetTickCount function. It returns the number of milliseconds since the system was started, which fits nicely. There is one problem of course, the value returned is a DWORD, and that limits the time it can handle to about 50 days.

Microsoft realized this as well, and added GetTickCount64, which returns a 64-bit value instead. Unfortunately it only works on Vista+.

Looking more closely at the documentation for GetTickCount we see:

“To obtain the time elapsed since the computer was started, retrieve the System Up Time counter in the performance data in the registry key HKEY_PERFORMANCE_DATA. The value returned is an 8-byte value. For more information, see Performance Counters.”

The performance counters are a nice idea. Basically they provide a homogeneous interface to a multitude of counters that give information about how well the operating system or an application, service, or driver is performing.

The recommended way to access the data is through the PDH interface. You access the counters by specifying a counter path; a string that describes the computer, object, counter, and instance you are interested in.

Looking through the list of counters by objects for the ‘System’ object, we find the ‘System Up Time’ counter, which is exactly what we need.

So I wrote a test application to open a query, add the counter “\System\System Up Time”, collect the data, and display it. It failed. Apparently my computer did not have a ‘System Up Time’ counter.

Reading over the documentation for the PDH interface again did not help, but after some searching I ended up at this help article:

“Performance Data Helper (PDH) APIs use object and counter names that are in the localized language. Therefore, applications that use PDH APIs should always use the localized string for the object or counter name specification.”

Since I was running a Danish version of Windows, that explained the problem!

Following the steps outlined there, I found that the ‘System’ object has index 2, and the ‘System Up Time’ counter has index 674. With these indices in hand, you can then call the PdhLookupPerfNameByIndex function to get the localized names. Using the localized path “\System\Computerperiode uden afbrydelser” gave the desired result.

The choice to make the paths use localized names makes it somewhat more involved to use these functions. Also, this should have been described much more clearly in the PDH documentation, since it is quite possible for a developer using English Windows to read over the documentation like I did, and use a hardcoded path for a counter. This will work nicely while testing, and then fail if someone with a localized Windows uses it.

As an example, let’s take a look at the PsInfo tool. It is written by Mark Russinovich, one of the people behind Sysinternals.com, a site that specializes in advanced system utilities and technical information. He is also a coauthor of the Windows Internals book, describing the inner workings of Windows operation systems.

Running PsInfo on my system I get:

Could it be? let’s have a little peek inside PsInfo.exe:

Indeed, a hardcoded path to the counter using the English names.

Microsoft must have realized it could be a problem as well, because I found the function PdhAddEnglishCounter, added in Vista, which made me smile.

What Else Could Go Wrong?

JrDebugLogger is a very nice debug logging library. Much of it’s functionality is implemented through macros to allow it to be selectively left out when compiling. Along the way the author has had some interesting problems to solve, and this post is about one of them.

Assume we use the following macro:

to allow us to perform debug logging with a stream-like interface. We can then do:

which expands to:

Now if the compiler knows that debug_on is false, it can leave out all code related to the debug logging, since it knows it will never be called. If it does not know the value at compile time, the resulting code will contain a very fast check around the call, allowing debug logging to be turned on and off dynamically with little performance overhead.

There is, however, an insidious bug lurking in the corner, waiting to jump at the user. Can you spot the problem? Think about it for a minute or two before reading on.

Consider this use:

it expands to:

This is valid C++, and compiled without warnings on the three compilers I tried. But who does that else belong to?

Let’s see what the standard says:

“An else is associated with the lexically nearest preceding if that is allowed by the syntax.”

This is from the C99 standard (6.8.4.1p3), which was the most clear, however statements to the same effect are present in the C++ standards.

So the above is equivalent to:

which was of course not the intention.

So how can we solve this without giving up the nice properties of the if? The simple solution is to give the if in the macro its own else:

We now get the expansion:

and the compiler will correctly associate the users else with his if. So it is equivalent to:

Thanks to Jesse for the nice topic.

Additional Trouble

2 plus 2 is 4, but does that generalize?

What is your immediate reaction to this little program?

If it was something along the lines of ‘depends’ then you’re either a raider of the standard, or you’ve just been around C/C++ for too long like me.

The type of an unsuffixed decimal integer constant is the first type from a list in which its value can be represented:

C89 - int, long int, unsigned long int
C99 - int, long int, long long int
C++ - int, long int

Now, the problem with the little program above is that if the int type is 16-bit, then 20000 + 20000 results in an overflow because the maximum value of a 16-bit int is 32767. We are guaranteed that computations involving unsigned operands cannot overflow, but there is no such guarantee for signed operands. So the addition may leave us in the land of undefined behaviour.

I compiled the above example with three DOS 16-bit compilers; Borland, Open Watcom and Digital Mars. None of the programs gave any output when run. Borland warned about the overflow, Open Watcom warned at -w2, Digital Mars did not warn.

What happens is that in the x86 two’s complement representation, 20000 + 20000 overflows and becomes -25536, which is not equal to 40000.

Writing portable, standard compliant C/C++ is not always easy .. and it can be Hard to C the problems.