Raymond Chen recently blogged about the way CommandLineToArgvW treats quotes and backslashes. Parsing the command line into argv is something I have had to fight with as well, so besides pointing to Raymond’s excellent post, I wanted to add a few comments of my own here.
The topic here is how command-line arguments containing spaces and quotes are handled. Part of the problem comes from the fact that DOS/Windows uses the backslash as the separator in paths. On systems like Unix, where the forward slash is used instead, using the backslash to escape special characters is less of a problem. But if you have ever put a Windows path in a C string literal, you may have run into leaning toothpick syndrome (LTS), the situation where a string becomes unreadable because of all the escape characters.
Microsoft addressed this readability problem in C# with verbatim string literals. C# verbatim strings also use a simpler method of escaping a quote inside a quoted string: doubling it. The same method is used in languages like Pascal and BASIC, and it is what Raymond’s second hypothetical set of rules suggests.
The compromise we get for parsing command-line arguments in the C runtime library (and CommandLineToArgvW) is documented on MSDN. What the MSDN documentation does not tell you is that there is a second mechanism for inserting a literal quote in a quoted string, or at least there might be, depending on which version of the C runtime library you are using.
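To make the documented rules concrete, here is a minimal sketch in C. It is not the CRT's code: next_arg is a hypothetical name, the sketch covers only the documented behaviour (2n backslashes before a quote give n backslashes and the quote acts as a delimiter; 2n+1 backslashes before a quote give n backslashes and a literal quote; backslashes not followed by a quote are literal), and it ignores both the undocumented second mechanism and any special treatment of the program name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, for illustration only.
   Copies the next argument from *cursor into out (capacity out_size,
   silently truncating) and advances *cursor past it.
   Returns 1 if an argument was read, 0 at the end of the input. */
int next_arg(const char **cursor, char *out, size_t out_size)
{
    const char *p = *cursor;
    size_t n = 0;
    int in_quotes = 0;

    assert(out_size > 0);

    /* Skip the whitespace that separates arguments. */
    while (*p == ' ' || *p == '\t')
        p++;
    if (*p == '\0') {
        *cursor = p;
        return 0;
    }

    while (*p != '\0' && (in_quotes || (*p != ' ' && *p != '\t'))) {
        if (*p == '\\') {
            size_t slashes = strspn(p, "\\");  /* length of the backslash run */
            size_t i;
            p += slashes;
            if (*p == '"') {
                /* Run followed by a quote: emit one backslash per pair. */
                for (i = 0; i < slashes / 2 && n + 1 < out_size; i++)
                    out[n++] = '\\';
                if (slashes % 2 == 1) {
                    /* Odd count: the quote is literal. */
                    if (n + 1 < out_size)
                        out[n++] = '"';
                    p++;
                }
                /* Even count: the quote is left in place and toggles
                   quoting on the next loop iteration. */
            } else {
                /* Run not followed by a quote: all backslashes are literal. */
                for (i = 0; i < slashes && n + 1 < out_size; i++)
                    out[n++] = '\\';
            }
        } else if (*p == '"') {
            in_quotes = !in_quotes;  /* delimiter, not copied */
            p++;
        } else {
            if (n + 1 < out_size)
                out[n++] = *p;
            p++;
        }
    }

    out[n] = '\0';
    *cursor = p;
    return 1;
}
```

Calling next_arg repeatedly on the command line "a b" c\d "e \"f\"" yields three arguments: a b, then c\d, then e "f". The Windows path in the middle survives untouched because its backslash is not followed by a quote, which is the whole point of the compromise.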
In part one I talked about the support functions in the C standard libraries of various x86 32-bit compilers that perform arithmetic operations when you use 64-bit integers in your code.
While updating WCRT to work with the latest Visual C++ compilers, I was writing my own implementations of these functions, and naturally I tested them against the versions supplied in the VC CRT to verify they worked.
To my surprise, I found the GCD test I wrote for Long Division ran faster when compiled with WCRT.
32-bit GCD: 314
64-bit GCD: 851
32-bit GCD: 300
64-bit GCD: 835
32-bit GCD: 300
64-bit GCD: 656
This naturally piqued my curiosity.
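The test program itself is not reproduced here, so the following is only a sketch of the kind of GCD routine such a benchmark might use. The names gcd32 and gcd64 are assumptions for illustration. The interesting part is the % operator: in the 32-bit version it maps directly onto a hardware divide, while in the 64-bit version on a 32-bit x86 target the compiler has to call a runtime support function, which is exactly the code being compared.

```c
#include <stdint.h>

/* Euclid's algorithm on 32-bit values (hypothetical test routine). */
uint32_t gcd32(uint32_t a, uint32_t b)
{
    while (b != 0) {
        uint32_t r = a % b;  /* native 32-bit remainder */
        a = b;
        b = r;
    }
    return a;
}

/* The same algorithm on 64-bit values. On a 32-bit target each %
   here becomes a call into the runtime library. */
uint64_t gcd64(uint64_t a, uint64_t b)
{
    while (b != 0) {
        uint64_t r = a % b;
        a = b;
        b = r;
    }
    return a;
}
```

Because the loop does almost nothing except take remainders, timing it is a reasonable way to compare two implementations of the support functions.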
Integer types with at least 64 bits have been a part of the C standard for a while now (they were added in C99, and were a standard extension in many 32-bit compilers before that). But have you ever wondered what exactly happens when you use them?
Consider the following function (substitute long long with __int64 if you are using an older version of Visual C++):
long long div64(long long x, long long y)
{
    return x / y;
}
Let’s first have a look at what the VC 64-bit compiler gives us:
mov r8, rdx
mov rax, rcx
Pretty much what you would expect, a little setup and an idiv instruction to perform the division. Now let’s try the VC 32-bit compiler:
_x$ = 8
_y$ = 16
mov eax, _y$[esp]
mov ecx, _y$[esp-4]
mov edx, _x$[esp]
mov eax, _x$[esp]
A little setup and... a call?
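The call is there because a 32-bit x86 processor has no instruction that divides one 64-bit value by another, so the compiler hands the job to a function in its runtime library. As a rough idea of what such a function has to accomplish, here is a sketch in C using plain binary long division. The names udiv64_sketch and sdiv64_sketch are hypothetical, and this is not how any particular compiler's helper is written; real helpers are usually hand-tuned assembly and much faster.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an unsigned 64-bit division helper.
   Uses only shifts, compares and subtractions, each of which a 32-bit
   compiler can expand inline into a pair of 32-bit operations.
   Stores the remainder through rem if rem is not NULL. */
uint64_t udiv64_sketch(uint64_t n, uint64_t d, uint64_t *rem)
{
    uint64_t q = 0;
    uint64_t r = 0;
    int i;

    assert(d != 0);  /* division by zero is the caller's problem */

    for (i = 63; i >= 0; i--) {
        /* Bring down the next bit of the dividend. r never exceeds the
           bits consumed so far, so this shift cannot overflow. */
        r = (r << 1) | ((n >> i) & 1);
        if (r >= d) {
            r -= d;
            q |= (uint64_t)1 << i;
        }
    }
    if (rem != NULL)
        *rem = r;
    return q;
}

/* Signed wrapper: divide the magnitudes, then fix up the sign.
   Truncates toward zero like the C / operator. INT64_MIN / -1 is
   undefined for the native operator and is not handled specially. */
int64_t sdiv64_sketch(int64_t x, int64_t y)
{
    uint64_t ux = x < 0 ? 0 - (uint64_t)x : (uint64_t)x;
    uint64_t uy = y < 0 ? 0 - (uint64_t)y : (uint64_t)y;
    uint64_t uq = udiv64_sketch(ux, uy, NULL);

    return ((x < 0) != (y < 0)) ? (int64_t)(0 - uq) : (int64_t)uq;
}
```

With this in place, sdiv64_sketch(x, y) gives the same result as div64(x, y) above for ordinary inputs, at the cost of 64 loop iterations per division.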