[dmd-internals] building dmd with the microsoft compiler
November 13, 2011
Hi,

a few days ago, I was trying to figure out why dmd takes so long to compile my projects. (The usual development build time for VisualD+parser is about 40 seconds, which is starting to get annoying. With the GC enabled, it was more than a minute. Compared with the C++ project at work, which is about 50 times larger but takes only about 10 seconds to build incrementally after changes to a few files, I think there is something wrong with the D compilation model. But that is a different topic.)

Activating the profiler didn't help (it crashes or takes forever), so I pushed the dmd source through the Microsoft compiler (I'm currently using cl v15, which comes with VS2008). Using the Microsoft compiler has several advantages:

- way better debug information and integration into current debuggers
- better support for instrumented/sampling profiling tools
- better dependency handling for building dmd (if used from within Visual Studio)
- the executable can use up to 4GB memory on Win 64-bit host, up to 3GB memory on Win32

The downsides are:

- no 80-bit float support
- no C99 sprintf (%j, %z, %a)
- restricted structured exception handling

I have patched the compiler to use handcrafted implementations and it now passes the dmd test suite.

Quite a bit to my surprise, the compiler is also almost twice as fast on my projects as the dmc compiled dmd! The test suite runs a bit slower though (5% to 10%).

Is there interest in adding the patches to the dmd source? I would like to do some cleanup before creating a pull request.

The major changes are with respect to the 80-bit floats, because this also leaks into the backend. Mostly, it's replacing "long double" with my implementation "long_double". I've seen "d-gcc-real.h" being used if IN_GCC  is defined, but the defined type real_t is only referred to partially. I wonder whether this is still in use (by LDC/GDC?) and whether it ever passed the test suites?
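
To make the idea of a replacement type concrete, here is a minimal sketch of a software container for the 80-bit x87 format. It is not the `long_double` from the patch: the names `long_double_sketch`, `from_double` and `to_double` are made up for illustration, and the sketch only converts to and from `double` (no arithmetic, and `to_double` rounds through a plain cast).

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical stand-in for an 80-bit x87 extended value on a compiler
// whose `long double` is only 64 bits wide. Not the type from the patch.
struct long_double_sketch
{
    uint64_t mantissa;   // 64 bits with an explicit integer bit (bit 63)
    uint16_t sign_exp;   // bit 15: sign, bits 0..14: exponent biased by 16383
};

// Widen an IEEE 754 double to the extended layout (exact, no rounding needed).
inline long_double_sketch from_double(double d)
{
    uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);

    uint16_t sign = (uint16_t)((bits >> 63) << 15);
    int      exp  = (int)((bits >> 52) & 0x7FF);
    uint64_t frac = bits & 0xFFFFFFFFFFFFFull;      // low 52 bits

    long_double_sketch r;
    if (exp == 0x7FF)                               // infinity or NaN
    {
        r.mantissa = 0x8000000000000000ull | (frac << 11);
        r.sign_exp = (uint16_t)(sign | 0x7FFF);
    }
    else if (exp == 0 && frac == 0)                 // signed zero
    {
        r.mantissa = 0;
        r.sign_exp = sign;
    }
    else
    {
        uint64_t m;
        int e;
        if (exp == 0)                               // subnormal double
        {
            m = frac << 11;                         // no implicit leading 1
            e = -1022;
            while (!(m & 0x8000000000000000ull))    // normalize
            {
                m <<= 1;
                --e;
            }
        }
        else                                        // normal double
        {
            m = 0x8000000000000000ull | (frac << 11);
            e = exp - 1023;
        }
        r.mantissa = m;
        r.sign_exp = (uint16_t)(sign | (e + 16383));
    }
    return r;
}

// Narrow back to double. The cast rounds the 64-bit mantissa to 53 bits;
// values outside the double range become infinity or zero via ldexp.
inline double to_double(const long_double_sketch &x)
{
    int  exp15    = x.sign_exp & 0x7FFF;
    bool negative = (x.sign_exp & 0x8000) != 0;
    double magnitude;
    if (exp15 == 0x7FFF)
        magnitude = (x.mantissa << 1) != 0
                  ? std::numeric_limits<double>::quiet_NaN()
                  : std::numeric_limits<double>::infinity();
    else
        magnitude = std::ldexp((double)x.mantissa, exp15 - 16383 - 63);
    return negative ? -magnitude : magnitude;
}
```

A real replacement would also need arithmetic, comparisons and formatting on this layout, which is where most of the work would presumably be.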

The C99 printf functions are overloaded by inserting an include file on the command line. I would prefer that these go into port.c/h, but then they would need to be included by the backend as well.

I have also built a 64-bit executable of dmd building 32-bit code, but it does not perform as well (somewhere between the dmc and msc 32-bit versions).

Rainer

November 13, 2011
When compiling dmd with the Microsoft compiler, I gag thousands of warnings (at highest warning level 4) with:

#pragma warning(disable:4996) // This function or variable may be unsafe.
#pragma warning(disable:4127) // conditional expression is constant
#pragma warning(disable:4101) // unreferenced local variable
#pragma warning(disable:4100) // unreferenced formal parameter
#pragma warning(disable:4146) // unary minus operator applied to unsigned type, result still unsigned
#pragma warning(disable:4244) // conversion from 'int' to 'unsigned short', possible loss of data
#pragma warning(disable:4245) // conversion from 'int' to 'unsigned int', signed/unsigned mismatch
#pragma warning(disable:4018) // signed/unsigned mismatch
#pragma warning(disable:4389) // signed/unsigned mismatch
#pragma warning(disable:4505) // unreferenced local function has been removed
#pragma warning(disable:4701) // potentially uninitialized local variable 'm' used
#pragma warning(disable:4201) // nonstandard extension used : nameless struct/union
#pragma warning(disable:4189) // local variable is initialized but not referenced
#pragma warning(disable:4102) // unreferenced label
#pragma warning(disable:4800) // forcing value to bool 'true' or 'false' (performance warning)

Still, more than a hundred warnings remain. The more interesting ones are (line numbers might be off a little because of my changes):

- I did not figure out what's wrong with this, but it sounds dangerous:
         RTLSYMS
.\backend\rtlsym.c(95) : warning C4806: '|' : unsafe operation: no value of type 'bool' promoted to type 'int' can equal the given constant
.\backend\rtlsym.c(95) : warning C4554: '|' : check operator precedence for possible error; use parentheses to clarify precedence

- probably false positives regarding calculations with bools like:
         i ^= d1 > d2;
.\backend\evalu8.c(1667) : warning C4805: '^=' : unsafe mix of type 'int' and type 'bool' in operation
     return (x - sign) ^ -sign;
.\intrange.c(24) : warning C4804: '-' : unsafe use of type 'bool' in operation

- lots of "truncation of constant value", which are more unsigned -> signed conversion
     static char nops[7] = { 0x90,0x90,0x90,0x90,0x90,0x90,0x90 };
.\backend\cod3.c(324) : warning C4309: 'initializing' : truncation of constant value

- but this seems problematic, because the array value is later used as an index assumed positive
         static char invconvtab[] = { .... OPd_ld, ... };
.\backend\cgelem.c(377) : warning C4305: 'initializing' : truncation from 'OPER' to 'char'
.\backend\cgelem.c(377) : warning C4309: 'initializing' : truncation of constant value

- too large shift count:
         value |= value << 32;
.\backend\cod2.c(3312) : warning C4293: '<<' : shift count negative or too big, undefined behavior

- the following looks actually like a bug:
     if (!config.flags4 & CFG4optimized)
.\backend\cgreg.c(53) : warning C4806: '&' : unsafe operation: no value of type 'bool' promoted to type 'int' can equal the given constant

- this is probably another bug:
             if (sc->func && !((TypeFunction *)t1)->trust <= TRUSTsystem)
.\expression.c(7680) : warning C4804: '<=' : unsafe use of type 'bool' in operation

- switch on enumerators with values not part of the enumeration:
.\iasm.c(4053) : warning C4063: case '237' is not a valid value for switch of enum 'TOK'

- lots of warnings regarding unreachable code, mostly due to preprocessor conditionals
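
The two conditions flagged as likely bugs above share one pattern: unary `!` binds tighter than `&` and `<=`, so the negation is applied to the left operand alone. The following is a small sketch of that effect, not the dmd code itself. The constant `CFG_OPTIMIZED`, the `Trust` enum and all function names are hypothetical, and the "intended" versions are only one plausible reading of what was meant.

```cpp
// Hypothetical flag and enum values, chosen only for illustration.
const unsigned CFG_OPTIMIZED = 0x4;
enum Trust { TRUST_DEFAULT, TRUST_SYSTEM, TRUST_TRUSTED, TRUST_SAFE };

// How `!flags & CFG_OPTIMIZED` is parsed: the negation happens first.
// (!flags) is 0 or 1, so masking it with 0x4 always gives 0.
inline bool optimizer_off_as_parsed(unsigned flags)
{
    unsigned negated = !flags;
    return (negated & CFG_OPTIMIZED) != 0;
}

// Presumed intent: test the bit, then negate the result.
inline bool optimizer_off_intended(unsigned flags)
{
    return !(flags & CFG_OPTIMIZED);
}

// How `!trust <= limit` is parsed: (!trust) is 0 or 1, so the comparison
// is true for every limit of 1 or more, whatever `trust` is.
inline bool trust_check_as_parsed(Trust trust, int limit)
{
    int negated = !trust;
    return negated <= limit;
}

// One plausible intent: negate the whole comparison.
inline bool trust_check_intended(Trust trust, int limit)
{
    return !(trust <= limit);
}
```

In both "as parsed" variants the result no longer depends on the bit or value being tested, which matches the wording of C4806 ("no value of type 'bool' promoted to type 'int' can equal the given constant").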

Rainer



November 13, 2011

On 11/13/2011 8:13 AM, Rainer Schuetze wrote:
>
>
> I have patched the compiler to use handcrafted implementations and it now passes the dmd test suite.
>
> Quite a bit to my surprise, the compiler is also almost twice as fast on my projects as the dmc compiled dmd! The test suite runs a bit slower though (5% to 10%).

Hmm, very curious. I wonder what the speed changes are due to.

>
> Is there interest in adding the patches to the dmd source? I would like to do some cleanup before creating a pull request.

Sure.

>
> The major changes are with respect to the 80-bit floats, because this also leaks into the backend. Mostly, it's replacing "long double" with my implementation "long_double". I've seen "d-gcc-real.h" being used if IN_GCC is defined, but the defined type real_t is only referred to partially. I wonder whether this is still in use (by LDC/GDC?) and whether it ever passed the test suites?

I don't know the status of IN_GCC. I wouldn't use it for the VC build.

>
> The C99 printf functions are overloaded by inserting an include file on the command line. I would prefer that these go into port.c/h, but then they would need to be included by the backend as well.

They should go in port, or perhaps we should just dispense with using the C99 formats.

November 13, 2011
On 13.11.2011 19:33, Walter Bright wrote:
>
>
> On 11/13/2011 8:13 AM, Rainer Schuetze wrote:
>>
>>
>> I have patched the compiler to use handcrafted implementations and it now passes the dmd test suite.
>>
>> Quite a bit to my surprise, the compiler is also almost twice as fast on my projects as the dmc compiled dmd! The test suite runs a bit slower though (5% to 10%).
>
> Hmm, very curious. I wonder what the speed changes are due to.
>
I just noticed today that the backend files are compiled without optimizations in win32.mak. Is this on purpose? I didn't see noticeable improvements in compilation speed when enabling "-o", though.

>>
>> Is there interest in adding the patches to the dmd source? I would like to do some cleanup before creating a pull request.
>
> Sure.
>
>>
>> The major changes are with respect to the 80-bit floats, because this also leaks into the backend. Mostly, it's replacing "long double" with my implementation "long_double". I've seen "d-gcc-real.h" being used if IN_GCC  is defined, but the defined type real_t is only referred to partially. I wonder whether this is still in use (by LDC/GDC?) and whether it ever passed the test suites?
>
> I don't know the status of IN_GCC. I wouldn't use it for the VC build.

I didn't use it. But if it is still in use, among other things it enables another implementation of 80-bit floats which might not be compatible with my changes, so it might annoy LDC/GDC developers quite a bit.

>
>>
>> The C99 printf functions are overloaded by inserting an include file on the command line. I would prefer that these go into port.c/h, but then they would need to be included by the backend as well.
>
> They should go in port, or perhaps we should just dispense with using the C99 formats.

I will try to put it in port. %a on 80-bit floats needs to be implemented anyway.

>
> _______________________________________________
> dmd-internals mailing list
> dmd-internals at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/dmd-internals
>

November 13, 2011

On 11/13/2011 8:42 AM, Rainer Schuetze wrote:
>
>
>
> - I did not figure out what's wrong with this, but it sounds dangerous:
>         RTLSYMS
> .\backend\rtlsym.c(95) : warning C4806: '|' : unsafe operation: no value of
> type 'bool' promoted to type 'int' can equal the given constant
> .\backend\rtlsym.c(95) : warning C4554: '|' : check operator precedence for
> possible error; use parentheses to clarify precedence

bug

>
> - probably false positives regarding calculations with bools like:
>         i ^= d1 > d2;
> .\backend\evalu8.c(1667) : warning C4805: '^=' : unsafe mix of type 'int' and
> type 'bool' in operation

spurious, but fixed anyway

>     return (x - sign) ^ -sign;
> .\intrange.c(24) : warning C4804: '-' : unsafe use of type 'bool' in operation

spurious, but fixed anyway

>
> - lots of "truncation of constant value", which are more unsigned -> signed
> conversion
>     static char nops[7] = { 0x90,0x90,0x90,0x90,0x90,0x90,0x90 };
> .\backend\cod3.c(324) : warning C4309: 'initializing' : truncation of constant
> value
>

latent bug

> - but this seems problematic, because the array value is later used as an
> index assumed positive
>         static char invconvtab[] = { .... OPd_ld, ... };
> .\backend\cgelem.c(377) : warning C4305: 'initializing' : truncation from
> 'OPER' to 'char'
> .\backend\cgelem.c(377) : warning C4309: 'initializing' : truncation of
> constant value

latent bug

>
> - too large shift count:
>         value |= value << 32;
> .\backend\cod2.c(3312) : warning C4293: '<<' : shift count negative or too
> big, undefined behavior

it works, but fixed anyway

>
> - the following looks actually like a bug:
>     if (!config.flags4 & CFG4optimized)
> .\backend\cgreg.c(53) : warning C4806: '&' : unsafe operation: no value of
> type 'bool' promoted to type 'int' can equal the given constant

bug

>
> - this is probably another bug:
>             if (sc->func && !((TypeFunction *)t1)->trust <= TRUSTsystem)
> .\expression.c(7680) : warning C4804: '<=' : unsafe use of type 'bool' in
> operation

bug

>
> - switch on enumerators with values not part of the enumeration: .\iasm.c(4053) : warning C4063: case '237' is not a valid value for switch of enum 'TOK'

spurious, but fixed anyway

>
> - lots of warnings regarding unreachable code, mostly due to preprocessor conditionals
>
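
The "latent bug" verdicts on the `char` tables can be illustrated with a short sketch. It assumes a target where plain `char` is signed (the MSVC default) and uses `signed char` explicitly so the behaviour is the same everywhere. The helper names are hypothetical and the code is not taken from dmd.

```cpp
#include <cstddef>

// Stored in a signed 8-bit element, 0x90 does not fit and reads back as
// -112 on two's-complement targets.
inline int as_signed_char(int value)
{
    signed char c = (signed char)value;
    return c;
}

// Stored in an unsigned 8-bit element, the same constant is preserved.
inline int as_unsigned_char(int value)
{
    unsigned char c = (unsigned char)value;
    return c;
}

// Hypothetical bounds check for a table entry that is later used as an index.
inline bool is_valid_index(int index, std::size_t table_size)
{
    return index >= 0 && (std::size_t)index < table_size;
}
```

With a signed element type, an entry above 0x7F turns into a negative index, so declaring such tables as `unsigned char` is the usual remedy.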
November 13, 2011

On 11/13/2011 10:55 AM, Rainer Schuetze wrote:
> On 13.11.2011 19:33, Walter Bright wrote:
>>
>>
>> On 11/13/2011 8:13 AM, Rainer Schuetze wrote:
>>>
>>>
>>> I have patched the compiler to use handcrafted implementations and it now passes the dmd test suite.
>>>
>>> Quite a bit to my surprise, the compiler is also almost twice as fast on my projects as the dmc compiled dmd! The test suite runs a bit slower though (5% to 10%).
>>
>> Hmm, very curious. I wonder what the speed changes are due to.
>>
> I just noticed today that the backend files are compiled without optimizations in win32.mak. Is this on purpose? I didn't see noticeable improvements in compilation speed when enabling "-o", though.

I just never got around to fixing that.

>
>>>
>>> Is there interest in adding the patches to the dmd source? I would like to do some cleanup before creating a pull request.
>>
>> Sure.
>>
>>>
>>> The major changes are with respect to the 80-bit floats, because this also leaks into the backend. Mostly, it's replacing "long double" with my implementation "long_double". I've seen "d-gcc-real.h" being used if IN_GCC is defined, but the defined type real_t is only referred to partially. I wonder whether this is still in use (by LDC/GDC?) and whether it ever passed the test suites?
>>
>> I don't know the status of IN_GCC. I wouldn't use it for the VC build.
>
> I didn't use it. But if it is still in use, among other things it enables another implementation of 80-bit floats which might not be compatible with my changes, so it might annoy LDC/GDC developers quite a bit.
>
>>
>>>
>>> The C99 printf functions are overloaded by inserting an include file on the command line. I would prefer that these go into port.c/h, but then they would need to be included by the backend as well.
>>
>> They should go in port, or perhaps we should just dispense with using the C99 formats.
>
> I will try to put it in port. %a on 80-bit floats needs to be implemented anyway.
>

We need %a, but the other C99 formats can be eliminated by casting to (unsigned long long) and using a ull format.
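
A minimal sketch of the cast-and-`ull` approach suggested here, assuming the runtime's printf family understands the `ll` length modifier. The helper names `format_size` and `format_intmax` are hypothetical.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Instead of the C99 "%zu": widen to unsigned long long and print with "%llu".
inline std::string format_size(std::size_t n)
{
    char buf[32];   // an unsigned 64-bit value needs at most 20 digits
    std::sprintf(buf, "%llu", (unsigned long long)n);
    return buf;
}

// Instead of the C99 "%jd": widen to long long and print with "%lld".
inline std::string format_intmax(long long n)
{
    char buf[32];   // sign plus at most 19 digits for a 64-bit value
    std::sprintf(buf, "%lld", n);
    return buf;
}
```

The casts are lossless as long as `size_t` and the widest integer type in use are no wider than 64 bits, which is an assumption of this sketch.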
November 14, 2011
Thanks.

My list was only a selection of example warnings; I was not expecting them to be handled so fast (though I hoped so for the actual bugs). So here are some more:

backend\dt.c(213) : warning C4293: '>>' : shift count negative or too big, undefined behavior
backend\cod3.c(2025) : warning C4310: cast truncates constant value
backend\cod3.c(2734) : warning C4806: '^' : unsafe operation: no value of type 'bool' promoted to type 'int' can equal the given constant
backend\cod3.c(5171) : warning C4309: '=' : truncation of constant value
backend\cod2.c(1507) : warning C4806: '^' : unsafe operation: no value of type 'bool' promoted to type 'int' can equal the given constant
backend\cod2.c(3338) : warning C4293: '<<' : shift count negative or too big, undefined behavior
backend\cod1.c(2218) : warning C4309: 'initializing' : truncation of constant value
backend\cgxmm.c(405) : warning C4305: '=' : truncation from '__int64' to 'targ_size_t'
backend\cgobj.c(510) : warning C4309: '=' : truncation of constant value
backend\cgobj.c(1094) : warning C4309: 'initializing' : truncation of constant value
backend\cgobj.c(1217) : warning C4309: '=' : truncation of constant value
backend\cgobj.c(1235) : warning C4309: '=' : truncation of constant value
backend\cgobj.c(1317) : warning C4309: 'initializing' : truncation of constant value
backend\cgobj.c(1319) : warning C4309: 'initializing' : truncation of constant value
backend\cgobj.c(1320) : warning C4309: 'initializing' : truncation of constant value
backend\cgobj.c(1326) : warning C4309: 'initializing' : truncation of constant value
backend\cgobj.c(1490) : warning C4309: 'initializing' : truncation of constant value
backend\cgobj.c(2289) : warning C4309: '=' : truncation of constant value
backend\cgobj.c(2318) : warning C4309: '=' : truncation of constant value
backend\cgobj.c(2449) : warning C4309: '=' : truncation of constant value
backend\cgobj.c(2455) : warning C4309: '=' : truncation of constant value
backend\cgobj.c(2460) : warning C4309: '=' : truncation of constant value
backend\cgobj.c(2722) : warning C4309: '=' : truncation of constant value
backend\cg87.c(637) : warning C4305: 'initializing' : truncation from 'double' to 'float'
backend\cg87.c(660) : warning C4309: 'initializing' : truncation of constant value
lexer.c(2356) : warning C4063: case '0' is not a valid value for switch of enum 'Lexer::number::FLAGS'
         lexer.c(1973) : see declaration of 'Lexer::number::FLAGS'
intrange.c(62) : warning C4804: '-' : unsafe use of type 'bool' in operation
intrange.c(240) : warning C4804: '-' : unsafe use of type 'bool' in operation
intrange.c(242) : warning C4804: '-' : unsafe use of type 'bool' in operation
iasm.c(4277) : warning C4063: case '229' is not a valid value for switch of enum 'TOK'
         lexer.h(42) : see declaration of 'TOK'
iasm.c(4566) : warning C4063: case '233' is not a valid value for switch of enum 'TOK'
         lexer.h(42) : see declaration of 'TOK'
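
The C4293 entries (like the earlier `value |= value << 32;`) fire when the shift count is at least as large as the width of the promoted left operand, which is undefined behaviour in C and C++. A minimal sketch of the usual remedy, widening before shifting, follows. The function name `replicate32` is hypothetical, and the assumption that the flagged operand is 32 bits wide in this build has not been checked against the dmd source.

```cpp
#include <cstdint>

// Duplicate a 32-bit pattern into both halves of a 64-bit value.
// Widening *before* the shift keeps the shift count (32) below the
// operand width (64), so the behaviour is defined.
inline uint64_t replicate32(uint32_t value)
{
    uint64_t wide = value;
    return wide | (wide << 32);
}
```

Where the operand type varies by target, guarding the shift with a `sizeof` check is another common way to keep the count in range.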


November 14, 2011
On 13.11.2011 23:51, Walter Bright wrote:
>
>
> On 11/13/2011 10:55 AM, Rainer Schuetze wrote:
>> On 13.11.2011 19:33, Walter Bright wrote:
>>>
>>>
>>> On 11/13/2011 8:13 AM, Rainer Schuetze wrote:
>>>>
>>>>>
>>>>>>
>>>>>> The C99 printf functions are overloaded by inserting an include file on the command line. I would prefer that these go into port.c/h, but then they would need to be included by the backend as well.
>>>>>
>>>>> They should go in port, or perhaps we should just dispense with using the C99 formats.
>>>>
>>>> I will try to put it in port. %a on 80-bit floats needs to be implemented anyway.
>>>>
>
> We need %a, but the other C99 formats can be eliminated by casting to (unsigned long long) and using a ull format.

Ok, so it might be easiest to just have a function in port.c to write a long double into a buffer.
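
As a rough sketch of what such a helper could look like, the following formats a value in C99 `%a` style by hand. It is simplified to a finite `double` rather than an 80-bit value, the name `format_hex_double` is hypothetical, negative zero is printed as positive zero, and subnormals are printed in normalized form, which can differ textually from what a C runtime prints.

```cpp
#include <cassert>
#include <cmath>
#include <string>

// Hypothetical helper: render a finite double in C99 "%a" style
// (for example 0x1.8p+1 for 3.0) without using the runtime's "%a".
inline std::string format_hex_double(double value)
{
    assert(std::isfinite(value));   // infinities and NaN are not handled

    std::string out;
    if (value < 0)
    {
        out += '-';
        value = -value;
    }
    out += "0x";
    if (value == 0)
        return out + "0p+0";

    int exp;
    double frac = std::frexp(value, &exp);  // value == frac * 2^exp, frac in [0.5, 1)
    frac *= 2;                              // move to [1, 2) ...
    --exp;                                  // ... and adjust the exponent
    out += '1';
    frac -= 1;
    if (frac != 0)
    {
        out += '.';
        while (frac != 0)                   // each step extracts 4 bits exactly
        {
            frac *= 16;
            int digit = (int)frac;
            out += "0123456789abcdef"[digit];
            frac -= digit;
        }
    }

    out += 'p';
    out += exp < 0 ? '-' : '+';
    int magnitude = exp < 0 ? -exp : exp;
    std::string digits;
    do
    {
        digits.insert(digits.begin(), (char)('0' + magnitude % 10));
        magnitude /= 10;
    } while (magnitude != 0);
    return out + digits;
}
```

For a software 80-bit type the same digit extraction could run directly on the 64-bit mantissa, four bits at a time, instead of on a floating-point fraction.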


November 14, 2011
I wonder if there is value in getting DMD to compile cleanly (no warnings) and pass the test suite on as many compilers as possible. I'd guess the most value would be in GCC, MS, LLVM and maybe one of the EDG-based compilers.
November 19, 2011
On 13.11.2011 19:33, Walter Bright wrote:
>
>
> On 11/13/2011 8:13 AM, Rainer Schuetze wrote:
>> Is there interest in adding the patches to the dmd source? I would like to do some cleanup before creating a pull request.
>
> Sure.

It's finally done (after going through the horror of "git rebase" deleting all changes and the complete history of the branch):

https://github.com/D-Programming-Language/dmd/pull/516