August 19, 2015
On Wednesday, 19 August 2015 at 17:25:13 UTC, deadalnix wrote:
>
> Apple is invested in LLVM. For other thing you mention, WebAssembly is an AST representation, which is both dumb and do not look like anything like LLVM IR.

I saw more similarity between wasm and SPIR-V than LLVM, but it definitely seems to have some differences. I'm not sure what you mean when you say that using the AST representation is dumb. It probably wouldn't be what you would design initially, but I think part of the motivation of the design was to work within the context of the web's infrastructure.
August 19, 2015
On 8/19/2015 11:03 AM, Jacob Carlborg wrote:
> Not sure how the compilers behave in this case but what about devirtualization?
> Since I think most developers compile their D programs with all files at once
> there should be pretty good opportunities to do devirtualization.

It's true that if generating an exe, the compiler can mark leaf classes as final and get devirtualization. (Of course, you can manually add 'final' to classes.)

It's one way D can generate faster code than C++.

August 19, 2015
On Wednesday, 19 August 2015 at 18:41:07 UTC, Walter Bright wrote:
> On 8/19/2015 11:03 AM, Jacob Carlborg wrote:
>> Not sure how the compilers behave in this case but what about devirtualization?
>> Since I think most developers compile their D programs with all files at once
>> there should be pretty good opportunities to do devirtualization.
>
> It's true that if generating an exe, the compiler can mark leaf classes as final and get devirtualization. (Of course, you can manually add 'final' to classes.)
>
> It's one way D can generate faster code than C++.

C++ also has final and if I am not mistaken both LLVM and Visual C++ do devirtualization, not sure about other compilers.
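
To make the C++ side concrete, here is a minimal sketch. The `Shape`/`Circle` hierarchy and the two `describe_*` functions are made up for illustration, and whether a given compiler actually emits a direct call depends on the compiler and the optimization level.

```cpp
#include <string>

struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }
};

// 'final' promises that nothing derives from Circle, so name() cannot be
// overridden any further.
struct Circle final : Shape {
    std::string name() const override { return "circle"; }
};

// The static type is the final class: the compiler is allowed to call
// Circle::name directly (and inline it) instead of going through the vtable.
std::string describe_final(const Circle& c) { return c.name(); }

// The static type is the base class: without knowing every derived class,
// this generally has to stay an indirect call.
std::string describe_base(const Shape& s) { return s.name(); }
```

Both functions return the same result for a `Circle`; the difference is only in what the optimizer is allowed to assume at the call site.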
August 19, 2015
On Wednesday, 19 August 2015 at 18:26:45 UTC, jmh530 wrote:
> On Wednesday, 19 August 2015 at 17:25:13 UTC, deadalnix wrote:
>>
>> Apple is invested in LLVM. For other thing you mention, WebAssembly is an AST representation, which is both dumb and do not look like anything like LLVM IR.
>
> I saw more similarity between wasm and SPIR-V than LLVM, but it definitely seems to have some differences. I'm not sure what you mean when you say that using the AST representation is dumb. It probably wouldn't be what you would design initially, but I think part of the motivation of the design was to work within the context of the web's infrastructure.

An AST is a useful representation for extracting something usable from source code and for performing semantic analysis. It is NOT a good representation for optimization and codegen. For these, SSA and/or a stack machine plus a CFG is much more practical.

Having wasm as an AST forces the process to go roughly as follows:
source code -> AST -> SSA-CFG -> optimized SSA-CFG -> AST -> wasm -> AST -> SSA-CFG -> optimized SSA-CFG -> machine code.

It adds many steps to the process for no good reason. Well, in fact there is a good reason. PNaCl is SSA-CFG, but Mozilla spent a fair amount of time explaining to us how bad and evil it is compared to the glorious asm.js they proposed. Going back to SSA would be an admission of defeat, which nobody likes to make, and webasm wants Mozilla on board, so sidestepping the whole issue by going with an AST makes sense politically. It has no engineering merit.
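
To give an idea of what a single arrow in that chain involves, here is a rough sketch of lowering a toy AST to stack-machine code. Everything in it (`Node`, `Instr`, `lower`, `run`) is hypothetical and has nothing to do with the actual wasm format; it only shows that each AST-to-linear conversion is a full tree walk, which is the work that gets repeated.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical toy AST: integer literals and binary '+' / '*'.
struct Node {
    char op;                        // 'n' for a literal, '+' or '*' otherwise
    int value;                      // only meaningful when op == 'n'
    std::unique_ptr<Node> lhs, rhs;
};

std::unique_ptr<Node> lit(int v) {
    return std::unique_ptr<Node>(new Node{'n', v, nullptr, nullptr});
}

std::unique_ptr<Node> bin(char op, std::unique_ptr<Node> l, std::unique_ptr<Node> r) {
    return std::unique_ptr<Node>(new Node{op, 0, std::move(l), std::move(r)});
}

// 'n' pushes value; '+' and '*' pop two operands and push the result.
struct Instr { char op; int value; };

// Post-order walk: flattens the tree into linear stack-machine code.
void lower(const Node& n, std::vector<Instr>& out) {
    if (n.op == 'n') { out.push_back({'n', n.value}); return; }
    lower(*n.lhs, out);
    lower(*n.rhs, out);
    out.push_back({n.op, 0});
}

// Tiny interpreter for the linear form.
int run(const std::vector<Instr>& code) {
    std::vector<int> stack;
    for (const Instr& i : code) {
        if (i.op == 'n') { stack.push_back(i.value); continue; }
        int b = stack.back(); stack.pop_back();
        int a = stack.back(); stack.pop_back();
        stack.push_back(i.op == '+' ? a + b : a * b);
    }
    return stack.back();
}
```

For example, `lower(*bin('+', lit(1), bin('*', lit(2), lit(3))), code)` fills `code` with five instructions, and `run(code)` evaluates them.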
August 19, 2015
On Wednesday, 19 August 2015 at 18:47:21 UTC, Paulo Pinto wrote:
> On Wednesday, 19 August 2015 at 18:41:07 UTC, Walter Bright wrote:
>> On 8/19/2015 11:03 AM, Jacob Carlborg wrote:
>>> Not sure how the compilers behave in this case but what about devirtualization?
>>> Since I think most developers compile their D programs with all files at once
>>> there should be pretty good opportunities to do devirtualization.
>>
>> It's true that if generating an exe, the compiler can mark leaf classes as final and get devirtualization. (Of course, you can manually add 'final' to classes.)
>>
>> It's one way D can generate faster code than C++.
>
> C++ also has final and if I am not mistaken both LLVM and Visual C++ do devirtualization, not sure about other compilers.

GCC is much better than LLVM at this. This is an active area of work in both compilers right now.

Note that combined with PGO, you can do some very nice speculative devirtualization.
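
Written out by hand, speculative devirtualization looks roughly like the sketch below. The `Codec` classes and the assumption that the profile says "almost always `FastCodec`" are invented for the example. A real compiler would more likely compare the vtable or function pointer than use `typeid`, so take this as an illustration of the shape of the transformation, not of the generated code.

```cpp
#include <typeinfo>

struct Codec {
    virtual ~Codec() = default;
    virtual int decode(int x) const { return x; }
};

struct FastCodec final : Codec {
    int decode(int x) const override { return x * 2; }
};

struct SlowCodec : Codec {
    int decode(int x) const override { return x + 1; }
};

// Hand-written version of what a profile-guided compiler might emit if the
// (hypothetical) profile says `c` is almost always a FastCodec.
int decode_speculative(const Codec& c, int x) {
    if (typeid(c) == typeid(FastCodec)) {
        // Hot path: the exact type is known, so the call is direct and inlinable.
        return static_cast<const FastCodec&>(c).FastCodec::decode(x);
    }
    // Cold path: fall back to the ordinary virtual call.
    return c.decode(x);
}
```

The guess costs one compare and branch; when it is right, the call can be inlined, and when it is wrong, behaviour is unchanged.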

D has a problem here: templates are duck typed. That means the compiler can't know what instantiations may be made, especially when shared objects are involved. This is one area where stronger typing for metaprogramming is a win.
August 19, 2015
On Wednesday, 19 August 2015 at 17:30:13 UTC, Walter Bright wrote:
> On 8/19/2015 7:34 AM, anonymous wrote:
>> I have a about 30 lines of numerical code (using real) where the gap is about
>> 200%-300% between ldc/gdc and dmd (linux x86_64). In fact dmd -O etc is at the
> level of ldc/gdc without any optimizations and dmd without -O is even slower.
>> With double instead of real the gap is about 30%.
>
> If it's just 30 lines of code, you can put it on bugzilla.

The problem is not the 30 lines plus white space but the input file used in my benchmark. The whole benchmark program has 115 lines, including empty lines and braces. The input file is 4.8 MB.

Anyway, the raw asm generated by the different compilers may be helpful to the experts :)

https://issues.dlang.org/show_bug.cgi?id=14937
August 19, 2015
On 19 August 2015 at 21:00, deadalnix via Digitalmars-d < digitalmars-d@puremagic.com> wrote:

> On Wednesday, 19 August 2015 at 18:47:21 UTC, Paulo Pinto wrote:
>
>> On Wednesday, 19 August 2015 at 18:41:07 UTC, Walter Bright wrote:
>>
>>> On 8/19/2015 11:03 AM, Jacob Carlborg wrote:
>>>
>>>> Not sure how the compilers behave in this case but what about
>>>> devirtualization?
>>>> Since I think most developers compile their D programs with all files
>>>> at once
>>>> there should be pretty good opportunities to do devirtualization.
>>>>
>>>
>>> It's true that if generating an exe, the compiler can mark leaf classes
>>> as final and get devirtualization. (Of course, you can manually add 'final'
>>> to classes.)
>>>
>>> It's one way D can generate faster code than C++.
>>>
>>
>> C++ also has final and if I am not mistaken both LLVM and Visual C++ do devirtualization, not sure about other compilers.
>>
>
> GCC is much better than LLVM at this. This is an active area of work in both compiler right now.
>
>
Can't speak for LLVM, but scope classes in GDC are *always* devirtualized because the compiler knows the vtable layout and uses constant propagation to find the direct call.

You *could* do this with all classes in general, but this is a missed opportunity because the vtable is initialized in the library using memcpy, rather than by the compiler using a direct copy assignment.

https://issues.dlang.org/show_bug.cgi?id=14912

I'm not sure even LTO/PGO could see through the memcpy to devirtualize even the most basic calls.
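
As a rough C-style model of the difference (this is not GDC's actual codegen; `VTable`, `Object`, `library_init` and the rest are names invented for the sketch):

```cpp
#include <cstddef>
#include <cstring>

struct Object;
struct VTable { int (*get)(const Object*); };
struct Object { const VTable* vptr; int field; };

int get_impl(const Object* o) { return o->field + 1; }

const VTable vtbl = { get_impl };
// Initializer image, as a runtime library would copy it into fresh memory.
const Object init_image = { &vtbl, 41 };

// The compiler stores the vptr itself, so constant propagation can turn
// o.vptr->get into a direct call to get_impl.
int call_direct_init() {
    Object o;
    o.vptr = &vtbl;
    o.field = 41;
    return o.vptr->get(&o);
}

// Stand-in for a library allocation routine. In a real runtime this lives in
// another translation unit, so the optimizer only sees an opaque call.
void library_init(void* dst, const void* image, std::size_t size) {
    std::memcpy(dst, image, size);
}

// Same result, but the vptr arrives through the copy, which the optimizer
// may not be able to look through.
int call_memcpy_init() {
    Object o;
    library_init(&o, &init_image, sizeof o);
    return o.vptr->get(&o);
}
```

Both functions compute the same value. The point is only where the vptr store is visible to the optimizer, and whether a particular compiler sees through the copy will vary.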


August 19, 2015
On Wednesday, 19 August 2015 at 11:22:09 UTC, Dmitry Olshansky wrote:
> On 18-Aug-2015 15:37, Vladimir Panteleev wrote:
>> I think stability of the DMD backend is a goal of much higher value than
>> the performance of the code it emits. DMD is never going to match the
>> code generation quality of LLVM and GCC, which have had many, many
>> man-years invested in them. Working on DMD optimizations is essentially
>> duplicating this work, and IMHO I think it's not only a waste of time,
>> but harmful to D because of the risk of regressions.
>
> How about stress-testing with some simple fuzzer:
> 1. Generate a sequence of plausible expressions/functions.
> 2. Spit out results via printf.
> 3. Permute -O -inline and compare the outputs.

Tools like csmith [0] are surprisingly good at finding ICEs, but useless for performance regressions.
A "dsmith" would probably find lots of bugs in the dmd backend.

[0] https://embed.cs.utah.edu/csmith/
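
A very rough sketch of the generator half of such a tool, to show how little is needed to get started. It is written in C++ and emits D source text; `gen_expr` and `gen_program` are made-up names, and csmith itself is far more elaborate (and far more careful about avoiding undefined behaviour).

```cpp
#include <random>
#include <string>

// Builds a random, fully parenthesized expression over the uint variables
// a and b plus uint literals. Only + - * & | ^ are used, so the result is
// well defined (unsigned wrap-around, no division by zero).
std::string gen_expr(std::mt19937& rng, int depth) {
    if (depth == 0 || rng() % 4 == 0) {
        switch (rng() % 3) {
            case 0:  return "a";
            case 1:  return "b";
            default: return std::to_string(rng() % 1000) + "u";
        }
    }
    static const char ops[] = {'+', '-', '*', '&', '|', '^'};
    std::string lhs = gen_expr(rng, depth - 1);
    std::string rhs = gen_expr(rng, depth - 1);
    char op = ops[rng() % 6];
    return "(" + lhs + " " + op + " " + rhs + ")";
}

// Wraps one expression in a small D program that prints its value.
// The first argument depends on argc so the call is not a compile-time constant.
std::string gen_program(unsigned seed, int depth) {
    std::mt19937 rng(seed);
    return "import core.stdc.stdio;\n"
           "uint f(uint a, uint b) { return " + gen_expr(rng, depth) + "; }\n"
           "void main(string[] args) {\n"
           "    printf(\"%u\\n\", f(cast(uint) args.length, 7u));\n"
           "}\n";
}
```

The driver would write `gen_program(seed, depth)` to a file, build it once plain and once with `-O -inline`, run both binaries and report any seed whose outputs differ.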
August 19, 2015
On Wednesday, 19 August 2015 at 20:54:39 UTC, qznc wrote:
> On Wednesday, 19 August 2015 at 11:22:09 UTC, Dmitry Olshansky wrote:
>> On 18-Aug-2015 15:37, Vladimir Panteleev wrote:
>>> [...]
>>
>> How about stress-testing with some simple fuzzer:
>> 1. Generate a sequence of plausible expressions/functions.
>> 2. Spit out results via printf.
>> 3. Permute -O -inline and compare the outputs.
>
> Tools like csmith [0] are surprisingly good at finding ICEs, but useless for performance regressions.
> A "dsmith" would probably find lots of bugs in the dmd backend.
>
> [0] https://embed.cs.utah.edu/csmith/

fwiw, llvm/clang uses their own in-library fuzzer.
http://blog.llvm.org/2015/04/fuzz-all-clangs.html
August 19, 2015
On 8/19/2015 12:39 PM, anonymous wrote:
> The problem is not the 30 lines plus white space but the input file used in my
> benchmark. The whole benchmark program has 115 lines, including empty lines and
> braces. The input file is 4.8 MB.
>
> Anyway the raw asm generated by the different compiler may be helpful to the
> expert:)
>
> https://issues.dlang.org/show_bug.cgi?id=14937

Thanks!