June 06, 2022

On Saturday, 4 June 2022 at 21:17:56 UTC, SealabJaster wrote:
> On Saturday, 4 June 2022 at 19:54:48 UTC, Dukc wrote:
>> ```D
>> auto number = 200;
>> auto myVar = type == "as-is"?
>>   number:
>>   { number *= 2;
>>     number += 2;
>>     number /= 2;
>>     number = number > 300 ? 200 : 100;
>>     return number;
>>   }();
>> ```
>
> This next question comes from a place of ignorance: What is the codegen like for this code? Would it allocate a closure on the GC before performing the execution, or are the compilers smart enough to inline the entire thing?

LDC is able to inline the lambda and then optimize away the allocation. DMD is not.

https://godbolt.org/z/5vMcz4s14
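
For anyone who wants to try this locally, here is a minimal sketch of a function wrapping Dukc's snippet. The name `foo` and the `string type` parameter are assumptions made for illustration; the code behind the godbolt link is not reproduced in this thread and may differ.

```d
import std.stdio : writeln;

// Hypothetical wrapper: the name `foo` and its parameter are assumptions.
int foo(string type)
{
    auto number = 200;
    // The braces form a function literal that is called immediately.
    // It reads and writes `number` from the enclosing scope.
    auto myVar = type == "as-is" ?
        number :
        {
            number *= 2;                       // 400
            number += 2;                       // 402
            number /= 2;                       // 201
            number = number > 300 ? 200 : 100; // 201 is not > 300, so 100
            return number;
        }();
    return myVar;
}

void main()
{
    writeln(foo("as-is")); // 200
    writeln(foo("other")); // 100
}
```

Because `number` starts as a constant, both branches reduce to constants (200 and 100). Whether the call and any closure allocation actually disappear depends on the compiler and its optimization settings, so the generated assembly is the thing to check.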

June 06, 2022
On Mon, Jun 06, 2022 at 05:07:30PM +0000, deadalnix via Digitalmars-d wrote:
> On Saturday, 4 June 2022 at 21:17:56 UTC, SealabJaster wrote:
> > On Saturday, 4 June 2022 at 19:54:48 UTC, Dukc wrote:
> > > ```D
> > > auto number = 200;
> > > auto myVar = type == "as-is"?
> > >   number:
> > >   { number *= 2;
> > >     number += 2;
> > >     number /= 2;
> > >     number = number > 300 ? 200 : 100;
> > >     return number;
> > >   }();
> > > ```
> > 
> > This next question comes from a place of ignorance: What is the codegen like for this code? Would it allocate a closure on the GC before performing the execution, or are the compilers smart enough to inline the entire thing?
> 
> LDC is able to inline the lambda and then optimize away the allocation. DMD is not.
> 
> https://godbolt.org/z/5vMcz4s14

Yeah, when in doubt, trust LDC to do the "right thing". :-P  Well, that, and take a look at the generated assembly to see what it actually does. For questions of performance or codegen quality, I usually don't even bother looking at DMD output.


T

-- 
Unix was not designed to stop people from doing stupid things, because that would also stop them from doing clever things. -- Doug Gwyn
June 06, 2022
On Monday, 6 June 2022 at 17:43:47 UTC, H. S. Teoh wrote:
> On Mon, Jun 06, 2022 at 05:07:30PM +0000, deadalnix via Digitalmars-d wrote:
>> [...]
>>
>> LDC is able to inline the lambda and then optimize away the allocation. DMD is not.
>> 
>> https://godbolt.org/z/5vMcz4s14
>
> Yeah, when in doubt, trust LDC to do the "right thing". :-P  Well, that, and take a look at the generated assembly to see what it actually does. For questions of performance or codegen quality, I usually don't even bother looking at DMD output.
>
>
> T

I mean:

```
int example.foo(immutable(char)[]):
        cmp     rdi, 5
        jne     .LBB0_3
        mov     eax, 1764586337
        xor     eax, dword ptr [rsi]
        movzx   ecx, byte ptr [rsi + 4]
        xor     ecx, 115
        or      ecx, eax
        je      .LBB0_2
.LBB0_3:
        mov     eax, 100
        ret
.LBB0_2:
        mov     eax, 200
        ret
```

It's amazing.
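
One way to read the listing: the two immediates are simply the bytes of the string being compared. The sketch below unpacks them, assuming a little-endian target such as x86-64; the helper name `decodeImmediates` is hypothetical.

```d
import std.stdio : writeln;

// Hypothetical helper: rebuilds the text encoded by the two immediates
// in the listing above (1764586337 and 115).
string decodeImmediates(uint first4, ubyte last)
{
    char[] text;
    foreach (i; 0 .. 4)
    {
        // Little-endian: the lowest byte is the first character in memory.
        text ~= cast(char)((first4 >> (8 * i)) & 0xFF);
    }
    text ~= cast(char) last;
    return text.idup;
}

void main()
{
    writeln(decodeImmediates(1764586337, 115)); // as-is
}
```

So the function checks that the length is 5 (`cmp rdi, 5`), compares the five bytes against "as-is", and returns either 200 or 100. The lambda's arithmetic has been folded away entirely.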
June 06, 2022

On Monday, 6 June 2022 at 17:07:30 UTC, deadalnix wrote:
> On Saturday, 4 June 2022 at 21:17:56 UTC, SealabJaster wrote:
>> On Saturday, 4 June 2022 at 19:54:48 UTC, Dukc wrote:
>>> ```D
>>> auto number = 200;
>>> auto myVar = type == "as-is"?
>>>   number:
>>>   { number *= 2;
>>>     number += 2;
>>>     number /= 2;
>>>     number = number > 300 ? 200 : 100;
>>>     return number;
>>>   }();
>>> ```
>>
>> This next question comes from a place of ignorance: What is the codegen like for this code? Would it allocate a closure on the GC before performing the execution, or are the compilers smart enough to inline the entire thing?
>
> LDC is able to inline the lambda and then optimize away the allocation. DMD is not.
>
> https://godbolt.org/z/5vMcz4s14

Ah nice, then yeah I don't see any real downsides to just using lambdas like that :)

June 06, 2022
On Monday, 6 June 2022 at 20:31:06 UTC, deadalnix wrote:
> On Monday, 6 June 2022 at 17:43:47 UTC, H. S. Teoh wrote:
>> On Mon, Jun 06, 2022 at 05:07:30PM +0000, deadalnix via Digitalmars-d wrote:
>>> [...]
>>>
>>> LDC is able to inline the lambda and then optimize away the allocation. DMD is not.
>>> 
>>> https://godbolt.org/z/5vMcz4s14
>>
>> Yeah, when in doubt, trust LDC to do the "right thing". :-P  Well, that, and take a look at the generated assembly to see what it actually does. For questions of performance or codegen quality, I usually don't even bother looking at DMD output.
>>
>>
>> T
>
> I mean:
>
> ```
> int example.foo(immutable(char)[]):
>         cmp     rdi, 5
>         jne     .LBB0_3
>         mov     eax, 1764586337
>         xor     eax, dword ptr [rsi]
>         movzx   ecx, byte ptr [rsi + 4]
>         xor     ecx, 115
>         or      ecx, eax
>         je      .LBB0_2
> .LBB0_3:
>         mov     eax, 100
>         ret
> .LBB0_2:
>         mov     eax, 200
>         ret
> ```
>
> It's amazing.

wow wtf :D
June 06, 2022
On 6/6/2022 10:07 AM, deadalnix wrote:
> LDC is able to inline the lambda and then optimize away the allocation. DMD is not.

DMD can inline it now:

https://issues.dlang.org/show_bug.cgi?id=23165
https://github.com/dlang/dmd/pull/14190

There is a persistent idea that there is something fundamentally wrong with DMD. There isn't. It's just that optimization involves an accumulation of a large number of special cases, and clang has a lot of people adding special cases.

DMD does have the major general-purpose optimization algorithms, the ones that do data flow analysis.
June 06, 2022
On 6/5/2022 4:02 AM, Nick Treleaven wrote:
> I meant noreturn expression, not nothrow!

Not no how, not no way!
June 07, 2022
On Tuesday, 7 June 2022 at 01:24:07 UTC, Walter Bright wrote:
>
> There is a persistent idea that there is something fundamentally wrong with DMD. There isn't. It's just that optimization involves an accumulation of a large number of special cases, and clang has a lot of people adding special cases.
>

I have only ever used DMD, never bothered using anything else and it has never hindered any of my work or mattered in the slightest.

Nothing I work with suffers from the difference in optimization as I don't have anything that's real-time sensitive.

As long as the work is done in a reasonable amount of time (that of course depends on what it is), I'm fine with it.

Typically the difference is in the milliseconds, which won't matter much for most enterprise work.
June 07, 2022

On Tuesday, 7 June 2022 at 04:53:36 UTC, bauss wrote:
> On Tuesday, 7 June 2022 at 01:24:07 UTC, Walter Bright wrote:
>> There is a persistent idea that there is something fundamentally wrong with DMD. There isn't. It's just that optimization involves an accumulation of a large number of special cases, and clang has a lot of people adding special cases.
>
> I have only ever used DMD, never bothered using anything else and it has never hindered any of my work or mattered in the slightest.
>
> Nothing I work with suffers from the difference in optimization as I don't have anything that's real-time sensitive.
>
> As long as the work is done in a reasonable amount of time (that of course depends on what it is), I'm fine with it.
>
> Typically the difference is in the milliseconds, which won't matter much for most enterprise work.

Mileage varies of course. If watts and/or throughput is important, and you're working on something that admits data parallelism, ldc and gdc can help. Here's an example of a 32X speedup (16X if you cripple the target): https://godbolt.org/z/bT7qKnfMP
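
To make "admits data parallelism" concrete, here is a small sketch of the kind of kernel meant. It is a hypothetical illustration (the function name `scaleAndAdd` is made up) and not the code behind the link above.

```d
import std.stdio : writeln;

// Hypothetical data-parallel kernel; not the code behind the link.
// Each output element depends only on the matching input elements,
// so the iterations are independent of one another.
float[] scaleAndAdd(float a, const(float)[] x, const(float)[] y)
{
    assert(x.length == y.length);
    auto result = new float[x.length];
    result[] = a * x[] + y[]; // D array operation over whole slices
    return result;
}

void main()
{
    writeln(scaleAndAdd(2.0f, [1.0f, 2.0f, 3.0f], [10.0f, 20.0f, 30.0f]));
}
```

Independent iterations like these are what a vectorizing backend can map onto SIMD instructions. How much that gains depends on the compiler, the flags, and the target CPU.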

June 06, 2022
On 6/6/2022 9:53 PM, bauss wrote:
> Typically the difference is in the milliseconds, which won't matter much for most enterprise work.

Thanks for the kind words! A few years ago, one person posted here that clang invented data flow analysis and that I should add it to dmd. I replied with a link to the data flow analysis code in dmd, which was written in 1985 or so :-)

Math is math, and DFA hasn't changed much in the last 40 years.

One amusing thing about it is when my compiler acquired DFA, there was a C compiler benchmark roundup in one of the computer magazines. The results were thrown out for my compiler, as the journalist concluded it had a bug in it where it deleted the test suite code. He wrote lots of bad things as a result.

What actually happened was that the DFA deduced that the magazine's benchmark code did not use the result, so it threw the computation away.

The result of that article, though, destroyed my sales for a quarter or so. Then other compilers added DFA, and the benchmark code got fixed for the next compiler roundup.

Sigh.
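
The benchmark pitfall described above is easy to reproduce in miniature. The following is a hypothetical sketch (the kernel `sumOfSquares` is invented for illustration, not taken from the magazine's suite) showing the difference between ignoring a result and using it.

```d
import std.stdio : writeln;

// Hypothetical benchmark kernel: sums the squares of 0 .. n-1.
int sumOfSquares(int n)
{
    int total = 0;
    foreach (i; 0 .. n)
        total += i * i;
    return total;
}

void main()
{
    // Result ignored: an optimizer that can prove the call has no other
    // effects is free to drop the work, so timing this line may measure nothing.
    sumOfSquares(1000);

    // Result used: the value is observable, so it has to be produced.
    writeln(sumOfSquares(1000));
}
```

Whether the first call is actually removed depends on the compiler and optimization level. The point is only that a benchmark has to consume its results if it wants the work to be measured.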