Thread overview
Just playing with compiler explorer to see assembly line count.
Oct 03, 2017
SrMordred
Oct 03, 2017
rikki cattermole
Oct 03, 2017
SrMordred
Oct 03, 2017
Iain Buclaw
Oct 03, 2017
Daniel Kozak
Oct 03, 2017
SrMordred
October 03, 2017
//D compiled with gdc 5.2 -O3

auto test(int[] arr, int cmp)
{
    int[] r;
    foreach(v ; arr)
        if(v == cmp)r~=v;
    return r;
}
// 51 lines of assembly

import std.algorithm : filter;
import std.array : array;

auto test(int[] arr, int cmp)
{
  return arr.filter!((v)=>v==cmp).array;
}
//1450 lines... what?

OK, let me also look at C++:
//gcc 7.2 -O3

#include <vector>
using std::vector;

vector<int> test(vector<int>& arr, int cmp) {
    vector<int> r;
    for(auto v : arr)
        if(v == cmp)r.push_back(v);
    return r;
}
//152 lines. More than D :)

#include <algorithm>
#include <iterator>
#include <vector>
using std::vector;

vector<int> test(vector<int>& arr, int cmp) {
    vector<int> r;
    std::copy_if (arr.begin(), arr.end(), std::back_inserter(r),
     [cmp](int i){return i==cmp;} );
    return r;
}

//150 lines. That's what I expected earlier with D.

Hmm, let me be 'fair' and use std.container.array, just out of curiosity:

import std.container.array;

auto test(ref Array!int arr, int cmp)
{
    Array!int r;
    foreach(v ; arr)
        if(v == cmp)r.insert(v);
    return r;
}

//5542 lines... what??

Is anyone interested in discussing this?

Or in pointing out some grotesque mistake of mine?
October 03, 2017
Be warned, x86 CPUs today are not like they were 10 years ago. A good portion of a symbol could be full of NOPs and it could still end up being faster than one without them.
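
One way to make a raw line count a little less misleading is to count only real instructions. The following is a minimal sketch, assuming GNU-style assembly text as shown by Compiler Explorer; `countInstructions` is a hypothetical helper, not part of any library, and its filtering rules are a rough heuristic.

```d
import std.algorithm.searching : endsWith, startsWith;
import std.string : lineSplitter, strip;

// Hypothetical helper: counts lines that look like real instructions,
// skipping blanks, comments, directives, labels and NOP padding.
size_t countInstructions(string asmText)
{
    size_t count;
    foreach (line; asmText.lineSplitter)
    {
        auto s = line.strip;
        if (s.length == 0) continue;       // blank line
        if (s.startsWith("#")) continue;   // comment (assumes GNU as syntax on x86)
        if (s.startsWith(".")) continue;   // directive or local label (.p2align, .L3:)
        if (s.endsWith(":")) continue;     // ordinary label
        if (s.startsWith("nop")) continue; // padding (nop, nopw, nopl, ...)
        ++count;
    }
    return count;
}
```

Even a filtered count says nothing about how often each instruction executes, so it is only a rough indicator of code size.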

Next, compare primarily against ldc, not gdc. It's better maintained and, ugh, more in line with dmd (it's a bit of a mess, let's not go there). Of course, there's nothing wrong with doing both.

std.container.* is basically dead. We need to replace it. We are currently waiting on std.experimental.allocator before going much further (along with a lot of other no-GC stuff).

Compare (on https://d.godbolt.org/ with "ldc -O3" and "gdc -O3"):
---
auto test1(int[] arr, int cmp)
{
    int[] r;
    foreach(v ; arr)
      if(v == cmp)r~=v;
    return r;
}

import std.container.array;
auto test2(ref Array!int arr, int cmp)
{
    Array!int r;
    foreach(v ; arr)
      if(v == cmp)r.insert(v);
    return r;
}
---
October 03, 2017
On Tuesday, 3 October 2017 at 13:53:38 UTC, rikki cattermole wrote:
> Be warned, x86 CPUs today are not like they were 10 years ago. A good portion of a symbol could be full of NOPs and it could still end up being faster than one without them.
>
> Next, compare primarily against ldc, not gdc. It's better maintained and, ugh, more in line with dmd (it's a bit of a mess, let's not go there). Of course, there's nothing wrong with doing both.
>
> std.container.* is basically dead. We need to replace it. We are currently waiting on std.experimental.allocator before going much further (along with a lot of other no-GC stuff).
>
> Compare (on https://d.godbolt.org/ with "ldc -O3" and "gdc -O3"):
> ---
> auto test1(int[] arr, int cmp)
> {
>     int[] r;
>     foreach(v ; arr)
>       if(v == cmp)r~=v;
>     return r;
> }
>
> import std.container.array;
> auto test2(ref Array!int arr, int cmp)
> {
>     Array!int r;
>     foreach(v ; arr)
>       if(v == cmp)r.insert(v);
>     return r;
> }
> ---
With ldc the results are similar: 5k+ lines.
And I know, I'm not doing a performance comparison yet. But you know, less code is more cache-friendly (and sometimes performs better).

But my big surprise was with .filter.



October 03, 2017
On Tuesday, 3 October 2017 at 14:07:39 UTC, SrMordred wrote:
> With ldc the results are similar: 5k+ lines.
> And I know, I'm not doing a performance comparison yet. But you know, less code is more cache-friendly (and sometimes performs better).
>

Well, -O3 does not generate cache-friendly assembly anyway. For instance, it inlines functions regardless of cost.

I'd take the assembly count with a pinch of salt when you are using templated code.

Iain
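
Since line counts of templated code are an unreliable proxy, timing the variants directly is one alternative. This is only a sketch under stated assumptions: the names `viaForeach` and `viaFilter`, the input data and the iteration count are all made up for illustration, and `benchmark` from `std.datetime.stopwatch` requires a reasonably recent Phobos.

```d
import std.algorithm : filter, map;
import std.array : array;
import std.datetime.stopwatch : benchmark;
import std.range : iota;
import std.stdio : writefln;

// Same loop as in the first post.
int[] viaForeach(int[] arr, int cmp)
{
    int[] r;
    foreach (v; arr)
        if (v == cmp) r ~= v;
    return r;
}

// Same range pipeline as in the first post.
int[] viaFilter(int[] arr, int cmp)
{
    return arr.filter!(v => v == cmp).array;
}

void main()
{
    // Hypothetical input: one million ints cycling through 0 .. 9.
    int[] data = iota(1_000_000).map!(i => i % 10).array;
    int[] sink; // results are stored here so each call has a visible effect

    void runForeach() { sink = viaForeach(data, 3); }
    void runFilter()  { sink = viaFilter(data, 3); }

    // Run each variant 100 times and report the total time per variant.
    auto times = benchmark!(runForeach, runFilter)(100);
    writefln("foreach:       %s ms", times[0].total!"msecs");
    writefln(".filter.array: %s ms", times[1].total!"msecs");
}
```

The numbers depend heavily on compiler, flags and input, so they are worth collecting with the same settings used for the assembly comparison (for example ldc with -O3).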


October 03, 2017
It's not bad:

https://godbolt.org/g/bSfubs



October 03, 2017
On Tuesday, 3 October 2017 at 17:15:04 UTC, Daniel Kozak wrote:
> It's not bad:
>
> https://godbolt.org/g/bSfubs

That's cool, I had never used copy xD.
(But you returned the .copy range, not the 'r' array ;p)

//now with ldc 1.4 and -O3 -release -boundscheck=off

foreach       -> 99 lines
.filter.copy  -> 368 lines
.filter.array -> 1229 lines (1002 lines with -O1)
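
For reference, a `.filter.copy` variant that returns the collected array rather than the range returned by `copy` might look like the sketch below. It is not the code behind the godbolt link above, which is not reproduced in this thread; the function name `testFilterCopy` and the use of `appender` as the output range are assumptions.

```d
import std.algorithm : copy, filter;
import std.array : appender;

// Hypothetical name; filters arr and collects the matches through copy.
int[] testFilterCopy(int[] arr, int cmp)
{
    // copy hands back the output range it wrote into (here: the appender).
    auto sink = arr.filter!(v => v == cmp).copy(appender!(int[])());
    // .data exposes the elements collected so far as an int[].
    return sink.data;
}
```

Calling `testFilterCopy([1, 2, 3, 2], 2)` yields `[2, 2]`, the same result as the foreach version.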