June 04, 2011
Andrei:

> Far as I can tell D comes in the second place after C++ at run time. With optimizations and all it could get significantly closer.

First version, using only classes and cleaned up a bit: http://codepad.org/DggCx26d

Second version, with all structs: http://codepad.org/etsLsZV5

Tomorrow I'll de-optimize it a bit by replacing some structs with classes. Then I'll create one or two more optimized versions (one using a memory pool for the nodes, and one applying some of the C++ improvement ideas from the original paper).

The number of instances allocated:
Class instances:
SimpleLoop_counter            3_936_102
LoopStructureGraph_counter       15_051
UnionFindNode_counter        13_017_663
HavlakLoopFinder_counter         15_051
BasicBlockEdge_counter          378_036
BasicBlock_counter              252_013
MaoCFG_counter                        1

Allocating UnionFindNode from a pool will probably give some gain, since it accounts for most of the allocations.
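
A minimal sketch of what such a pool could look like, assuming a simplified UnionFindNode. The fields shown, the NodePool name and the chunk size are illustrative assumptions, not taken from the linked code. The idea is to make one GC allocation per chunk of nodes rather than one per node:

```d
import std.stdio;

// Hypothetical, simplified stand-in for the benchmark's UnionFindNode;
// the real type carries more fields.
struct UnionFindNode {
    UnionFindNode* parent;
    int dfsNumber;
}

// Minimal chunked pool: one GC allocation per chunk instead of one per node.
// Nodes are never freed individually; they live as long as the pool does.
struct NodePool {
    enum size_t chunkSize = 4096; // assumption: tune for the workload
    UnionFindNode[][] chunks;
    size_t used = chunkSize;      // "full" sentinel forces the first chunk

    UnionFindNode* alloc() {
        if (used == chunkSize) {
            // Appending may move the outer array, but the chunks themselves
            // stay put, so pointers handed out earlier remain valid.
            chunks ~= new UnionFindNode[](chunkSize);
            used = 0;
        }
        return &chunks[$ - 1][used++];
    }
}

void main() {
    NodePool pool;
    UnionFindNode* root = pool.alloc();
    UnionFindNode* child = pool.alloc();
    root.dfsNumber = 1;
    child.dfsNumber = 2;
    child.parent = root;
    writeln("chunks allocated: ", pool.chunks.length);
}
```

With 13 million nodes, a chunk size like the one above would turn those allocations into a few thousand, at the cost of keeping every node alive until the pool itself is dropped. Whether that helps in practice needs measuring.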

Later,
bearophile
June 04, 2011
On 6/4/11, bearophile <bearophileHUGS@lycos.com> wrote:
> Second version, with all structs: http://codepad.org/etsLsZV5

38 secs. That's 10 secs faster than last time.
June 04, 2011
Gentoo/Linux [gcc version 4.4.5, DMD 2.052, latest GDC with GCC 4.4.5, and latest LDC2]

g++ -O3
[VIRT: 185MB,  RES: 174MB]
real    0m28.407s
user    0m28.330s
sys     0m0.070s

DMD -O -release
[VIRT: 94MB,  RES: 92MB]
real    0m43.232s
user    0m42.980s
sys     0m0.070s

GDC -O3
[VIRT: 306MB,  RES: 295MB]
real    1m10.788s
user    1m10.570s
sys     0m0.190s

LDC2
segmentation fault
June 04, 2011
On 6/4/2011 12:01 AM, Caligo wrote:
> Gentoo/Linux [gcc version 4.4.5, DMD 2.052, latest GDC with GCC 4.4.5,
> and latest LDC2]
>
> g++ -O3
> [VIRT: 185MB,  RES: 174MB]
> real    0m28.407s
> user    0m28.330s
> sys     0m0.070s
>
> DMD -O -release
> [VIRT: 94MB,  RES: 92MB]
> real    0m43.232s
> user    0m42.980s
> sys     0m0.070s
>
> GDC -O3
> [VIRT: 306MB,  RES: 295MB]
> real    1m10.788s
> user    1m10.570s
> sys     0m0.190s
>
> LDC2
> segmentation fault

Why not -inline on dmd?
June 04, 2011
On 6/4/2011 12:01 AM, Caligo wrote:
> Gentoo/Linux [gcc version 4.4.5, DMD 2.052, latest GDC with GCC 4.4.5,
> and latest LDC2]
>
> g++ -O3
> [VIRT: 185MB,  RES: 174MB]
> real    0m28.407s
> user    0m28.330s
> sys     0m0.070s
>
> DMD -O -release
> [VIRT: 94MB,  RES: 92MB]
> real    0m43.232s
> user    0m42.980s
> sys     0m0.070s
>
> GDC -O3
> [VIRT: 306MB,  RES: 295MB]
> real    1m10.788s
> user    1m10.570s
> sys     0m0.190s
>
> LDC2
> segmentation fault

Oh, also, you may want to retry the benchmark with 2.053.  It looks rather allocation-heavy, and I substantially improved the GC performance for 2.053.
June 04, 2011
On Fri, Jun 3, 2011 at 11:16 PM, dsimcha <dsimcha@yahoo.com> wrote:
> On 6/4/2011 12:01 AM, Caligo wrote:
>>
>> Gentoo/Linux [gcc version 4.4.5, DMD 2.052, latest GDC with GCC 4.4.5, and latest LDC2]
>>
>> g++ -O3
>> [VIRT: 185MB,  RES: 174MB]
>> real    0m28.407s
>> user    0m28.330s
>> sys     0m0.070s
>>
>> DMD -O -release
>> [VIRT: 94MB,  RES: 92MB]
>> real    0m43.232s
>> user    0m42.980s
>> sys     0m0.070s
>>
>> GDC -O3
>> [VIRT: 306MB,  RES: 295MB]
>> real    1m10.788s
>> user    1m10.570s
>> sys     0m0.190s
>>
>> LDC2
>> segmentation fault
>
> Why not -inline on dmd?
>

I don't like the '-inline' option, but here it is.  Besides, I usually use GDC or LDC2 and I was expecting them to outperform DMD because they usually do, but not in this case.


DMD-32bit  v2.052 -O -release -inline
[VIRT: 94MB,  RES: 92MB]
real    0m42.490s
user    0m42.480s
sys     0m0.000s

DMD-32bit  v2.053 -O -release -inline
[VIRT: 107MB, RES: 104MB]
real    0m34.011s
user    0m33.930s
sys     0m0.070s

DMD-64bit  v2.053 -O -release -inline
segmentation fault

DMD-64bit  v2.053 -O -release
[VIRT: 232MB, RES: 219MB]
real    0m44.715s
user    0m44.580s
sys     0m0.080s

P.S.
It's a 64-bit system.
June 04, 2011
And just to be fair to C++:

g++ -O2 -m32
[VIRT: 94MB,  RES: 92MB]
real    0m24.567s
user    0m24.500s
sys     0m0.060s
June 04, 2011
On 6/3/2011 1:24 PM, bearophile wrote:
> The Ada language has a syntax to write those names at the closing ends, and
> the Ada compiler enforces such names to be always coherent and correct. In
> C/C++/D unfortunately such names (written as comments) may go out of sync.

I never understood the point of such.

My editor has a single key "find matching { [ < ( ) > ] }" command, and so I never have a need for such ugly comments. In fact I usually delete such comments when I encounter them.
June 04, 2011
On 6/3/2011 4:09 PM, Adam D. Ruppe wrote:
> This is similar to other benchmarks I've run in the past. Standard
> D builds beat standard C++ builds, but gcc's optimizer takes the
> lead vs dmd's optimizer.

It would be nice to figure out what is different. Try using the coverage analyzer and profiler for starters!
June 04, 2011
Walter:

> It would be nice to figure out what is different. Try using the coverage analyzer and profiler for starters!

There are small differences and inefficiencies here and there, but in the second D version I think most of the performance gap relative to the C++ code is caused by the GC. I will do some tests.

Bye,
bearophile