June 04, 2009
bearophile wrote:
> Yes, for such tiny benchmarks I have several times seen 10-12 times higher allocation performance in Java compared to D1-DMD. But real programs don't use all their time allocating and freeing memory...
> 
> Bye,
> bearophile

For the compiler I'm working on now (in D), I wanted to check the effects of allocation on performance. Using a placement new, the time for lex/parse/semantic/type-infer/codegen (on a really huge in-memory file) went from ~6 seconds to ~4 seconds (I don't have the exact timings, and can't repro right now since I'm redoing inference). So I'd say that even in real-world applications, these things have an effect.

Of course, this only applies to programs which allocate and throw away a lot of small objects. This is a style encouraged by Java and C#'s programming models, much less so by, say, C++'s.
June 04, 2009
Frits van Bommel wrote:
>> The new JavaVM with the option I have shown is clearly able to do such things.
>> Can't you take a look at the source code of the JavaVM? :-)
>> There's a huge amount of NIH in the open source :-)
> 
> I suspect the Java VM uses a different internal representation of the code than LLVM does...

HotSpot uses 3-argument SSA for IR, AFAIK... I think LLVM is also SSA-based, right? But the Java source is _quite_ complex.
June 04, 2009
Robert Fraser:
> Of course, this only applies to programs which allocate and throw away a lot of small objects. This is a style encouraged by Java and C#'s programming models, much less so by, say, C++'s.

Right, there are many potential new D programmers coming from Java who may want to use that style, which relies heavily on an efficient GC. But you are missing another important style of programming that allocates & frees tons of objects: functional-style programming, where immutable data is the norm. If the D2 language wants to appeal to functional programmers, it will have to manage such immutables more efficiently.

Bye,
bearophile
June 10, 2009
LDC is a moving target because it's actively developed, and generally things improve with time. This is a recent change by the quite active Frits van Bommel: http://www.dsource.org/projects/ldc/changeset/1486%3A9ed0695cb93c

This is a cleaned-up version of the code discussed in this thread:

import tango.stdc.stdio: printf;
import Integer = tango.text.convert.Integer;

class AllocationItem {
    int value;
    this(int v) { this.value = v; }
}

int foo(int iters) {
    int sum = 0;
    for (int i = 0; i < iters; ++i) {
        auto item = new AllocationItem(i);
        sum += item.value;
    }
    return sum;
}

void main(char[][] args) {
    int iters = Integer.parse(args[1]);
    printf("%d\n", foo(iters));
}


The asm generated by the latest LDC (based on DMD v1.045 and llvm 2.6svn, Tue Jun  9 22:34:25 2009); this is just the important part of the asm:

/*
foo:
	testl	%eax, %eax
	jle	.LBB2_4
	movl	%eax, %ecx
	xorl	%eax, %eax
	.align	16
.LBB2_2:
	incl	%eax
	cmpl	%ecx, %eax
	jne	.LBB2_2
	leal	-2(%ecx), %eax
	leal	-1(%ecx), %edx
	mull	%edx
	shldl	$31, %eax, %edx
	leal	-1(%edx,%ecx), %eax
	ret
.LBB2_4:
	xorl	%eax, %eax
	ret
*/
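
A note on what the asm above appears to do: the allocation is gone entirely, an empty counting loop remains, and the sum is then computed in closed form as (n-1)*(n-2)/2 + (n-1), which equals n*(n-1)/2, truncated to 32 bits. The following is only a sketch of that arithmetic, written in Java because its int addition wraps on overflow the same way; the class and method names are made up for illustration and are not part of the benchmark.

```java
class ClosedFormSum {
    // Straightforward loop mirroring foo(): 32-bit int addition wraps on overflow.
    static int loopSum(int iters) {
        int sum = 0;
        for (int i = 0; i < iters; ++i) {
            sum += i;
        }
        return sum;
    }

    // The arithmetic done after the loop in the asm:
    // 64-bit product (mull), shift right by one keeping 32 bits (shldl $31),
    // then add n - 1 (leal).
    static int closedForm(int n) {
        if (n <= 0) {
            return 0; // matches the jle branch that returns 0
        }
        long product = (long) (n - 1) * (n - 2);
        return (int) (product >>> 1) + (n - 1);
    }

    public static void main(String[] args) {
        int n = 250000000;
        System.out.println(loopSum(n));
        System.out.println(closedForm(n));
    }
}
```

For n = 250000000 both methods give -1782069568, the same value the benchmark prints.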


This is the same code with "scope" added:

import tango.stdc.stdio: printf;
import Integer = tango.text.convert.Integer;

class AllocationItem {
    int value;
    this(int v) { this.value = v; }
}

int foo(int iters) {
    int sum = 0;
    for (int i = 0; i < iters; ++i) {
        scope auto item = new AllocationItem(i);
        sum += item.value;
    }
    return sum;
}

void main(char[][] args) {
    int iters = Integer.parse(args[1]);
    printf("%d\n", foo(iters));
}

Its asm:

/*
foo:
	pushl	%ebx
	pushl	%edi
	pushl	%esi
	subl	$24, %esp
	testl	%eax, %eax
	jle	.LBB2_4
	movl	%eax, %esi
	xorl	%edi, %edi
	leal	8(%esp), %ebx
	.align	16
.LBB2_2:
	movl	$_D11gc_test2b_d14AllocationItem6__vtblZ, 8(%esp)
	movl	$0, 12(%esp)
	movl	%edi, 16(%esp)
	movl	%ebx, (%esp)
	call	_d_callfinalizer
	incl	%edi
	cmpl	%esi, %edi
	jne	.LBB2_2
	leal	-2(%esi), %eax
	leal	-1(%esi), %ecx
	mull	%ecx
	shldl	$31, %eax, %edx
	leal	-1(%edx,%esi), %eax
	jmp	.LBB2_5
.LBB2_4:
	xorl	%eax, %eax
.LBB2_5:
	addl	$24, %esp
	popl	%esi
	popl	%edi
	popl	%ebx
	ret
*/


The running time:
...$ elaps ./gc_test1 250000000
-1782069568
real	0m0.170s
user	0m0.160s
sys	0m0.010s

The version with "scope":
...$ elaps ./gc_test2 250000000
-1782069568
real	0m6.430s
user	0m6.430s
sys	0m0.000s

(Later I may try again with a less simple and more realistic benchmark, because this one is too much of a toy to be interesting.)

Bye,
bearophile
July 25, 2009
Diwaker Gupta Wrote:

> I've just started to play around with D, and I'm hoping someone can clarify this. I wrote a very simple program that just allocates lots of objects, in order to benchmark the garbage collector in D. For comparison, I wrote the programs in C++, Java and D:
> C++: http://gist.github.com/122708
> Java: http://gist.github.com/122709
> D: http://gist.github.com/121790
> 
> With an iteration count of 99999999, I get the following numbers:
> JAVA:
> 0:01.60 elapsed, 1.25 user, 0.28 system
> C++:
> 0:04.99 elapsed, 4.97 user, 0.00 system
> D:
> 0:25.28 elapsed, 25.22 user, 0.00 system
> 
> As you can see, D is abysmally slow compared to C++ and Java. This is using the GNU gdc compiler. I'm hoping the community can give me some insight on what is going on.
> 
> Thanks,
> Diwaker

Hi.
Inspired by this idea, I changed the rules somewhat to make the task more complicated. Here they are:
1. every "AllocationItem" has references to three other items.
2. generate "n_items" items in a static array "items" of type "AllocationItem", with the "value" field set to "0"
3. make random connections between all these items through their reference fields
4. iterate "n_iters" times with the following algorithm
 a) create a new "AllocationItem" with "value" set to "1"
 b) replace random item from the array "items" with this new item
 c) add connections to this item
 d) remove (variant 1) or change (variant 2) three random connections
 e) now if some object isn't referenced by others it should be removed (GC collected).
5. count the old items (with "value" set to 0) and the new ones
Here are my programs in D and Java. There is no C++ variant, sorry.
In D I used Bill Baxter's weak reference module, modified for D2: http://www.dsource.org/projects/scrapple/browser/trunk/weakref
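
The programs themselves are not reproduced in this post. As a rough sketch of the rules listed above (not the poster's actual code), something like the following Java could be used. Several things here are assumptions: the names GraphChurn and run, the seeded RNG, the reading of step 4c as "three outgoing and three incoming links", and above all the counting step, which walks the graph from the "items" array instead of using weak references plus the GC, so that the result is deterministic.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Random;
import java.util.Set;

class GraphChurn {
    static final class AllocationItem {
        final int value;                                   // 0 = old item, 1 = new item
        final AllocationItem[] refs = new AllocationItem[3];
        AllocationItem(int value) { this.value = value; }
    }

    // Returns {oldCount, newCount} for the items still reachable from the array.
    // removeVariant = true nulls links (variant 1), false rewires them (variant 2).
    static int[] run(int nItems, int nIters, long seed, boolean removeVariant) {
        Random rng = new Random(seed);

        // Steps 2 and 3: old items with random connections.
        AllocationItem[] items = new AllocationItem[nItems];
        for (int i = 0; i < nItems; i++) {
            items[i] = new AllocationItem(0);
        }
        for (AllocationItem item : items) {
            for (int k = 0; k < 3; k++) {
                item.refs[k] = items[rng.nextInt(nItems)];
            }
        }

        // Step 4: churn the graph.
        for (int iter = 0; iter < nIters; iter++) {
            AllocationItem fresh = new AllocationItem(1);          // a)
            items[rng.nextInt(nItems)] = fresh;                    // b)
            for (int k = 0; k < 3; k++) {                          // c) assumed meaning
                fresh.refs[k] = items[rng.nextInt(nItems)];
                items[rng.nextInt(nItems)].refs[rng.nextInt(3)] = fresh;
            }
            for (int k = 0; k < 3; k++) {                          // d)
                AllocationItem victim = items[rng.nextInt(nItems)];
                victim.refs[rng.nextInt(3)] =
                    removeVariant ? null : items[rng.nextInt(nItems)];
            }
            // e) anything no longer reachable is left for the GC.
        }

        // Step 5: count reachable items by explicit traversal (identity-based set).
        Set<AllocationItem> seen =
            Collections.newSetFromMap(new IdentityHashMap<AllocationItem, Boolean>());
        ArrayDeque<AllocationItem> work = new ArrayDeque<AllocationItem>();
        for (AllocationItem item : items) {
            if (seen.add(item)) {
                work.push(item);
            }
        }
        int[] counts = new int[2];
        while (!work.isEmpty()) {
            AllocationItem item = work.pop();
            counts[item.value]++;
            for (AllocationItem r : item.refs) {
                if (r != null && seen.add(r)) {
                    work.push(r);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] c = run(1000, 100000, 42L, true);
        System.out.println("old=" + c[0] + " new=" + c[1]);
    }
}
```

With a fixed seed, this sketch gives the same counts on every run. A version that relies on weak references and the collector may not, since when (and whether) an unreferenced object is actually collected is up to the runtime.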

Now the results:
1) It behaves strangely: from time to time (approx. 1 run in 10) it gives different results in D2, and this is not caused by the RNG.
2) The Java version is awfully slow (maybe because it's my second Java app; the first was 8 years ago). Now I hate Java even more.
3) Java and D give different results. It's all strange. Maybe I did something wrong.
