October 25, 2008
Andrei Alexandrescu wrote:

> Jason House wrote:
>> I ported some monte carlo simulation code from Java to D2, and performance is horrible.
>> 
>> 34% of the execution time is used by std.random.uniform. To my great surprise, 25% of the execution time is memory allocation (and collection) from that random call. The only candidate source I see is a call to enforce with lazy arguments. The memory allocation occurs at the start of the UniformDistribution call. I assume this is dynamic closure kicking in.
>> 
>> Can anyone verify that this is the case?
>> 
>> 600000 memory allocations per second really kills performance!
> 
> std.random does not use dynamic memory allocation.

This is exactly why so many have complained about the dynamic closure implementation. You did not intend to use dynamic memory allocation, but std.random definitely does. A program with nothing but a loop that calls uniform shows it plain as day in the profiler (I'm using callgrind).
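
A minimal sketch of the kind of loop-only program described above, suitable for running under a profiler such as callgrind. The helper name sampleSum and the range [0, 100) are assumptions made for illustration, and the two-argument uniform(a, b) call follows current Phobos, which may differ from the 2008 signature.

```d
import std.random : uniform;
import std.stdio : writeln;

// Hypothetical helper: draws `n` values in [0, 100) and sums them so the
// calls to uniform cannot be optimized away.
long sampleSum(int n)
{
    long sum = 0;
    foreach (i; 0 .. n)
        sum += uniform(0, 100);
    return sum;
}

void main()
{
    // Nothing but a loop over uniform; any allocation seen in the
    // profile comes from inside the library call.
    writeln(sampleSum(1_000_000));
}
```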

> Walter is almost done implementing static closures.

Ooh...  Can you elaborate on that?
October 25, 2008
Bill Baxter wrote:
> On Sat, Oct 25, 2008 at 9:59 AM, Andrei Alexandrescu
> <SeeWebsiteForEmail@erdani.org> wrote:
>> Jason House wrote:
>>> I ported some monte carlo simulation code from Java to D2, and
>>> performance is horrible.
>>>
>>> 34% of the execution time is used by std.random.uniform. To my great
>>> surprise, 25% of the execution time is memory allocation (and
>>> collection) from that random call. The only candidate source I see is
>>> a call to enforce with lazy arguments. The memory allocation occurs at
>>> the start of the UniformDistribution call. I assume this is dynamic
>>> closure kicking in.
>>>
>>> Can anyone verify that this is the case?
>>>
>>> 600000 memory allocations per second really kills performance!
>> std.random does not use dynamic memory allocation.
> 
> Well the suggestion is that it may be using dynamic memory allocation
> without intending to because of the dynamic closures.  Are you saying
> that is definitely not the case?

I don't think there's any delegate in use in std.random.

>> Walter is almost done implementing static closures.
> 
> Excellent!  So what strategy is being used?  I hope it's static by
> default, dynamic on request, but your wording suggests otherwise.

I forgot.


Andrei
October 25, 2008
Andrei Alexandrescu wrote:

> I don't think there's any delegate in use in std.random.

Lazy arguments are delegates, and enforce uses lazy arguments.
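
To illustrate the point, here is a small sketch of how a lazy parameter behaves like a hidden delegate. The names expensive, pickLazy and pickDelegate are made up for this example, and the explicit-delegate version is only an approximation of what the compiler generates, not its actual output.

```d
import std.stdio : writeln;

int evaluations = 0;

// Hypothetical helper: counts how often the argument expression runs.
int expensive(int x)
{
    ++evaluations;
    return x * 2;
}

// `lazy int value` is passed as a hidden delegate; reading `value`
// invokes it.
int pickLazy(bool use, lazy int value)
{
    return use ? value : -1;
}

// Approximate hand-written equivalent with an explicit delegate.
int pickDelegate(bool use, int delegate() value)
{
    return use ? value() : -1;
}

void main()
{
    int n = 21;
    assert(pickLazy(false, expensive(n)) == -1);
    assert(evaluations == 0); // the argument was never evaluated
    assert(pickLazy(true, expensive(n)) == 42);
    assert(evaluations == 1);
    // The delegate literal captures `n` from the enclosing stack frame,
    // which is where a closure allocation can come from.
    assert(pickDelegate(true, delegate int() { return expensive(n); }) == 42);
    assert(evaluations == 2);
    writeln("argument expression ran ", evaluations, " times");
}
```

Because the argument expression refers to the caller's locals, the compiler has to keep that frame reachable for the delegate, either on the stack (D1) or on the heap (D2 at the time of this thread).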

October 25, 2008
Jason House wrote:
> Andrei Alexandrescu wrote:
> 
>> I don't think there's any delegate in use in std.random.
> 
> Lazy arguments are delegates, and enforce uses lazy arguments

Yikes, I see.

Andrei
October 25, 2008
Jason House:
> The following spends 90% of its time in _d_alloc_memory
> void bar(lazy int i){}
> void foo(int i){ bar(i); }
> void main(){ foreach(int i; 1..1000000) foo(i); }
> Compiling with -O -release reduces it to 88% :)

I see. So I presume it becomes quite difficult for D2 to compute up to the 25th term of this sequence (the D code is in the middle of the page; it takes just a few seconds to run on D1):
http://en.wikipedia.org/wiki/Man_or_boy_test

What syntax can we use to avoid heap allocation? A few ideas:

void bar(lazy int i){} // like D1
void bar(scope lazy int i){} // like D1
void bar(closure int i){} // like current D2

Bye,
bearophile
October 25, 2008
bearophile Wrote:

> Jason House:
> > The following spends 90% of its time in _d_alloc_memory
> > void bar(lazy int i){}
> > void foo(int i){ bar(i); }
> > void main(){ foreach(int i; 1..1000000) foo(i); }
> > Compiling with -O -release reduces it to 88% :)
> 
> I see. So I presume it becomes quite difficult for D2 to compute up to the 25th term of this sequence (the D code is in the middle of the page) (it takes just few seconds to run on D1):
> http://en.wikipedia.org/wiki/Man_or_boy_test
> 
> What syntax can we use to avoid heap allocation? Few ideas:
> 
> void bar(lazy int i){} // like D1
> void bar(scope lazy int i){} // like D1
> void bar(closure int i){} // like current D2
> 
> Bye,
> bearophile

I would assume a fix would be to add scope to input delegates and to require some kind of declaration on the caller's side when the compiler can't prove safety. It's best for ambiguous cases to be a warning (error). It also makes the code easier for readers to follow.
October 25, 2008
On Sat, Oct 25, 2008 at 5:24 PM, Jason House <jason.james.house@gmail.com> wrote:
> bearophile Wrote:
>
>> Jason House:
>> > The following spends 90% of its time in _d_alloc_memory
>> > void bar(lazy int i){}
>> > void foo(int i){ bar(i); }
>> > void main(){ foreach(int i; 1..1000000) foo(i); }
>> > Compiling with -O -release reduces it to 88% :)
>>
>> I see. So I presume it becomes quite difficult for D2 to compute up to the 25th term of this sequence (the D code is in the middle of the page) (it takes just few seconds to run on D1):
>> http://en.wikipedia.org/wiki/Man_or_boy_test
>>
>> What syntax can we use to avoid heap allocation? Few ideas:
>>
>> void bar(lazy int i){} // like D1
>> void bar(scope lazy int i){} // like D1
>> void bar(closure int i){} // like current D2

This makes no sense because the writer of bar has no idea whether the caller will need a heap allocation or not.

> I would assume a fix would be to add scope to input delegates and to require some kind of declaration on the caller's side when the compiler can't prove safety. It's best for ambiguous cases to be a warning (error). It also makes the code easier for readers to follow.

I think for a language like D, hidden, hard-to-find memory allocations like the one Andrei didn't know he was doing should be eliminated. By that I mean stack allocation (D1 behavior) should be the default. Then for places where you really want a closure, some other syntax should be chosen. The other reason I say that is that so far in D I've only very seldom really wanted an allocated closure. So I think I would have to use the funky no-closure-please syntax way more than I would have to use a make-me-a-closure-please syntax.

But apparently nobody who knows anything about what's actually going to happen is involved in this discussion, so I think I'll just pipe down for now.

--bb
October 25, 2008
Bill Baxter Wrote:

> On Sat, Oct 25, 2008 at 5:24 PM, Jason House <jason.james.house@gmail.com> wrote:
> > bearophile Wrote:
> >
> >> Jason House:
> >> > The following spends 90% of its time in _d_alloc_memory
> >> > void bar(lazy int i){}
> >> > void foo(int i){ bar(i); }
> >> > void main(){ foreach(int i; 1..1000000) foo(i); }
> >> > Compiling with -O -release reduces it to 88% :)
> >>
> >> I see. So I presume it becomes quite difficult for D2 to compute up to the 25th term of this sequence (the D code is in the middle of the page) (it takes just few seconds to run on D1):
> >> http://en.wikipedia.org/wiki/Man_or_boy_test
> >>
> >> What syntax can we use to avoid heap allocation? Few ideas:
> >>
> >> void bar(lazy int i){} // like D1
> >> void bar(scope lazy int i){} // like D1
> >> void bar(closure int i){} // like current D2
> 
> This makes no sense because the writer of bar has no idea whether the caller will need a heap allocation or not.
> 
> > I would assume a fix would be to add scope to input delegates and to require some kind of declaration on the caller's side when the compiler can't prove safety. It's best for ambiguous cases to be a warning (error). It also makes the code easier for readers to follow.
> 
> I think for a language like D,  hidden, hard to find memory allocations like the one Andrei didn't know he was doing should be eliminated.  By that I mean stack allocation (D1 behavior) should be the default.  Then for places where you really want a closure, some other syntax should be chosen.  The other reason I say that is that so far in D I've only very seldom really wanted an allocated closure.  So I think I will have to use the funky no-closure-please syntax way more than I would have to use a make-me-a-closure-please syntax.
> 
> But apparently nobody who knows anything about what's actually going to happen is involved in this discussion, so I think I'll just pipe down for now.
> 
> --bb

While I agree that should be the default, I've already seen plenty of D1 code that incorrectly used stack-based closures. It really depends on your usage patterns. I do a lot of inter-thread communication in D1.
October 25, 2008
Bill Baxter wrote:

> On Sat, Oct 25, 2008 at 5:24 PM, Jason House <jason.james.house@gmail.com> wrote:
>> bearophile Wrote:
>>
>>> Jason House:
>>> > The following spends 90% of its time in _d_alloc_memory
>>> > void bar(lazy int i){}
>>> > void foo(int i){ bar(i); }
>>> > void main(){ foreach(int i; 1..1000000) foo(i); }
>>> > Compiling with -O -release reduces it to 88% :)
>>>
>>> I see. So I presume it becomes quite difficult for D2 to compute up to the 25th term of this sequence (the D code is in the middle of the page) (it takes just few seconds to run on D1): http://en.wikipedia.org/wiki/Man_or_boy_test
>>>
>>> What syntax can we use to avoid heap allocation? Few ideas:
>>>
>>> void bar(lazy int i){} // like D1
>>> void bar(scope lazy int i){} // like D1
>>> void bar(closure int i){} // like current D2
> 
> This makes no sense because the writer of bar has no idea whether the caller will need a heap allocation or not.
> 
>> I would assume a fix would be to add scope to input delegates and to require some kind of declaration on the caller's side when the compiler can't prove safety. It's best for ambiguous cases to be a warning (error). It also makes the code easier for readers to follow.
> 
> I think for a language like D,  hidden, hard to find memory allocations like the one Andrei didn't know he was doing should be eliminated.  By that I mean stack allocation (D1 behavior) should be the default.  Then for places where you really want a closure, some other syntax should be chosen.  The other reason I say that is that so far in D I've only very seldom really wanted an allocated closure.  So I think I will have to use the funky no-closure-please syntax way more than I would have to use a make-me-a-closure-please syntax.

I agree that D1 behaviour should be the default, since otherwise it'll be yet another breaking change. However, I do understand that the D1 behaviour is the unsafe one, and as such the heap allocated version has merit as the default.

-- 
Lars Ivar Igesund
blog at http://larsivi.net
DSource, #d.tango & #D: larsivi
Dancing the Tango
October 25, 2008
On Sat, 25 Oct 2008 18:36:27 +0400, Lars Ivar Igesund <larsivar@igesund.net> wrote:

> Bill Baxter wrote:
>
>> On Sat, Oct 25, 2008 at 5:24 PM, Jason House
>> <jason.james.house@gmail.com> wrote:
>>> bearophile Wrote:
>>>
>>>> Jason House:
>>>> > The following spends 90% of its time in _d_alloc_memory
>>>> > void bar(lazy int i){}
>>>> > void foo(int i){ bar(i); }
>>>> > void main(){ foreach(int i; 1..1000000) foo(i); }
>>>> > Compiling with -O -release reduces it to 88% :)
>>>>
>>>> I see. So I presume it becomes quite difficult for D2 to compute up to
>>>> the 25th term of this sequence (the D code is in the middle of the page)
>>>> (it takes just few seconds to run on D1):
>>>> http://en.wikipedia.org/wiki/Man_or_boy_test
>>>>
>>>> What syntax can we use to avoid heap allocation? Few ideas:
>>>>
>>>> void bar(lazy int i){} // like D1
>>>> void bar(scope lazy int i){} // like D1
>>>> void bar(closure int i){} // like current D2
>>
>> This makes no sense because the writer of bar has no idea whether the
>> caller will need a heap allocation or not.
>>
>>> I would assume a fix would be to add scope to input delegates and to
>>> require some kind of declaration on the caller's side when the compiler
>>> can't prove safety. It's best for ambiguous cases to be a warning
>>> (error). It also makes the code easier for readers to follow.
>>
>> I think for a language like D,  hidden, hard to find memory
>> allocations like the one Andrei didn't know he was doing should be
>> eliminated.  By that I mean stack allocation (D1 behavior) should be
>> the default.  Then for places where you really want a closure, some
>> other syntax should be chosen.  The other reason I say that is that so
>> far in D I've only very seldom really wanted an allocated closure.  So
>> I think I will have to use the funky no-closure-please syntax way more
>> than I would have to use a make-me-a-closure-please syntax.
>
> I agree that D1 behaviour should be the default, since otherwise it'll be
> yet another breaking change. However, I do understand that the D1 behaviour
> is the unsafe one, and as such the heap allocated version has merit as the
> default.
>

I believe the default should be the one that is most frequently used, even if it is less safe. Otherwise you may end up with a lot of code duplication.

I also think that scope and heap-allocated delegates should have different types, so that no implicit cast from a scope delegate to a heap one is possible. In this case a callee that receives the delegate can demand that it be heap-allocated (because it stores it, for example).
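
A rough sketch of the distinction, written against later D2 compilers rather than the 2008 one. There, `scope` on a delegate parameter is a storage class rather than a separate type, so this only approximates the proposal above. The names callNow, sumTo and makeCounter are hypothetical, and @nogc is used here only as a compile-time check that no closure is allocated.

```d
// Callee that promises not to keep the delegate: callers may pass a
// delegate whose context lives on their stack.
int callNow(scope int delegate() @nogc dg) @nogc
{
    return dg();
}

// Compiles as @nogc because the delegate literal is only ever handed
// to a `scope` parameter, so no heap closure is needed for `i`.
int sumTo(int n) @nogc
{
    int total = 0;
    foreach (i; 1 .. n + 1)
        total += callNow(delegate int() { return i; });
    return total;
}

// Returning the delegate lets it outlive this call, so `count` must
// live in a heap-allocated closure.
int delegate() makeCounter()
{
    int count = 0;
    return delegate int() { return ++count; };
}

void main()
{
    assert(sumTo(4) == 10); // 1 + 2 + 3 + 4
    auto next = makeCounter();
    assert(next() == 1);
    assert(next() == 2);
}
```

The trade-off discussed in the thread shows up in the signatures: the callee states whether it stores the delegate, and the caller pays for a heap closure only when it does.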