August 04, 2013
On 8/4/2013 2:28 AM, Denis Shelomovskij wrote:
> 04.08.2013 1:55, Walter Bright wrote:
>> The execrable existing implementation was scrapped, and the new one uses
>> Windows HeapAlloc().
>>
>> http://ftp.digitalmars.com/snn.lib
>>
>> This is for testing porpoises, and of course for those that Feel Da Need
>> For Speed.
>
> So I suppose you use `HeapFree` too?

Yes.

> Please, be sure you use this Windows API
> BOOL/BOOLEAN bug workaround:
> https://github.com/denis-sh/phobos-additions/blob/e061d1ad282b4793d1c75dfcc20962b99ec842df/unstd/windows/heap.d#L178

That's good to know, thanks!

August 04, 2013
You're right, it was RtlAllocateHeap.

On 04.08.2013 11:25, Denis Shelomovskij wrote:
> 04.08.2013 11:53, dennis luehring wrote:
>> On 04.08.2013 09:35, Walter Bright wrote:
>>> On 8/4/2013 12:19 AM, Joseph Rushton Wakeling wrote:
>>>> On Sunday, 4 August 2013 at 06:07:54 UTC, dennis luehring wrote:
>>>>> ever tested nedmalloc
>>>>> (http://www.nedprod.com/programs/portable/nedmalloc/) or
>>>>> other malloc allocators?
>>>>
>>>> "Windows 7, Linux 3.x, FreeBSD 8, Mac OS X 10.6 all contain
>>>> state-of-the-art
>>>> allocators and no third party allocator is likely to significantly
>>>> improve on
>>>> them in real world results."
>>>>
>>>> So there may be minimal returns from incorporating nedmalloc on
>>>> modern OS's ... ?
>>>
>>> As I wrote earlier, Microsoft has enormous incentive to make HeapXXXX
>>> as fast as
>>> possible, as it will pay dividends for every Microsoft software
>>> product and
>>> software designed for Windows. I'm sure the engineers there know all
>>> about the
>>> various strategies available on the intarnets. Why not take advantage
>>> of their work?
>>
>> HeapAlloc is a forwarder to RtlHeapAlloc and C++ new does call
>> RtlHeapAlloc directly - would it be better to use this kernel32 api
>> directly? (maybe if used in druntime to reduce dll dependencies)
>>
>
> Up to Windows XP (at least) KERNEL32's HeapAlloc function is forwarded
> to RtlAllocateHeap [1] function exported by NTDLL so there is no runtime
> performance overhead.
>
> There is no RtlHeapAlloc function on my Windows XP and I can't find any
> information about it on the web. Looks like Windows 6.x stuff or a
> mistake in the name.
>
> Also note there are tons of errors because of such "slightly different"
> names. If we are talking about "Heap*" functions:
> 1. Incorrect "RtlAllocHeap" name here [2].
> 2. Incorrect "HeapFree" function signature (4-byte BOOL is returned but
> it is just a wrapper of RtlFreeHeap which returns 1-byte BOOLEAN) (fixed
> in Windows 6.x).
>
> [1]
> http://msdn.microsoft.com/en-us/library/windows/hardware/ff552108(v=vs.85).aspx
> [2] http://msdn.microsoft.com/ru-ru/magazine/cc301808(en-us).aspx
>

August 05, 2013
On 03/08/2013 22:55, Walter Bright wrote:
> The execrable existing implementation was scrapped, and the new one uses
> Windows HeapAlloc().
>
> http://ftp.digitalmars.com/snn.lib
>
> This is for testing porpoises, and of course for those that Feel Da Need
> For Speed.


Using the latest DMD and this snn.lib, I'm seeing it take about 11.5 seconds to compile the algorithm unit tests (when I tried it last week, it was taking closer to 17 seconds).

For comparison, the MSVC build takes about 10 seconds on the same machine (Athlon 64X2 6000+).

Keep up the good work :-)

August 05, 2013
On 04.08.2013 11:28, Denis Shelomovskij wrote:
> 04.08.2013 1:55, Walter Bright wrote:
>> The execrable existing implementation was scrapped, and the new one uses
>> Windows HeapAlloc().
>>
>> http://ftp.digitalmars.com/snn.lib
>>
>> This is for testing porpoises, and of course for those that Feel Da Need
>> For Speed.
>
> So I suppose you use `HeapFree` too? Please, be sure you use this
> Windows API BOOL/BOOLEAN bug workaround:
> https://github.com/denis-sh/phobos-additions/blob/e061d1ad282b4793d1c75dfcc20962b99ec842df/unstd/windows/heap.d#L178
>

But please, without using two ifs and GetVersion on every free call.
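
One way to avoid paying for a version check on every free call is to decide once and cache the decision. What follows is only a minimal C sketch of that idea, not the code from the linked heap.d. It assumes a hypothetical `query_needs_workaround()` standing in for a real `GetVersion`-based test, it works on a simulated raw 32-bit return value instead of calling any Windows API, and it is not thread-safe. All function and variable names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Counts how often the (supposedly slow) version check actually runs. */
int g_query_calls = 0;

/* Hypothetical stand-in for a GetVersion()-based test. */
int query_needs_workaround(void)
{
    ++g_query_calls;
    return 1; /* assumption: pretend this is a pre-6.x Windows */
}

typedef int (*result_fn)(uint32_t raw);

/* Only the low byte is meaningful (callee returned a 1-byte BOOLEAN). */
static int result_low_byte(uint32_t raw) { return (raw & 0xFFu) != 0; }

/* All 32 bits are meaningful (callee returned a real 4-byte BOOL). */
static int result_full(uint32_t raw) { return raw != 0; }

static int result_first_call(uint32_t raw);

/* Starts out pointing at the one-time initializer. */
static result_fn g_result = result_first_call;

/* Runs the version check once, then replaces itself. */
static int result_first_call(uint32_t raw)
{
    g_result = query_needs_workaround() ? result_low_byte : result_full;
    assert(g_result != result_first_call); /* guards against recursion */
    return g_result(raw);
}

/* Interprets a raw return value; no version check after the first call. */
int free_succeeded(uint32_t raw)
{
    return g_result(raw);
}
```

After the first call, `free_succeeded` is a single indirect call with no version query. Whether that is actually cheaper than two predictable branches would have to be measured.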
August 05, 2013
On 8/5/2013 4:01 AM, Richard Webb wrote:
> Using the latest DMD and this snn.lib, I'm seeing it take about 11.5 seconds to
> compile the algorithm unit tests (when I tried it last week, it was taking
> closer to 17 seconds).
>
> For comparison, the MSVC build takes about 10 seconds on the same machine
> (Athlon 64X2 6000+).
>
> Keep up the good work :-)
>

So I guess the DMC code generator isn't as awful as has been assumed! This is hardly the first time the culprit was a library routine, not the code generator.
August 05, 2013
On Sunday, 4 August 2013 at 09:28:11 UTC, Denis Shelomovskij wrote:
> So I suppose you use `HeapFree` too? Please, be sure you use this Windows API BOOL/BOOLEAN bug workaround:
> https://github.com/denis-sh/phobos-additions/blob/e061d1ad282b4793d1c75dfcc20962b99ec842df/unstd/windows/heap.d#L178

BOOLEAN is either TRUE or FALSE, so it should be ok to check only the least significant byte.
August 05, 2013
On Monday, 5 August 2013 at 21:42:11 UTC, Kagamin wrote:
> On Sunday, 4 August 2013 at 09:28:11 UTC, Denis Shelomovskij wrote:
>> So I suppose you use `HeapFree` too? Please, be sure you use this Windows API BOOL/BOOLEAN bug workaround:
>> https://github.com/denis-sh/phobos-additions/blob/e061d1ad282b4793d1c75dfcc20962b99ec842df/unstd/windows/heap.d#L178
>
> BOOLEAN is either TRUE or FALSE, so it should be ok to check only the least significant byte.

Not in Windows:
typedef BYTE BOOLEAN;
typedef int BOOL;

(c) http://msdn.microsoft.com/en-us/library/windows/desktop/aa383751%28v=vs.85%29.aspx

While ideally it should be TRUE or FALSE, sometimes it isn't.
In fact, for functions that return BOOL, MSDN states the following:
"If the function succeeds, the return value is nonzero."
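
To make the size mismatch concrete, here is a minimal C sketch. It is an illustration under stated assumptions, not a call into the real API: it assumes the callee wrote only the low byte of the return register (as a 1-byte BOOLEAN return does) and left the upper 24 bits unspecified. `BOOLEAN_T`, `BOOL_T`, and both function names are hypothetical stand-ins so the sketch compiles without `<windows.h>`.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the Windows typedefs quoted above. */
typedef uint8_t BOOLEAN_T; /* typedef BYTE BOOLEAN; */
typedef int32_t BOOL_T;    /* typedef int  BOOL;    */

/* Reads all 32 bits, as a caller that was promised a 4-byte BOOL would. */
int raw_is_true_as_bool(uint32_t raw)
{
    return raw != 0;
}

/* Reads only the low byte, as a caller expecting a BOOLEAN would.
   Tests for nonzero rather than == 1, since the value may not be
   exactly TRUE. */
int raw_is_true_as_boolean(uint32_t raw)
{
    assert(sizeof(BOOLEAN_T) == 1 && sizeof(BOOL_T) == 4);
    return (BOOLEAN_T)(raw & 0xFFu) != 0;
}
```

With a raw value such as `0xABCD1200` (low byte zero, leftover bits above it), the first function reports success while the second reports failure, which is the kind of false positive the workaround is meant to avoid.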
August 06, 2013
On 05.08.2013 19:52, Walter Bright wrote:
> On 8/5/2013 4:01 AM, Richard Webb wrote:
>> Using the latest DMD and this snn.lib, I'm seeing it take about 11.5 seconds to
>> compile the algorithm unit tests (when I tried it last week, it was taking
>> closer to 17 seconds).
>>
>> For comparison, the MSVC build takes about 10 seconds on the same machine
>> (Athlon 64X2 6000+).
>>
>> Keep up the good work :-)
>>
>
> So I guess the DMC code generator isn't as awful as has been assumed! This is
> hardly the first time the culprit was a library routine, not the code generator.
>

Don't start the party too early, there are still 1.5 seconds left :)
August 06, 2013
On 05/08/2013 18:52, Walter Bright wrote:
> This is hardly the first time the culprit was a library routine


It's possible that other library routines are causing some of the remaining difference from the MSVC build (e.g. the profiler suggests that the DMC build spends somewhat more time inside memcpy than the MSVC build).

Not sure if it's down to implementation or optimization though - might be down to intrinsics/inlining and such? (the profile for the DMC build says it's using ~1% of its time inside strlen and the profile for the MSVC build doesn't mention it at all, which I guess is because it's using an intrinsic version).



August 06, 2013
On 8/6/2013 5:13 AM, Richard Webb wrote:
> It's possible that other library routines are causing some of the remaining
> difference from the MSVC build (e.g. the profiler suggests that the DMC build
> spends somewhat more time inside memcpy than the MSVC build).
>
> Not sure if it's down to implementation or optimization though - might be down
> to intrinsics/inlining and such? (the profile for the DMC build says it's using
> ~1% of its time inside strlen and the profile for the MSVC build doesn't mention
> it at all, which I guess is because it's using an intrinsic version).


If it's inlined then it won't show up in the profile. And yes, it's possible MSVC has a faster memcpy(). After all, enormous effort has been poured into memcpy().