December 28, 2011
On Monday, 26 December 2011 at 17:37:17 UTC, Piotr Szturmaj wrote:
> Yes. Here are the results: http://pastebin.com/rD8kiaQy. This is observed only with Windows DMD.

I'd be more interested in seeing the code.

I've done some more research on this. In release builds, DMD on Windows emits a memcpy call for a slice copy. However, the auto-generated memcpy call has slightly less overhead (register/stack shuffling) than a manual memcpy call, which explains the performance difference I was seeing.
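
For readers following along, here is a minimal sketch of the two forms being compared. The function names `copyWithSlice` and `copyWithMemcpy` are hypothetical and are not taken from the Phobos fork or the pastebin links. Which form is faster depends on the compiler and flags, so this only shows what is being compared, not a measurement.

```d
import core.stdc.string : memcpy;

// Slice (array) copy. Per the observation above, release builds of DMD
// on Windows lower this to a memcpy call that the compiler emits itself.
// Both slices must have the same length and must not overlap.
void copyWithSlice(ubyte[] dst, const(ubyte)[] src)
{
    dst[] = src[];
}

// Manual call to the C runtime's memcpy, with the arguments set up by hand.
void copyWithMemcpy(ubyte[] dst, const(ubyte)[] src)
{
    assert(dst.length == src.length);
    memcpy(dst.ptr, src.ptr, src.length);
}
```

Both functions leave `dst` with the same contents; the difference discussed here is only in the call overhead around the copy.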
January 06, 2012
Vladimir Panteleev wrote:
> On Monday, 26 December 2011 at 17:37:17 UTC, Piotr Szturmaj wrote:
>> Yes. Here are the results: http://pastebin.com/rD8kiaQy. This is
>> observed only with Windows DMD.
>
> I'd be more interested in seeing the code.

Sorry for the late answer. For the memcpy cases, the code is the same as in my GitHub Phobos fork. Here is the change to slice copying: http://pastebin.com/EteqEper
>
> I've done some more research on this. In release builds, DMD on Windows
> emits a memcpy call for a slice copy. However, the auto-generated memcpy
> call has slightly less overhead (register/stack shuffling) than a manual
> memcpy call, which explains the performance difference I was seeing.

January 07, 2012
On Friday, 6 January 2012 at 21:10:50 UTC, Piotr Szturmaj wrote:
> Vladimir Panteleev wrote:
>> On Monday, 26 December 2011 at 17:37:17 UTC, Piotr Szturmaj wrote:
>>> Yes. Here are the results: http://pastebin.com/rD8kiaQy. This is
>>> observed only with Windows DMD.
>>
>> I'd be more interested in seeing the code.
>
> Sorry for the late answer. For the memcpy cases, the code is the same as in my GitHub Phobos fork. Here is the change to slice copying: http://pastebin.com/EteqEper

I haven't looked at the disassembly yet, but I'd suggest rewriting your code so that the left side of the assignment is a slice expression beginning with 0. I think DMD will generate optimal code (a memcpy with slightly less overhead than a manual call) if you make it clear to the compiler that the left-hand slice and the right-hand slice have the same length.
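
A rough sketch of what that rewrite could look like. The function `appendChunk` and its parameters are hypothetical and are not the code from the pastebin. Whether DMD actually generates better code for this form is untested here, as the disassembly has not been checked.

```d
// Hypothetical helper: copies `chunk` into `buffer` starting at `offset`.
void appendChunk(ubyte[] buffer, size_t offset, const(ubyte)[] chunk)
{
    immutable n = chunk.length;

    // Original style, with the left-hand slice starting at an offset:
    //     buffer[offset .. offset + n] = chunk[];

    // Suggested style: rebase the destination first, so that both sides
    // of the assignment are written as [0 .. n] and visibly have the
    // same length.
    auto dst = buffer[offset .. $];
    dst[0 .. n] = chunk[0 .. n];
}
```

Both forms copy the same bytes; the second only changes how the slice bounds are spelled.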

Also, it looks like the slice version wastes an extra variable (bw).