December 01, 2017
On 11/30/2017 8:34 PM, Nicholas Wilson wrote:
> What I meant about icache pollution with 'cold' is that, instead of generating:
> 
> if(!cond)
>      _d_assert(__FILE__, __LINE__,message);
> //rest of code
> 
> it should actually generate,
> 
> if (!cond)
>      goto failed;
> //rest of code
> 
> failed:
>       _d_assert(__FILE__, __LINE__,message);//call is cold & out of line. no icache pollution
> 
> I'm not sure that it does that given the triviality of the example, but it looks like it doesn't.

You're right, it would be better to generate code that way. But it currently does not (I should fix that). It's not completely correct that icache isn't polluted. Functions that are tightly coupled can be located adjacent for better cache performance, and the various asserts would push them apart. Also, the conditional jumps may need to be the longer variety due to the longer distance, rather than the 2-byte one.
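A minimal hand-written sketch of the out-of-line shape discussed above, in D. The names `assertFailed` and `checkedDivide` are hypothetical and are not druntime's handler or dmd's actual output; whether a given backend keeps the failure call at the end of the function is up to the compiler.

```d
import core.exception : AssertError;

// Hypothetical stand-in for the runtime's assert handler. It is kept out of
// line so the hot path holds only a compare and a conditional branch.
pragma(inline, false)
void assertFailed(string file, size_t line, string msg)
{
    throw new AssertError(msg, file, line);
}

int checkedDivide(int a, int b)
{
    if (b == 0)
        goto failed;
    return a / b;               // rest of code (hot path)

failed:
    // Cold path, placed after the hot code.
    assertFailed(__FILE__, __LINE__, "division by zero");
    assert(0);                  // not reached: assertFailed always throws
}

void main()
{
    assert(checkedDivide(6, 3) == 2);

    bool caught = false;
    try
    {
        checkedDivide(1, 0);
    }
    catch (AssertError e)
    {
        caught = true;
    }
    assert(caught);
}
```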

December 01, 2017
On Friday, 1 December 2017 at 03:23:23 UTC, Walter Bright wrote:
> 26 bytes of inserted Bloaty McBloatface code and 15 bytes of

[WARNING: This post may be considered 'off topic', and may therefore deeply offend people - hopefully those people are hiding me.]

Hey..I like it..'Bloaty McBloatface'... good name for a ferry...

If only Sydney had more 'D programmers' taking the ferry...

http://www.abc.net.au/news/2017-11-13/sydney-ferry-will-actually-be-called-ferry-mcferryface/9146446

December 01, 2017
On Friday, 1 December 2017 at 11:07:32 UTC, Walter Bright wrote:
> On 11/30/2017 8:34 PM, Nicholas Wilson wrote:
>> What I meant about icache pollution with 'cold' is that, instead of generating:
>> 
>> if(!cond)
>>      _d_assert(__FILE__, __LINE__,message);
>> //rest of code
>> 
>> it should actually generate,
>> 
>> if (!cond)
>>      goto failed;
>> //rest of code
>> 
>> failed:
>>       _d_assert(__FILE__, __LINE__,message);//call is cold & out of line. no icache pollution
>> 
>> I'm not sure that it does that given the triviality of the example, but it looks like it doesn't.
>
> You're right, it would be better to generate code that way. But it currently does not (I should fix that).

Great!

> It's not completely correct that icache isn't polluted.

True.

> Functions that are tightly coupled can be located adjacent for better cache performance, and the various asserts would push them apart.

Does DMD optimise for locality?
I would hope co-located functions are either larger than cache lines by a reasonable amount or, if they are small enough, inlined so that the asserts can be aggregated. It is also possible (though I can't comment on how easy it would be to implement), if you are trying to optimise for co-location, to have the asserts completely out of line, so that you have

function1
function2
function3
call asserts of function1
call asserts of function2
call asserts of function3

such that the calls to the asserts never appear in the icache at all, apart from overlap (e.g. function1's asserts sitting right after the end of function3) or when one of the asserts fails.

> Also, the conditional jumps may need to be the longer variety due to the longer distance, rather than the 2-byte one.

Then it becomes a tradeoff, one that I'm glad the compiler is doing for me.

December 01, 2017
On Friday, 1 December 2017 at 11:01:13 UTC, Walter Bright wrote:
> Correction:
>
> https://dlang.org/dmd-windows.html#switch-release
>
> "compile release version, which means not emitting run-time checks for contracts and asserts. Array bounds checking is not done for system and trusted functions, and assertion failures are undefined behaviour."

Right, that's what I was talking about in this post: http://forum.dlang.org/post/luuwsdbfzunjmzbarxyd@forum.dlang.org

"Indeed, but disabling bounds checking in @system code is trivial anyway"

So leaving them only in @safe isn't much help: #1, much D code isn't @safe (that might change if it were the default, but it isn't), and #2, it is easy to turn off in @system code, since the `.ptr[index]` trick works very well.
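For readers unfamiliar with the `.ptr[index]` trick, here is a small sketch. The function names are hypothetical. Indexing through the slice's raw pointer skips the bounds check that indexing the slice itself would get, which is why it is only allowed outside @safe code.

```d
// Indexing the slice directly: each access is checked against a.length
// in a default build.
int sumChecked(const(int)[] a) @safe
{
    int total = 0;
    foreach (i; 0 .. a.length)
        total += a[i];
    return total;
}

// Indexing through .ptr: plain pointer arithmetic, no bounds check.
// Taking .ptr of a slice is not permitted in @safe code, hence @system.
int sumUnchecked(const(int)[] a) @system
{
    int total = 0;
    foreach (i; 0 .. a.length)
        total += a.ptr[i];
    return total;
}

void main()
{
    int[] data = [1, 2, 3, 4];
    assert(sumChecked(data) == 10);
    assert(sumUnchecked(data) == 10);
}
```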

December 01, 2017
On Monday, 27 November 2017 at 00:14:40 UTC, IM wrote:
> I could add more, but I'm tired of typing. I hope that one day I will overcome my frustrations as well as D becomes a better language that enables me to do what I want easily without standing in my way.

Among recent native languages, only one, by some game designer (I forgot who), follows this design principle; the others focus on safety to some degree.

December 01, 2017
On 12/1/17 11:47 AM, Kagamin wrote:

> Among recent native languages, only one, by some game designer (I forgot who), follows this design principle; the others focus on safety to some degree.

https://en.wikipedia.org/wiki/Jonathan_Blow#JAI_language

-Steve

December 01, 2017
On Friday, 1 December 2017 at 04:27:35 UTC, Adam D. Ruppe wrote:
> Then, we can tweak the asserts without also killing D's general memory safety victory that bounds checking brings.

The default compilation mode is fine for me; it's just that Phobos is not written for it.

December 01, 2017
On Friday, 1 December 2017 at 17:05:08 UTC, Steven Schveighoffer wrote:
> On 12/1/17 11:47 AM, Kagamin wrote:
>
>> Among recent native languages, only one, by some game designer (I forgot who), follows this design principle; the others focus on safety to some degree.
>
> https://en.wikipedia.org/wiki/Jonathan_Blow#JAI_language
>
> -Steve

Yes, that, in text: https://github.com/BSVino/JaiPrimer/blob/master/JaiPrimer.md

December 01, 2017
On 12/1/2017 3:31 AM, Nicholas Wilson wrote:
> On Friday, 1 December 2017 at 11:07:32 UTC, Walter Bright wrote:
> Does DMD optimise for locality?

No. However, the much-despised Optlink does! It uses the trace.def output from the profiler to set the layout of functions, so that tightly coupled functions are co-located.

  https://digitalmars.com/ctg/trace.html

It's not even just cache locality - rarely used functions can be allocated to pages so they are never even loaded in from disk. (The executable files are demand loaded.) The speed improvement can be dramatic, especially on program startup times, and if the program does a lot of swapping. I don't know if the Linux linker can accept a script file telling it the function layout.

The downside is that, because it relies on runtime profile information, it is awkward to set up and needs a representative usage test case to drive it.

dmd could potentially use a static call graph to do a better-than-nothing stab at it, but it would only work on code supplied to it as a group on the command line.


> I would hope co-located functions are either larger than cache lines by a reasonable amount or, if they are small enough, inlined so that the asserts can be aggregated. It is also possible (though I can't comment on how easy it would be to implement), if you are trying to optimise for co-location, to have the asserts completely out of line, so that you have
> 
> function1
> function2
> function3
> call asserts of function1
> call asserts of function2
> call asserts of function3
> 
> such that the calls to the asserts never appear in the icache at all, apart from overlap (e.g. function1's asserts sitting right after the end of function3) or when one of the asserts fails.

It's possible, although the jmps to the assert code would now have to be unconditional relocatable jmps, which are larger:

    jne L1
    jmp assertcode
L1:
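To put rough numbers on "larger", here is a small sketch of the jump sizes involved, assuming the standard 32/64-bit x86 encodings (short forms with an 8-bit displacement, near forms with a 32-bit one). The helper names are hypothetical, and the displacement is taken as measured from the end of the jump instruction.

```d
// True if the displacement fits the short (rel8) form: -128 .. +127.
bool fitsRel8(long displacement)
{
    return displacement >= byte.min && displacement <= byte.max;
}

// Conditional jump (Jcc): opcode + rel8 = 2 bytes, or 0F 8x + rel32 = 6 bytes.
size_t jccSize(long displacement)
{
    return fitsRel8(displacement) ? 2 : 6;
}

// Unconditional jump (JMP): EB + rel8 = 2 bytes, or E9 + rel32 = 5 bytes.
size_t jmpSize(long displacement)
{
    return fitsRel8(displacement) ? 2 : 5;
}

void main()
{
    assert(jccSize(127) == 2);
    assert(jccSize(128) == 6);
    assert(jccSize(-128) == 2);
    assert(jccSize(-129) == 6);
    assert(jmpSize(10) == 2);
    assert(jmpSize(100_000) == 5);
}
```

Under these assumptions, a short `jne` over a near `jmp` costs 2 + 5 = 7 bytes at each assert site, compared with 2 bytes for a single short conditional jump.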


> Then it becomes a tradeoff, one that I'm glad the compiler is doing for me.

Everything about codegen is a tradeoff :-)

December 02, 2017
On 11/30/2017 8:34 PM, Nicholas Wilson wrote:
> I'm not sure that it does that given the triviality of the example, but it looks like it doesn't.

https://github.com/dlang/dmd/pull/7386