Thread overview
Templates do maybe not need to be that slow (no promises)
Sep 09, 2016
Stefan Koch
Sep 09, 2016
Marco Leise
Sep 09, 2016
Stefan Koch
Sep 09, 2016
Marco Leise
Sep 09, 2016
Iakh
Sep 09, 2016
Stefan Koch
Sep 09, 2016
Iakh
Sep 10, 2016
Stefan Koch
Sep 11, 2016
Stefan Koch
September 09, 2016
Hi Guys,

I'll keep this short.
There seems to be much more headroom than I had thought.

The idea is pretty simple.

Consider:
int fn(int padLength)(int a, int b, int c)
{
  /**
   very long function body 1000+ lines
   */
  return result * padLength;
}

This will produce roughly the same code for every instantiation, except for one imul at the end.

This problem is known as template bloat.


There is a direct linear relationship between the generated code and the template body.
So if the range of change inside the template body can be linked to the changed range in the binary, and we link this to the template parameters, we can produce a pure function that gives us the change in the binary code when provided with the template parameters.
The need to rerun the instantiation and code-gen is then reduced to just the changed sections.

I am not yet sure if this is viable to implement.
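
For illustration, here is what the same split looks like when done by hand: the parameter-independent part is compiled once, and only a thin wrapper is instantiated per template argument. This is a minimal sketch, not the compiler mechanism proposed above; fnBody and its placeholder body are assumptions made up for the example.

```d
// Hypothetical stand-in for the "very long function body". It does not
// depend on padLength, so code is generated for it only once.
int fnBody(int a, int b, int c)
{
    int result = a + b + c; // placeholder for the 1000+ lines of real work
    return result;
}

// Thin template wrapper: only this one-liner is instantiated per padLength,
// so each instantiation differs by little more than the final multiply.
int fn(int padLength)(int a, int b, int c)
{
    return fnBody(a, b, c) * padLength;
}
```

Calling fn!2(1, 2, 3) and fn!5(1, 2, 3) creates two instantiations of the wrapper, but both share the single fnBody.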

September 09, 2016
On Fri, 09 Sep 2016 07:56:04 +0000,
Stefan Koch <uplink.coder@googlemail.com> wrote:

> Hi Guys,
> 
> I'll keep this short.
> There seems to be much more headroom than I had thought.
> 
> The idea is pretty simple.
> 
> Consider:
> int fn(int padLength)(int a, int b, int c)
> {
>    /**
>     very long function body 1000+ lines
>     */
>    return result * padLength;
> }
> 
> This will produce roughly the same code for every instantiation, except for one imul at the end.
> 
> […]

Don't worry about this special case too much. At least GCC can turn padLength from a runtime argument into a compile-time argument itself, so the need for templates to do a poor man's const-folding is reduced. So in this case the advice is not to use a template.

You said that there is a lot of code-gen and string comparisons going on. Is code-gen already invoked on-demand? I assume with "dmd -o-" code-gen is completely disabled, which is great for ddoc, .di and dependency graph generation.

-- 
Marco

September 09, 2016
On Friday, 9 September 2016 at 09:31:37 UTC, Marco Leise wrote:
>
> Don't worry about this special case too much. At least GCC can turn padLength from a runtime argument into a compile-time argument itself, so the need for templates to do a poor man's const-folding is reduced. So in this case the advice is not to use a template.
>
> You said that there is a lot of code-gen and string comparisons going on. Is code-gen already invoked on-demand? I assume with "dmd -o-" code-gen is completely disabled, which is great for ddoc, .di and dependency graph generation.

This is not what this is about.
This is about cases where you cannot avoid templates because you do type-based operations.

The code above was just an example to illustrate the problem.

September 09, 2016
On Fri, 09 Sep 2016 10:32:59 +0000,
Stefan Koch <uplink.coder@googlemail.com> wrote:

> On Friday, 9 September 2016 at 09:31:37 UTC, Marco Leise wrote:
> >
> > Don't worry about this special case too much. At least GCC can turn padLength from a runtime argument into a compile-time argument itself, so the need for templates to do a poor man's const-folding is reduced. So in this case the advice is not to use a template.
> 
> This is not what this is about.
> This is about cases where you cannot avoid templates because you
> do type-based operations.
> 
> The code above was just an example to illustrate the problem.

Fair enough. I hope there is a less complex solution that all
compilers could benefit from.

-- 
Marco

September 09, 2016
On Friday, 9 September 2016 at 07:56:04 UTC, Stefan Koch wrote:

I was thinking of adding an "opaque" attribute for template arguments
to force the template to forget some information about the type.
E.g. if you use

class A(opaque T) {...}

you can use only pointers/references to T.

The compiler could probably determine by itself whether a type is
used as opaque or not.


September 09, 2016
On Friday, 9 September 2016 at 15:08:26 UTC, Iakh wrote:
> On Friday, 9 September 2016 at 07:56:04 UTC, Stefan Koch wrote:
>
> I was thinking of adding an "opaque" attribute for template arguments
> to force the template to forget some information about the type.
> E.g. if you use
>
> class A(opaque T) {...}
>
> you can use only pointers/references to T.
>
> The compiler could probably determine by itself whether a type is
> used as opaque or not.

You could use void* in this case and would not need a template at all.

September 09, 2016
On Friday, 9 September 2016 at 15:28:55 UTC, Stefan Koch wrote:
> On Friday, 9 September 2016 at 15:08:26 UTC, Iakh wrote:
>> On Friday, 9 September 2016 at 07:56:04 UTC, Stefan Koch wrote:
>>
>> I was thinking of adding an "opaque" attribute for template arguments
>> to force the template to forget some information about the type.
>> E.g. if you use
>>
>> class A(opaque T) {...}
>>
>> you can use only pointers/references to T.
>>
>> The compiler could probably determine by itself whether a type is
>> used as opaque or not.
>
> You could use void* in this case and would not need a template at all.

And if you want type-safe code?
With opaque it would be more like Java generics.
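
To make the two positions concrete, here is a rough sketch of how both can be combined by hand in today's D: a non-template core that only sees void*, plus a thin typed wrapper that restores type safety at the call site. The names OpaqueStack and Stack are made up for this example, and the proposed "opaque" attribute itself does not exist; this only approximates its effect.

```d
// Non-template core: written and compiled once, sees only untyped pointers.
struct OpaqueStack
{
    void*[] items;

    void push(void* p) { items ~= p; }

    // Removes and returns the most recently pushed pointer.
    // The stack must not be empty.
    void* pop()
    {
        void* p = items[$ - 1];
        items = items[0 .. $ - 1];
        return p;
    }
}

// Thin typed wrapper: each instantiation contains nothing but casts,
// so callers get type checking without duplicating the core logic.
struct Stack(T)
{
    private OpaqueStack impl;

    void push(T* p) { impl.push(cast(void*) p); }
    T* pop() { return cast(T*) impl.pop(); }
}
```

With this split, Stack!int and Stack!double reject each other's pointers at compile time, while the push/pop logic exists only once in OpaqueStack.
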
September 10, 2016
On Friday, 9 September 2016 at 07:56:04 UTC, Stefan Koch wrote:
>
> There is a direct linear relationship between the generated code and the template body.
> So if the range of change inside the template body can be linked to the changed range in the binary, and we link this to the template parameters, we can produce a pure function that gives us the change in the binary code when provided with the template parameters.
> The need to rerun the instantiation and code-gen is then reduced to just the changed sections.
>
> I am not yet sure if this is viable to implement.

I think I have found a way to avoid subtree comparisons for the most part and speed them up significantly for the rest.
This comes at the expense of limiting the number of compile-time entities (types, expressions ... anything) to a maximum of 2^(28) (when using a 64-bit id).
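
The post does not say how the ids are assigned. One common way to replace subtree comparisons with id comparisons is interning (hash-consing): every distinct node shape gets exactly one id, so two subtrees are equal exactly when their ids are equal. The sketch below assumes that technique; InternTable, the numeric kind tags, and the id layout are all hypothetical and do not reproduce the 2^(28) limit or dmd's actual data structures.

```d
// Hypothetical intern table: maps a node shape (kind + child ids) to one id.
struct InternTable
{
    private uint[immutable(uint)[]] ids;

    // Returns the id for a node of the given kind with the given children.
    // A new id is created only if this exact shape has not been seen before,
    // so structurally equal subtrees always end up with the same id.
    uint intern(uint kind, const(uint)[] children...)
    {
        immutable(uint)[] key = (kind ~ children).idup;
        if (auto p = key in ids)
            return *p;
        uint id = cast(uint) ids.length + 1; // ids start at 1
        ids[key] = id;
        return id;
    }
}
```

Once every node is interned bottom-up, comparing two arbitrarily deep trees is a single integer comparison of their root ids; the remaining cost is one hash lookup per node when it is created.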

September 11, 2016
There is more news.
I wrote about manual template inlining before, which is fairly effective at bringing down compile time.

Since templates are of course white-box, the compiler can do this automatically for you. Recursive templates will still incur a performance hit, but the effects will be lessened, if that gets implemented.

I am currently extending dmd's template code to support more efficient template caching.