Thread overview
Caching in druntime TypeInfo classes
Jun 30, 2015
H. S. Teoh
Jun 30, 2015
Dmitry Olshansky
Jun 30, 2015
Mike
Jul 01, 2015
rsw0x
Jul 01, 2015
deadalnix
Jul 01, 2015
Jonathan M Davis
June 30, 2015
While investigating:

	 https://issues.dlang.org/show_bug.cgi?id=4244

I found that the druntime function for computing the hash of static arrays (this also applies to dynamic arrays, btw) is horrendously slow: about 8-9 times slower than the equivalent operation on a POD struct of the same size.

The problem is caused by the call to hasCustomToHash() inside getArrayHash() in object.d, which in turn calls getElement(), which walks the typeinfo tree to find the TypeInfo for the first non-array / typedef type info definition, in order to determine if array elements have a custom toHash method. This walk is done *every single time* the array is hashed, even though the return value never changes for each array type.

So I tried to modify getArrayHash() to cache this information in the TypeInfo, but ran into some roadblocks: since TypeInfo's are supposed to be const, this operation is illegal. Unless I cast away const, but that's a rather dirty hack. The other problem is that the compiler hardcodes the sizes of each TypeInfo instance, so it will refuse to compile object.d anyway if the TypeInfo is expanded to have an extra field for caching the result of hasCustomTohash(). But since we have to modify the compiler now, my reaction was, why not have the compiler compute this value itself? Since the compiler already has all the information needed to compute this value. We don't have to wait till runtime. The only drawback is adding more complexity to the compiler, making it hard for other efforts like SDC to implement D.

What do you guys think? Should hasCustomToHash() be cached somehow in object.o? Or is caching a poor solution, and we should do something else?


T

-- 
Why is it that all of the instruments seeking intelligent life in the universe are pointed away from Earth? -- Michael Beibl
June 30, 2015
On 6/30/15 5:14 PM, H. S. Teoh via Digitalmars-d wrote:
> While investigating:
>
> 	 https://issues.dlang.org/show_bug.cgi?id=4244
>
> I found that the druntime function for computing the hash of static
> arrays (this also applies to dynamic arrays, btw) is horrendously slow:
> about 8-9 times slower than the equivalent operation on a POD struct of
> the same size.
>
> The problem is caused by the call to hasCustomToHash() inside
> getArrayHash() in object.d, which in turn calls getElement(), which
> walks the typeinfo tree to find the TypeInfo for the first non-array /
> typedef type info definition, in order to determine if array elements
> have a custom toHash method. This walk is done *every single time* the
> array is hashed, even though the return value never changes for each
> array type.
>
> So I tried to modify getArrayHash() to cache this information in the
> TypeInfo, but ran into some roadblocks: since TypeInfo's are supposed to
> be const, this operation is illegal. Unless I cast away const, but
> that's a rather dirty hack. The other problem is that the compiler
> hardcodes the sizes of each TypeInfo instance, so it will refuse to
> compile object.d anyway if the TypeInfo is expanded to have an extra
> field for caching the result of hasCustomTohash(). But since we have to
> modify the compiler now, my reaction was, why not have the compiler
> compute this value itself? Since the compiler already has all the
> information needed to compute this value. We don't have to wait till
> runtime. The only drawback is adding more complexity to the compiler,
> making it hard for other efforts like SDC to implement D.
>
> What do you guys think? Should hasCustomToHash() be cached somehow in
> object.o? Or is caching a poor solution, and we should do something
> else?
>

This sounds like a very good job for RTInfo.

But another possibility -- could you just store the flag in the AA itself? Still doesn't help if you have lots of small similarly-typed AAs, but the fact that the AA key type doesn't change should give you this optimization for relatively cheap, no?

-Steve
June 30, 2015
On 01-Jul-2015 00:14, H. S. Teoh via Digitalmars-d wrote:
> While investigating:
>
> 	 https://issues.dlang.org/show_bug.cgi?id=4244
>
> I found that the druntime function for computing the hash of static
> arrays (this also applies to dynamic arrays, btw) is horrendously slow:
> about 8-9 times slower than the equivalent operation on a POD struct of
> the same size.
>
> The problem is caused by the call to hasCustomToHash() inside
> getArrayHash() in object.d, which in turn calls getElement(), which
> walks the typeinfo tree to find the TypeInfo for the first non-array /
> typedef type info definition, in order to determine if array elements
> have a custom toHash method. This walk is done *every single time* the
> array is hashed, even though the return value never changes for each
> array type.
>

Can't the compiler just help us and populate a variable in TypeInfo that is hasCustomToHash is inefficiently computing?


-- 
Dmitry Olshansky
June 30, 2015
On Tuesday, 30 June 2015 at 21:17:13 UTC, H. S. Teoh wrote:
> since TypeInfo's are supposed to be const, this operation is illegal. Unless I cast away const, but that's a rather dirty hack. The other problem is that the compiler hardcodes the sizes of each TypeInfo instance, so it will refuse to compile object.d anyway if the TypeInfo is expanded to have an extra field for caching the result of hasCustomTohash(). But since we have to modify the compiler now, my reaction was, why not have the compiler compute this value itself? Since the compiler already has all the information needed to compute this value. We don't have to wait till runtime. The only drawback is adding more complexity to the compiler, making it hard for other efforts like SDC to implement D.
>

IMO the root cause of the trouble implementing your idea is the fact that the whole TypeInfo implementation is janky.  I think we should implement some flavor of Adam Ruppe's idea to move TypeInfo to the runtime (https://issues.dlang.org/show_bug.cgi?id=12270).  Then the implementation would be more straightforward...and it would solve another major problem that I'm having trying to use D for resource-constrained systems (http://bugzilla.gdcproject.org/show_bug.cgi?id=184) without having to resort to so much hackery.  It may also help reduce binary sizes on other platforms, which has come up a few times recently.

That's probably much more than you bargained for, but I have every intention on implementing it this year, and could use some help.

Mike
July 01, 2015
On Tuesday, 30 June 2015 at 21:17:13 UTC, H. S. Teoh wrote:
> While investigating:
>
> 	 https://issues.dlang.org/show_bug.cgi?id=4244
>
> I found that the druntime function for computing the hash of static arrays (this also applies to dynamic arrays, btw) is horrendously slow: about 8-9 times slower than the equivalent operation on a POD struct of the same size.
>
> The problem is caused by the call to hasCustomToHash() inside getArrayHash() in object.d, which in turn calls getElement(), which walks the typeinfo tree to find the TypeInfo for the first non-array / typedef type info definition, in order to determine if array elements have a custom toHash method. This walk is done *every single time* the array is hashed, even though the return value never changes for each array type.
>
> So I tried to modify getArrayHash() to cache this information in the TypeInfo, but ran into some roadblocks: since TypeInfo's are supposed to be const, this operation is illegal. Unless I cast away const, but that's a rather dirty hack. The other problem is that the compiler hardcodes the sizes of each TypeInfo instance, so it will refuse to compile object.d anyway if the TypeInfo is expanded to have an extra field for caching the result of hasCustomTohash(). But since we have to modify the compiler now, my reaction was, why not have the compiler compute this value itself? Since the compiler already has all the information needed to compute this value. We don't have to wait till runtime. The only drawback is adding more complexity to the compiler, making it hard for other efforts like SDC to implement D.
>
> What do you guys think? Should hasCustomToHash() be cached somehow in object.o? Or is caching a poor solution, and we should do something else?
>
>
> T

This should not use typeinfo at all IMO.

We have template to do exactly that.

July 01, 2015
On Tuesday, 30 June 2015 at 23:23:46 UTC, Mike wrote:
> On Tuesday, 30 June 2015 at 21:17:13 UTC, H. S. Teoh wrote:
>>[...]
>
> IMO the root cause of the trouble implementing your idea is the fact that the whole TypeInfo implementation is janky.  I think we should implement some flavor of Adam Ruppe's idea to move TypeInfo to the runtime (https://issues.dlang.org/show_bug.cgi?id=12270).  Then the implementation would be more straightforward...and it would solve another major problem that I'm having trying to use D for resource-constrained systems (http://bugzilla.gdcproject.org/show_bug.cgi?id=184) without having to resort to so much hackery.  It may also help reduce binary sizes on other platforms, which has come up a few times recently.
>
> [...]

+1
typeinfo situation is ... bad.
July 01, 2015
On Wednesday, 1 July 2015 at 00:48:55 UTC, deadalnix wrote:

> This should not use typeinfo at all IMO.
>
> We have template to do exactly that.

Yeah. It's known at compile time whether a type has a custom toHash function, so it's pretty ridiculous to be looking that sort of thing up at runtime - _especially_ when it's slow to do so.

- Jonathan M Davis