Thread overview
string hash significant speedup
Aug 10, 2017
Johnson Jones
Aug 10, 2017
Johnson Jones
Aug 10, 2017
HyperParrow
August 10, 2017
when using T[string], hashing is used. Computing the hash is slow(relatively speaking).

Does D cache the hashes? Strings are immutable so there is absolutely no reason why the hash ever need to be computed more than once.

Essentially the hash should be attached to strings like their length and address.

Does D compute the hash of a string every time it is looked up? If so, any way to optimize that it?
August 10, 2017
On 8/10/17 3:36 PM, Johnson Jones wrote:
> when using T[string], hashing is used. Computing the hash is slow(relatively speaking).
> 
> Does D cache the hashes? Strings are immutable so there is absolutely no reason why the hash ever need to be computed more than once.

It computes them on insertion, and caches the result in the structure of the hash table.

-Steve
August 10, 2017
On Thursday, 10 August 2017 at 20:07:35 UTC, Steven Schveighoffer wrote:
> On 8/10/17 3:36 PM, Johnson Jones wrote:
>> when using T[string], hashing is used. Computing the hash is slow(relatively speaking).
>> 
>> Does D cache the hashes? Strings are immutable so there is absolutely no reason why the hash ever need to be computed more than once.
>
> It computes them on insertion, and caches the result in the structure of the hash table.
>
> -S

Thanks. What is the cache size?


August 10, 2017
On Thursday, 10 August 2017 at 20:07:35 UTC, Steven Schveighoffer wrote:
> On 8/10/17 3:36 PM, Johnson Jones wrote:
>> when using T[string], hashing is used. Computing the hash is slow(relatively speaking).
>> 
>> Does D cache the hashes? Strings are immutable so there is absolutely no reason why the hash ever need to be computed more than once.
>
> It computes them on insertion, and caches the result in the structure of the hash table.
>
> -Steve

But i think he speaks about the strings that are tested for inclusion (i.e opIn RHS), not those who are inserted, for which obviously the hash is known.
August 14, 2017
On 8/10/17 5:10 PM, Johnson Jones wrote:
> On Thursday, 10 August 2017 at 20:07:35 UTC, Steven Schveighoffer wrote:
>> On 8/10/17 3:36 PM, Johnson Jones wrote:
>>> when using T[string], hashing is used. Computing the hash is slow(relatively speaking).
>>>
>>> Does D cache the hashes? Strings are immutable so there is absolutely no reason why the hash ever need to be computed more than once.
>>
>> It computes them on insertion, and caches the result in the structure of the hash table.
>>
> 
> Thanks. What is the cache size?

size_t.sizeof

https://github.com/dlang/druntime/blob/master/src/rt/aaA.d#L173

-Steve
August 14, 2017
On 8/10/17 6:30 PM, HyperParrow wrote:
> On Thursday, 10 August 2017 at 20:07:35 UTC, Steven Schveighoffer wrote:
>> On 8/10/17 3:36 PM, Johnson Jones wrote:
>>> when using T[string], hashing is used. Computing the hash is slow(relatively speaking).
>>>
>>> Does D cache the hashes? Strings are immutable so there is absolutely no reason why the hash ever need to be computed more than once.
>>
>> It computes them on insertion, and caches the result in the structure of the hash table.
> 
> But i think he speaks about the strings that are tested for inclusion (i.e opIn RHS), not those who are inserted, for which obviously the hash is known.

If this is the case, then no, there is no cache. It would have to be a cache of the hash per pointer/length combo. I don't know that it's worth the effort, certainly not for built-in AA.

-Steve