Array toHash()

Nov 27, 2014

David Held

Nov 27, 2014

Ali Çehreli

Nov 29, 2014

David Held

Nov 29, 2014

Ali Çehreli

Nov 30, 2014

David Held

I have a class which contains an int[] and some other stuff. I want to use my class as the key for an AA, so I am overriding toHash(). But the int[] is the only part which should produce the hash code. I know that int[].toHash() is defined somehow, because I can put int[] directly into an AA without writing any hash functions. But I don't know how to spell this explicitly or force the compiler to generate it for me so that I can forward to it in my toHash(). For illustration: class Foo { override size_t toHash() @trusted pure const nothrow { // error: no property 'toHash' for type 'int[]' return importantStuff.toHash(); } // override opEquals() too... int[] importantStuff; bool notImportant; int ignoreMe; } Any way to avoid re-implementing this hash function? Dave

On 11/26/2014 04:25 PM, David Held wrote: > class Foo > { > override > size_t toHash() @trusted pure const nothrow > { > // error: no property 'toHash' for type 'int[]' > return importantStuff.toHash(); > } The getHash() member function of the particular TypeInfo can be used. However, it is not currently pure, so you must comment that out from your toHash: override size_t toHash() @trusted /* pure */ const nothrow { return typeid(importantStuff).getHash(&importantStuff); } If a function can safely be casted to pure, you can use the following yet-missing-in-phobos function template: import std.traits; auto assumePure(T)(T t) if (isFunctionPointer!T || isDelegate!T) { enum attrs = functionAttributes!T | FunctionAttribute.pure_; return cast(SetFunctionAttributes!(T, functionLinkage!T, attrs)) t; } // ... override size_t toHash() @trusted pure const nothrow { auto func = assumePure(&(typeid(importantStuff).getHash)); return func(&importantStuff); } Note that now your toHash can be pure. Ali

On 11/26/2014 4:40 PM, Ali Çehreli wrote: > [...] > override > size_t toHash() @trusted pure const nothrow > { > auto func = assumePure(&(typeid(importantStuff).getHash)); > return func(&importantStuff); > } Very helpful, thanks! Am I right in assuming that there is some runtime cost here, as we are calling through a function pointer? Is the lack of inlining the only cost, or is getting at the typeinfo also costly? Dave

On 11/29/2014 12:45 PM, David Held wrote: > On 11/26/2014 4:40 PM, Ali Çehreli wrote: >> [...] >> override >> size_t toHash() @trusted pure const nothrow >> { >> auto func = assumePure(&(typeid(importantStuff).getHash)); >> return func(&importantStuff); >> } > > Am I right in assuming that there is some runtime > cost here, as we are calling through a function pointer? I think so. However, I think the compiler can eliminate dereferencing the function pointer if it can prove that it is not necessary. > Is the lack of inlining the only cost, or is getting at the > typeinfo also costly? typeid() is a runtime function. I think it will be costly every time toHash is called. The function pointer can be initialized once. // I've "deduced" the type from an error message. ;) static const ulong delegate(const(void*)) const pure nothrow @trusted func; // This will be executed once per thread: static this() { func = assumePure(&(typeid(Foo.init.importantStuff).getHash)); } override size_t toHash() @trusted pure const nothrow { return func(&importantStuff); } Now there is no TypeInfo lookup after thread initialization time. Ali

November 30, 2014

Re: Array toHash()

Posted by David Held
in reply to Ali Çehreli

Permalink

David Held

Posted in reply to Ali Çehreli

Permalink

On 11/29/2014 3:59 PM, Ali Çehreli wrote:
> [...]
> typeid() is a runtime function. I think it will be costly every time
> toHash is called. The function pointer can be initialized once.
>
>      // I've "deduced" the type from an error message. ;)
>      static const ulong delegate(const(void*)) const pure nothrow
> @trusted func;
>
>      // This will be executed once per thread:
>      static this() {
>          func = assumePure(&(typeid(Foo.init.importantStuff).getHash));
>      }
>
>      override
>      size_t toHash() @trusted pure const nothrow
>      {
>          return func(&importantStuff);
>      }
>
> Now there is no TypeInfo lookup after thread initialization time.

Nice!  This is getting very fancy, but also a bit unfortunate.  It would be appropriate if the type of importantStuff were arbitrary, but it's not.  It's int[].  In fact, the fact that I can get a compiler-generated hash function for any type (I presume) surprises me.  Not entirely, since the compiler can just treat types as binary blobs, and define a hash function that operates as if it were byte[].

Although it is nice that we can get at stuff through reflection, this is not quite the same as the ostensibly static [and inlinable] call to int[].toHash() which logically occurs when int[] is used directly as a key.

Also, would it be possible to avoid spelling the delegate type by making func a function static and using an auto return on a named method?

Finally, I find it a little surprising that there's no standard module which implements a quality hash function builder (using something like MurmurHash or CityHash).  This seems like it would be a fairly common requirement for a lot of users.  Are there any non-standard libraries which do this?

Dave

Forums