Jump to page: 1 2
Thread overview
MurmurHash3
Dec 10, 2015
Guillaume Chatelet
Dec 10, 2015
Brad Anderson
Dec 11, 2015
Ilya
Dec 11, 2015
Ilya
Dec 11, 2015
Guillaume Chatelet
Dec 12, 2015
Ilya
Dec 12, 2015
Guillaume Chatelet
Dec 13, 2015
Marc Schütz
Dec 13, 2015
Guillaume Chatelet
Dec 13, 2015
Guillaume Chatelet
Dec 19, 2015
Guillaume Chatelet
Dec 20, 2015
Marc Schütz
December 10, 2015
Here is an implementation of MurmurHash [1] for D.
http://dpaste.dzfl.pl/1b94ed0aa96e

I'll do a proper pull request later on for addition to std.digest if the community feels like it's a valuable addition.

Guillaume

--
1 - https://en.wikipedia.org/wiki/MurmurHash
December 10, 2015
On Thursday, 10 December 2015 at 22:25:21 UTC, Guillaume Chatelet wrote:
> Here is an implementation of MurmurHash [1] for D.
> http://dpaste.dzfl.pl/1b94ed0aa96e
>
> I'll do a proper pull request later on for addition to std.digest if the community feels like it's a valuable addition.
>
> Guillaume
>
> --
> 1 - https://en.wikipedia.org/wiki/MurmurHash

Seems like it'd be good to have it ready and in place as the upcoming containers work starts materializing.
December 11, 2015
On Thursday, 10 December 2015 at 22:25:21 UTC, Guillaume Chatelet wrote:
> Here is an implementation of MurmurHash [1] for D.
> http://dpaste.dzfl.pl/1b94ed0aa96e
>
> I'll do a proper pull request later on for addition to std.digest if the community feels like it's a valuable addition.
>
> Guillaume
>
> --
> 1 - https://en.wikipedia.org/wiki/MurmurHash

Great!

Could you please add an optimized interface to compute hashes for `uint` , `ulong` and `ulong[2]`?

It would very good both for upcoming containers and multidimensional slices https://github.com/D-Programming-Language/phobos/pull/3397 .
December 11, 2015
On Thursday, 10 December 2015 at 22:25:21 UTC, Guillaume Chatelet wrote:
> Here is an implementation of MurmurHash [1] for D.
> http://dpaste.dzfl.pl/1b94ed0aa96e
>
> I'll do a proper pull request later on for addition to std.digest if the community feels like it's a valuable addition.
>
> Guillaume
>
> --
> 1 - https://en.wikipedia.org/wiki/MurmurHash

http://dpaste.dzfl.pl/1b94ed0aa96e#line-222 - seed is uint, can it be ulong?


Mutmur hash has three stages:
1. Computation of hash for blocks (32bit or 128bit)
2. Compitation of hash for tail (remainder)
3. Finalization.

I will be very happy, if step 1 will be represented as an output range. Then it can be used directly like reduce aggregator for ranges and multidimensional slices.

December 11, 2015
On Friday, 11 December 2015 at 01:51:09 UTC, Ilya wrote:
> http://dpaste.dzfl.pl/1b94ed0aa96e#line-222 - seed is uint, can it be ulong?
Done

> Mutmur hash has three stages:
> 1. Computation of hash for blocks (32bit or 128bit)
> 2. Compitation of hash for tail (remainder)
> 3. Finalization.
>
> I will be very happy, if step 1 will be represented as an output range. Then it can be used directly like reduce aggregator for ranges and multidimensional slices.
Done

Not thoroughly tested but updated for range and taking an ulong seed for MurmurHash3_x64_128:
http://dpaste.dzfl.pl/1b94ed0aa96e

Not sure I got what you meant about the optimized version. For the return value ?

I haven't done any benchmarking yet.
December 12, 2015
On Friday, 11 December 2015 at 22:43:00 UTC, Guillaume Chatelet wrote:
> On Friday, 11 December 2015 at 01:51:09 UTC, Ilya wrote:
>> http://dpaste.dzfl.pl/1b94ed0aa96e#line-222 - seed is uint, can it be ulong?
> Done
>
>> Mutmur hash has three stages:
>> 1. Computation of hash for blocks (32bit or 128bit)
>> 2. Compitation of hash for tail (remainder)
>> 3. Finalization.
>>
>> I will be very happy, if step 1 will be represented as an output range. Then it can be used directly like reduce aggregator for ranges and multidimensional slices.
> Done
>
> Not thoroughly tested but updated for range and taking an ulong seed for MurmurHash3_x64_128:
> http://dpaste.dzfl.pl/1b94ed0aa96e
>
> Not sure I got what you meant about the optimized version. For the return value ?
>
> I haven't done any benchmarking yet.

Current version is suitable for arrays but not ranges or types.

Few examples:
1. Compute hash of ulong.
2. Compute hash of all elements in matrix column (element are in different arrays).

I have created output range API draft http://dpaste.dzfl.pl/a24050042758

Ilya

December 12, 2015
On Saturday, 12 December 2015 at 02:59:21 UTC, Ilya wrote:
> Current version is suitable for arrays but not ranges or types.
>
> Few examples:
> 1. Compute hash of ulong.
> 2. Compute hash of all elements in matrix column (element are in different arrays).
>
> I have created output range API draft http://dpaste.dzfl.pl/a24050042758
>
> Ilya

I created https://github.com/gchatelet/murmurhash3_d and updated the code a bit.
It conforms to the digest template interface, allows pushing ulong[2] and accept ranges.

PR welcome :)
December 13, 2015
On Saturday, 12 December 2015 at 20:12:49 UTC, Guillaume Chatelet wrote:
> On Saturday, 12 December 2015 at 02:59:21 UTC, Ilya wrote:
>> Current version is suitable for arrays but not ranges or types.
>>
>> Few examples:
>> 1. Compute hash of ulong.
>> 2. Compute hash of all elements in matrix column (element are in different arrays).
>>
>> I have created output range API draft http://dpaste.dzfl.pl/a24050042758
>>
>> Ilya
>
> I created https://github.com/gchatelet/murmurhash3_d and updated the code a bit.
> It conforms to the digest template interface, allows pushing ulong[2] and accept ranges.
>
> PR welcome :)

AFAICS this doesn't conform to the digest interface. For example, there should be a `finish` method that returns the hash as a static array (see the ExampleDigest [1]). More importantly, I believe your `put()` implementation only works if it is fed the entire data at once. I haven't tested it, but I believe that the following two calls will have a different result, while they should result in the same hash:

    hash.put([1,2,3,4,5,6]);

vs

    hash.put([1,2,3]);
    hash.put([4,5,6]);

[1] http://dlang.org/phobos/std_digest_digest.html#.ExampleDigest
December 13, 2015
On Sunday, 13 December 2015 at 12:44:06 UTC, Marc Schütz wrote:
> AFAICS this doesn't conform to the digest interface. For example, there should be a `finish` method that returns the hash as a static array (see the ExampleDigest [1]).

The structs themselves do not but the alias at the beginning of the file make sure they do.

alias MurmurHash3_x86_32 = Digester!SMurmurHash3_x86_32;
alias MurmurHash3_x86_128 = Digester!SMurmurHash3_x86_128;
alias MurmurHash3_x64_128 = Digester!SMurmurHash3_x64_128;

> More importantly, I believe your `put()` implementation only works if it is fed the entire data at once. I haven't tested it, but I believe that the following two calls will have a different result, while they should result in the same hash:
>
>     hash.put([1,2,3,4,5,6]);
>
> vs
>
>     hash.put([1,2,3]);
>     hash.put([4,5,6]);

I suspect this as well although I haven't tested.
I'll add more tests and add the missing logic if needed.

December 13, 2015
On Sunday, 13 December 2015 at 16:24:35 UTC, Guillaume Chatelet wrote:
> On Sunday, 13 December 2015 at 12:44:06 UTC, Marc Schütz wrote:
>> [...]
>
> The structs themselves do not but the alias at the beginning of the file make sure they do.
>
> alias MurmurHash3_x86_32 = Digester!SMurmurHash3_x86_32;
> alias MurmurHash3_x86_128 = Digester!SMurmurHash3_x86_128;
> alias MurmurHash3_x64_128 = Digester!SMurmurHash3_x64_128;
>
>>     [...]
>
> I suspect this as well although I haven't tested.
> I'll add more tests and add the missing logic if needed.

Fixed in last commit. Thx for the heads up Marc.
« First   ‹ Prev
1 2