Jump to page: 1 212  
Page
Thread overview
The review of std.hash package
Aug 07, 2012
Dmitry Olshansky
Aug 07, 2012
David
Aug 07, 2012
David
Aug 07, 2012
Jonathan M Davis
Aug 07, 2012
Johannes Pfau
Aug 07, 2012
David
Aug 08, 2012
Johannes Pfau
Aug 07, 2012
Piotr Szturmaj
Aug 08, 2012
Johannes Pfau
Aug 08, 2012
Piotr Szturmaj
Aug 08, 2012
Johannes Pfau
Aug 08, 2012
Jonathan M Davis
Aug 08, 2012
Johannes Pfau
Aug 08, 2012
Jonathan M Davis
Aug 09, 2012
Johannes Pfau
OT: scrypt
Aug 08, 2012
Johannes Pfau
Aug 07, 2012
Ary Manzana
Aug 07, 2012
deadalnix
Aug 07, 2012
Johannes Pfau
Aug 07, 2012
Jonathan M Davis
Aug 08, 2012
Regan Heath
Aug 08, 2012
H. S. Teoh
Aug 08, 2012
Tobias Pankrath
Aug 08, 2012
Regan Heath
Aug 08, 2012
Tobias Pankrath
Aug 08, 2012
Regan Heath
Aug 08, 2012
Christophe Travert
Aug 08, 2012
Chris Cain
Aug 08, 2012
Regan Heath
Aug 08, 2012
Christophe Travert
Aug 08, 2012
Chris Cain
Aug 08, 2012
Christophe Travert
Aug 08, 2012
Regan Heath
Aug 08, 2012
Chris Cain
Aug 08, 2012
Johannes Pfau
Aug 08, 2012
Jonathan M Davis
Aug 08, 2012
Jonathan M Davis
Aug 15, 2012
RivenTheMage
Aug 15, 2012
David Nadlinger
Aug 15, 2012
RivenTheMage
Aug 15, 2012
RivenTheMage
Aug 15, 2012
ReneSac
Aug 16, 2012
RivenTheMage
Aug 16, 2012
RivenTheMage
Aug 08, 2012
Dmitry Olshansky
Aug 08, 2012
Dmitry Olshansky
Aug 07, 2012
Dmitry Olshansky
Aug 08, 2012
Walter Bright
Aug 08, 2012
Johannes Pfau
Aug 08, 2012
Walter Bright
Aug 08, 2012
Piotr Szturmaj
Aug 08, 2012
Walter Bright
Aug 08, 2012
Martin Nowak
Aug 08, 2012
Walter Bright
Aug 09, 2012
Johannes Pfau
Aug 09, 2012
Walter Bright
Aug 16, 2012
deadalnix
Aug 17, 2012
Regan Heath
Aug 08, 2012
deadalnix
Aug 08, 2012
Martin Nowak
Aug 08, 2012
Walter Bright
Aug 08, 2012
Johannes Pfau
Aug 08, 2012
Walter Bright
Aug 09, 2012
Johannes Pfau
Aug 09, 2012
David Nadlinger
Aug 09, 2012
Regan Heath
Aug 09, 2012
Dmitry Olshansky
Aug 09, 2012
Dmitry Olshansky
Aug 09, 2012
David Nadlinger
Aug 09, 2012
Jonathan M Davis
Aug 09, 2012
Walter Bright
Aug 09, 2012
Christophe Travert
Aug 12, 2012
Walter Bright
Aug 15, 2012
Kagamin
Aug 15, 2012
Dmitry Olshansky
Aug 15, 2012
Kagamin
Aug 15, 2012
David Nadlinger
Aug 15, 2012
Dmitry Olshansky
Aug 15, 2012
Kagamin
Aug 15, 2012
David Nadlinger
Aug 15, 2012
Dmitry Olshansky
Aug 15, 2012
David Nadlinger
Aug 15, 2012
Kagamin
Aug 08, 2012
Johannes Pfau
Aug 08, 2012
Johannes Pfau
Aug 08, 2012
Christophe Travert
Aug 08, 2012
Walter Bright
Aug 08, 2012
Johannes Pfau
Aug 08, 2012
Walter Bright
Aug 09, 2012
Johannes Pfau
Aug 09, 2012
Walter Bright
Aug 09, 2012
Walter Bright
Aug 09, 2012
Johannes Pfau
Aug 09, 2012
Johannes Pfau
Aug 09, 2012
Jacob Carlborg
Aug 08, 2012
Denis Shelomovskij
Aug 08, 2012
Johannes Pfau
Aug 08, 2012
Christophe Travert
Aug 09, 2012
Vladimir Panteleev
Aug 09, 2012
Johannes Pfau
Aug 09, 2012
Regan Heath
Aug 09, 2012
Johannes Pfau
std.hash review: Update 1
Aug 10, 2012
Johannes Pfau
std.hash review: Update 2
Aug 20, 2012
Johannes Pfau
Aug 29, 2012
Jesse Phillips
Sep 04, 2012
Johannes Pfau
August 07, 2012
	Since the review queue has been mostly silent again I've decided to jump in and manage the one that's ready to go :)

	Today starts the the review of std.hash package by Johannes Pfau. We go with the usual cycle of two weeks for review and one week for voting. Thus review ends on 22th of August, followed by voting that ends on 29th of August.

Description:

std.hash.hash is a new module for Phobos defining an uniform interface for hashes and checksums. It also provides some useful helper functions to deal with this new API.

The std.hash package also includes:

	- MD5 implementation deprecating std.md5 (in std.hash.md, adapted from std.md5);

	- new SHA1 implementation by redstar (in std.hash.sha);

	- CRC32 implementation (in std.hash.crc) based on and deprecating the crc32 module (that's shipped with phobos but not documented).

It only covers hashes which can process data incrementally (in
smaller buffers as opposed to all data at once).

Code:
https://github.com/jpf91/phobos/tree/newHash/std/hash
https://github.com/jpf91/phobos/compare/master...newHash

Docs:
http://dl.dropbox.com/u/24218791/d/phobos/std_hash_hash.html
http://dl.dropbox.com/u/24218791/d/phobos/std_hash_md.html
http://dl.dropbox.com/u/24218791/d/phobos/std_hash_sha.html
http://dl.dropbox.com/u/24218791/d/phobos/std_hash_crc.html

-- 
Dmitry Olshansky
August 07, 2012
Is the API already set in stone?

Using .start and .finish, feels like the use of OpenSSL. Why not making a Hash object not "restartable" (since e.g. MD5.start() sets `this` just to MD5.init) and making finish private and implementing a digest and hexdigest property which calls finish and returns an ubyte[] array (or string for hexdigest), this would also eliminate the need of the `digest` wrapper (e.g. you can mixin these properties with template mixins, so no code duplication)

Also I am not sure if we want to use `hash.put` instead of `hash.update` (which is used by python and openssl). But that's just a minor point (e.g. perl uses .add)

Furthermore, I don't like the `sha1Of` and `md5Of` etc. functions, why not having a static method? e.g. MD5.hexdigest("input"), which returns the hexdigest and not a MD5 object.

One more thing, `.reset` does the same as `start`? If so, why do both exist? (I also find the examples quite confusing, why do you want to reset the hash onject? I would documentate the function but wouldn't use it in the examples)

Well, that's it from my side :)
August 07, 2012
> Also I am not sure if we want to use `hash.put` instead of `hash.update`
> (which is used by python and openssl). But that's just a minor point
> (e.g. perl uses .add)

Well, it's there to implement the Range interface, but still, put doesn't make too much sense for me (maybe personal preference).

August 07, 2012
On 8/7/12 14:39 , Dmitry Olshansky wrote:
> Since the review queue has been mostly silent again I've decided to jump
> in and manage the one that's ready to go :)
>
> Today starts the the review of std.hash package by Johannes Pfau. We go
> with the usual cycle of two weeks for review and one week for voting.
> Thus review ends on 22th of August, followed by voting that ends on 29th
> of August.
>
> Description:
>
> std.hash.hash is a new module for Phobos defining an uniform interface
> for hashes and checksums. It also provides some useful helper functions
> to deal with this new API.
>
> The std.hash package also includes:

I think "std.crypto" is a better name for the package. At first I thought it contained an implementation of a Hash table.

Also note these entries in wikipedia:

http://en.wikipedia.org/wiki/Hash_function
http://en.wikipedia.org/wiki/Cryptographic_hash_function

Your package provides the later, not just any hash functions, but *crypto*graphic hash functions. :-)

(and yes, I know I'm just discussing the name here, but names *are* important)
August 07, 2012
On Tuesday, August 07, 2012 20:07:07 David wrote:
> Is the API already set in stone?

No. That's the main point of the review process. The API needs to be reviewed and revised as appropriate (as does the implementation, but the API is much harder to fix later, since that tends to cause breakin changes).

- Jonathan M Davis
August 07, 2012
Le 07/08/2012 20:31, Ary Manzana a écrit :
> On 8/7/12 14:39 , Dmitry Olshansky wrote:
>> Since the review queue has been mostly silent again I've decided to jump
>> in and manage the one that's ready to go :)
>>
>> Today starts the the review of std.hash package by Johannes Pfau. We go
>> with the usual cycle of two weeks for review and one week for voting.
>> Thus review ends on 22th of August, followed by voting that ends on 29th
>> of August.
>>
>> Description:
>>
>> std.hash.hash is a new module for Phobos defining an uniform interface
>> for hashes and checksums. It also provides some useful helper functions
>> to deal with this new API.
>>
>> The std.hash package also includes:
>
> I think "std.crypto" is a better name for the package. At first I
> thought it contained an implementation of a Hash table.
>
> Also note these entries in wikipedia:
>
> http://en.wikipedia.org/wiki/Hash_function
> http://en.wikipedia.org/wiki/Cryptographic_hash_function
>
> Your package provides the later, not just any hash functions, but
> *crypto*graphic hash functions. :-)
>
> (and yes, I know I'm just discussing the name here, but names *are*
> important)

You'll find very hard to convince anyone that crc32 is a cryptographic hash function.

And this API is suited for both cryptographic hash and regular hash. Many of them can be added in the future if need is met. I definitively am for std.hash .
August 07, 2012
On Tuesday, August 07, 2012 15:31:57 Ary Manzana wrote:
> > The std.hash package also includes:
> I think "std.crypto" is a better name for the package. At first I thought it contained an implementation of a Hash table.

That doesn't fly, because crc32 is going to be in there, and while it's a hash, it's no good for cryptography.

- Jonathan M Davis
August 07, 2012
On 07-Aug-12 22:31, Ary Manzana wrote:
>
> I think "std.crypto" is a better name for the package. At first I
> thought it contained an implementation of a Hash table.

There is  std.container, so unambiguous for me.
As for std.crypto it's been discussed to death before. Short answer - no, because it's assumed to include other useful hash functions (like crc32, later I expect adler and whatnot).

Also keep in mind that cryptographic hash is more of a status then permanent property. Now that md5 was cracked, it is typically used only as normal hash (e.g. checksum). Same thing would one day happen to SHA family.


-- 
Dmitry Olshansky
August 07, 2012
Am Tue, 07 Aug 2012 20:07:07 +0200
schrieb David <d@dav1d.de>:

> Is the API already set in stone?
No, as Jonathan already mentioned reviewing the API is an important part of the review. (Actually the most important part for this review, as we define one API for many implementations)

> Using .start and .finish, feels like the use of OpenSSL. Why not
> making a Hash object not "restartable" (since e.g. MD5.start() sets
> `this` just to MD5.init)
I'm not an expert in the hash/digest area. We had incompatible APIs for CRC32, MD5 and SHA1 and I proposed we should have a unique interface. As I had some free time and nobody else stepped up I implemented it, asking for expert help in the newsgroup. This probably also explains why this API isn't revolutionary, it's inspired by other designs (tango, .net, ...).

BTW: making it possible to wrap OpenSSL and other C hash libraries was also a goal. This is one reason why a start() function is necessary. OK, that was the intro ;-)

start is here as structs can't have default constructors. For all current hashes it's really mostly useless (except it can be used as a simple way to reset a hash, which is also why the hash structs don't have a reset member), but I just don't know if there are hashes which would require a more complex (runtime) initialization.

Classes don't have a start method as they can do that initialization in the default constructor. But the default constructor can't be used as a cheap way to reset a hash, therefore the reset function was added to the OOP interface.

> and making finish private and implementing a
> digest and hexdigest property
property is not a good idea, as finish must reset the internal state for some objects, so you can't access the property repeatedly. It'd have to be a function.

>which calls finish and returns an
> ubyte[] array (or string for hexdigest)

This could be done and is probably a matter of taste. I prefer the free function in this case, it makes clear that digest() really is a generic function which can be used with any hash. It's also one function less which must be implemented by hash writers. I'd like to keep the number of members required to implement a hash as low as possible.

(This is however unrelated to the actual naming of the functions. If you think 'digest' is not a good name, please let me know)

>, this would also eliminate
> the need of the `digest` wrapper (e.g. you can mixin these properties with template mixins, so no code duplication)

there's actually no code duplication. 'digest' is the only implementation, sha1Of, md5Of are just aliases.

> 
> Also I am not sure if we want to use `hash.put` instead of
> `hash.update` (which is used by python and openssl). But that's just
> a minor point (e.g. perl uses .add)

I initially called it update, I agree it's a better name. But phobos is all about ranges, so we need the put member anyway. Providing an 'update' alias just for the name isn't worth it, imho (see above, keeping the number of hash members low).

> 
> Furthermore, I don't like the `sha1Of` and `md5Of` etc. functions, why not having a static method? e.g. MD5.hexdigest("input"), which returns the hexdigest and not a MD5 object.

Same reason as above, sha1Of, md5Of are just convenience aliases, it could be argued you don't need those at all. You can use 'digest' directly and it will provide exactly the same result. So hash writers don't have to implement anything to support digest, providing an alias is optional and just 1 line of code. A static method would have to be implemented by every hash. Yes you could use mixins to make that painless, but I think it's still to much work. (And mixins can't be documented in DDOC, but that's an unrelated compiler issue)

BTW: it is implemented as a member for the OOP API, as we can just implement it in the base interface, so the actual implementations don't have to care about it.

> 
> One more thing, `.reset` does the same as `start`? If so, why do both exist?
See above. It's because start is used in the template/struct API and structs can't have default constructors, so we need start.

The OOP/class api can use a default constructor. But unlike the start function it can't work as a reset function, so there's an explicit reset function.

>(I also find the examples quite confusing, why do you want to
> reset the hash onject?
My approach was to document how to use the functions, not as much when it makes sense to use them. Maybe I should care more about that?

It's more efficient than allocating a new hash with 'new' (or do you
mean why use reset if finish resets anyway? It's useful if you put some
data into a hash, then decide you don't want the hash (because the
user aborts the operation, network connection is lost,...) but you still
need the hash object later on.) You could just call finish and
disregard the result, but the reset implementation is faster. I agree
there's probably no _strong_ need for reset, but I think it doesn't do
any harm.

>I would documentate the function but wouldn't
> use it in the examples)
> 
> Well, that's it from my side :)

Thanks for your review!

August 07, 2012
Am Tue, 07 Aug 2012 20:46:44 +0200
schrieb deadalnix <deadalnix@gmail.com>:

> You'll find very hard to convince anyone that crc32 is a cryptographic hash function.
And there will hopefully be more hashes in std.hash a some point.

BTW: I also considered splitting hashes into cryptographic and non-cryptographic/checksums. But as we also have some generic parts (currently in std.hash.hash) this would pose the question where to put the generic part? Put it in std.checksum and std.crypto users will complain, put it in std.crypto and std.checksum users won't be happy.

And as was discussed previously what's considered a safe cryptographic hash might change as time goes by.

> And this API is suited for both cryptographic hash and regular hash. Many of them can be added in the future if need is met. I definitively am for std.hash .

We had this package name discussion a few times on the newsgroup and I think on github as well.

I personally don't care about the package name and I'll just choose what the majority thinks is best. Last time it seemed std.hash was the favorite (although hash and digest can probably be used interchangeably)
« First   ‹ Prev
1 2 3 4 5 6 7 8 9 10 11