August 15, 2012
Re: The review of std.hash package
On Wed, Aug 15, 2012 at 2:40 AM, RivenTheMage <riven-mage@id.ru> wrote:
> Another example is systematic error-correcting codes. The
> "only" difference between them and checksums is the ability to
> correct errors, not just detect them. CRC or MD5 can be viewed as
> a systematic code with zero error-correcting ability.
>
> Should we mix Reed-Solomon codes and MD5 in one module? I don't think so.

Some people's point is that MD5 was considered a cryptographic digest
function 16 years ago. It is not considered cryptographically secure
today. So why make any design assumption today about how the landscape
will look tomorrow, especially in a field that is always changing? Why
not lump them all together and explain the current situation and
recommendations in the comments?

Look at Python's passlib module, for example. It enumerates every
password encoding scheme under the sun (except for scrypt :() and gives
a recommendation on the appropriate algorithm to use in the current
computing landscape.
http://packages.python.org/passlib/lib/passlib.hash.html#module-passlib.hash

Thanks,
-Jose
August 15, 2012
Re: The review of std.hash package
On Wednesday, 15 August 2012 at 14:36:00 UTC, José Armando 
García Sancio wrote:
> Some people's point is that MD5 was considered a cryptographic digest
> function 16 years ago. It is not considered cryptographically secure
> today. So why make any design assumption today about how the landscape
> will look tomorrow, especially in a field that is always changing? Why
> not lump them all together and explain the current situation and
> recommendations in the comments?
>
> Look at Python's passlib module, for example. It enumerates every
> password encoding scheme under the sun (except for scrypt :() and gives
> a recommendation on the appropriate algorithm to use in the current
> computing landscape.
> http://packages.python.org/passlib/lib/passlib.hash.html#module-passlib.hash
>
> Thanks,
> -Jose

I agree that MD5 isn't cryptographically secure anymore, but it 
was designed as a cryptographic hash algorithm, and it shows. 
Its statistical and performance properties are completely 
different from those of CRCs, and no matter how broken, it still 
has some cryptographic strength (no practical preimage attack 
has been found to date, for example).

Note that Python's passlib makes no mention of CRC, FNV, 
ROT13, etc. Their place is different.
August 15, 2012
Re: The review of std.hash package
On 15-Aug-12 12:45, Kagamin wrote:
> On Wednesday, 15 August 2012 at 08:25:51 UTC, Dmitry Olshansky wrote:
>> Brrr. It's how convenience wrapper works :)
>>
>> And I totally expect this to call the same code and keep the same
>> state during the work.
>>
>> E.g. see the std.digest.digest functions digest or hexDigest; you could
>> call them stateless in the same vein.
>
> Well there was a wish for stateless hash, Walter even posted the
> required interface:
> auto result = file.byChunk(4096 * 1025).joiner.hash();

auto result = file.byChunk(4096 * 1025).joiner.digest();

and is already supported in the proposal, peek at updated docs.

There is no need for additional methods and whatnot.

-- 
Olshansky Dmitry
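
For readers following along, here is a minimal sketch of the one-liner above, assuming the module as it later shipped in Phobos (`std.digest` and `std.digest.md`) and replacing the file with an in-memory array of chunks so that it is self-contained. The helper name `md5OfChunks` is hypothetical and not part of the proposal.

```d
import std.algorithm.iteration : joiner;
import std.digest : digest, toHexString;
import std.digest.md : MD5, md5Of;
import std.stdio : writeln;

// Hypothetical helper: hash a sequence of chunks as if they were one buffer.
ubyte[16] md5OfChunks(ubyte[][] chunks)
{
    // joiner flattens the chunks into a single range of ubyte;
    // digest!MD5 starts, feeds and finishes the digest in one call.
    return digest!MD5(chunks.joiner);
}

void main()
{
    // Stand-in for file.byChunk(...): two in-memory chunks.
    ubyte[][] chunks = [
        cast(ubyte[]) "The quick brown fox ".dup,
        cast(ubyte[]) "jumps over the lazy dog".dup,
    ];

    auto viaRange = md5OfChunks(chunks);

    // Chunk boundaries do not affect the result.
    assert(viaRange == md5Of("The quick brown fox jumps over the lazy dog"));
    writeln(toHexString(viaRange));
}
```

The caller never sees the digest state here; the state still exists, it just lives inside the convenience function for the duration of the call.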
August 15, 2012
Re: The review of std.hash package
On Wed, Aug 15, 2012 at 8:11 AM, ReneSac <reneduani@yahoo.com.br> wrote:
>
> Note that Python's passlib makes no mention of CRC, FNV, ROT13,
> etc. Their place is different.

That's because it is a "password module", and hardly anyone
uses CRC for password digests. Note that
the Python passlib module also has archaic plaintext encodings, mainly
for interacting with legacy systems.

The basic point is that std.digest/std.hash (whatever people decide)
should probably just have generic digesting algorithms. The user can
decide which one to use given their requirements. Also, it would be
beneficial if the module included a section that recommends
digests based on the current landscape of computing. High-level
documentation and suggestions are easy to change; APIs are not.

Thanks,
-Jose
August 16, 2012
Re: The review of std.hash package
On Wednesday, 15 August 2012 at 19:38:34 UTC, José Armando
García Sancio wrote:

> That's because it is a "password module", and hardly anyone
> uses CRC for password digests.

In turn, that's because CRC is not a cryptographic hash and is
not suited for password hashing :)

> The basic point is that std.digest/std.hash (whatever people 
> decide) should probably just have generic digesting algorithms.

A generic digesting algorithm should probably go into std.algorithm.

It could be used like this:

------------
import std.algorithm;
import std.checksum;   // proposed module
import std.crypto.mdc; // proposed module

ushort num = 1234;
auto hash1 = hash!("(a >>> 20) ^ (a >>> 12) ^ (a >>> 7) ^ (a >>> 4) ^ a")(num); // indexing hash

string str = "abcd";
auto hash2 = hash!(CRC32)(str); // checksum
auto hash3 = hash!(MD5)(str); // cryptographic hash
------------

CRC32 and MD5 would be ranges and/or classes derived from a
HashAlgorithm interface.
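
As a point of comparison, the checksum and cryptographic cases above can be written against the digest API as it later shipped in Phobos. This is only a sketch under that assumption: it uses `std.digest.crc` and `std.digest.md` rather than the proposed `std.checksum` and `std.crypto.mdc` modules, and the `digest!T` template rather than a `hash!T` in std.algorithm.

```d
import std.digest : digest, toHexString;
import std.digest.crc : CRC32, crc32Of;
import std.digest.md : MD5, md5Of;
import std.stdio : writeln;

void main()
{
    string str = "abcd";

    // Same call shape for both algorithms; only the template argument differs.
    auto checksum = digest!CRC32(str);   // ubyte[4]
    auto cryptoHash = digest!MD5(str);   // ubyte[16]

    // The per-algorithm convenience functions give the same results.
    assert(checksum == crc32Of(str));
    assert(cryptoHash == md5Of(str));

    writeln(toHexString(checksum));
    writeln(toHexString(cryptoHash));
}
```

The two algorithms share a call shape without sharing a base class: in this design they are structs that satisfy a compile-time check, not classes derived from a common interface.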
August 16, 2012
Re: The review of std.hash package
On Thursday, 16 August 2012 at 03:02:59 UTC, RivenTheMage wrote:

> ushort num = 1234;
> auto hash1 = hash!("(a >>> 20) ^ (a >>> 12) ^ (a >>> 7) ^ (a >>> 4) ^ a")(num); // indexing hash

I forgot that this case is already covered by reduce!(...)
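
To illustrate how a simple non-cryptographic hash falls out of a fold, here is a minimal sketch. It swaps in 32-bit FNV-1a as the mixing step, not the shift/xor expression from the post, and the function name `fnv1a` is a hypothetical helper. `fold` is the seed-last sibling of `reduce` in std.algorithm.

```d
import std.algorithm.iteration : fold;
import std.stdio : writefln;
import std.string : representation;

// Hypothetical helper: 32-bit FNV-1a expressed as a fold over the bytes.
uint fnv1a(string s)
{
    enum uint offsetBasis = 2166136261u; // standard FNV-1a 32-bit offset basis
    enum uint prime = 16777619u;         // standard FNV 32-bit prime

    // representation yields the raw bytes, avoiding UTF-8 decoding to dchar.
    // Each step xors in one byte, then multiplies (wrapping modulo 2^32).
    return s.representation.fold!((h, b) => (h ^ b) * prime)(offsetBasis);
}

void main()
{
    writefln("%08x", fnv1a("abcd"));
}
```

This works because the whole state fits in the seed and no finishing step is needed, which is exactly what does not hold for block-based digests such as MD5.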
August 16, 2012
Re: The review of std.hash package
Le 09/08/2012 11:48, Johannes Pfau a écrit :
> Am Wed, 08 Aug 2012 12:31:29 -0700
> schrieb Walter Bright<newshound2@digitalmars.com>:
>
>> On 8/8/2012 12:14 PM, Martin Nowak wrote:
>>> That hardly works for event based programming without using
>>> coroutines. It's the classical inversion-of-control dilemma of
>>> event based programming that forces you to save/restore your state
>>> with every event.
>>
>> See the discussion on using reduce().
>>
>
> I just don't understand it. Let's take the example by Martin Nowak and
> port it to reduce: (The code added as comments is the same code for
> hashes, working with the current API)
>
> int state; //Hash state;
>
> void onData(void[] data)
> {
>       state = reduce(state, data); //copy(data, state);
>       //state = copy(data, state); //also valid, but not necessary
>       //state.put(data); //simple way, doesn't work for ranges
> }
>
> void main()
> {
>       state = 0; //state.start();
>       auto stream = new EventTcpStream("localhost", 80);
>       stream.onData = &onData;
>       //auto result = hash.finish();
> }
>
> There are only 2 differences:
>
> 1:
> the order of the arguments passed to copy and reduce is swapped. This
> kinda makes sense (if copy is interpreted as copyTo). Solution: Provide
> a method copyInto with swapped arguments if consistency is really so
> important.
>
> 2:
> We need an additional call to finish. I can't say it often enough, I
> don't see a sane way to avoid it. Hashes work on blocks, if you didn't
> pass enough data finish will have to fill the rest of the block with
> zeros before you can get the hash value. This operation can't be
> undone. To get a valid result with every call to copy, you'd have to
> always call finish. This is
> * inefficient, you calculate intermediate values you don't need at all
> * you have to copy the hash's state, as you can't continue hashing
>    after finish has been called
>
> and both, the state and the result would have to fit into the one value
> (called seed for reduce). But then it's still not 100% consistent, as
> reduce will return a single value, not some struct including internal
> state.

I'm pretty sure it is possible to pad and finish when a result is 
required without messing up the internal state.
August 17, 2012
Re: The review of std.hash package
On Thu, 16 Aug 2012 21:25:55 +0100, deadalnix <deadalnix@gmail.com> wrote:

> Le 09/08/2012 11:48, Johannes Pfau a écrit :
>> Am Wed, 08 Aug 2012 12:31:29 -0700
>> schrieb Walter Bright<newshound2@digitalmars.com>:
>>
>>> On 8/8/2012 12:14 PM, Martin Nowak wrote:
>>>> That hardly works for event based programming without using
>>>> coroutines. It's the classical inversion-of-control dilemma of
>>>> event based programming that forces you to save/restore your state
>>>> with every event.
>>>
>>> See the discussion on using reduce().
>>>
>>
>> I just don't understand it. Let's take the example by Martin Nowak and
>> port it to reduce: (The code added as comments is the same code for
>> hashes, working with the current API)
>>
>> int state; //Hash state;
>>
>> void onData(void[] data)
>> {
>>       state = reduce(state, data); //copy(data, state);
>>       //state = copy(data, state); //also valid, but not necessary
>>       //state.put(data); //simple way, doesn't work for ranges
>> }
>>
>> void main()
>> {
>>       state = 0; //state.start();
>>       auto stream = new EventTcpStream("localhost", 80);
>>       stream.onData = &onData;
>>       //auto result = hash.finish();
>> }
>>
>> There are only 2 differences:
>>
>> 1:
>> the order of the arguments passed to copy and reduce is swapped. This
>> kinda makes sense (if copy is interpreted as copyTo). Solution: Provide
>> a method copyInto with swapped arguments if consistency is really so
>> important.
>>
>> 2:
>> We need an additional call to finish. I can't say it often enough, I
>> don't see a sane way to avoid it. Hashes work on blocks, if you didn't
>> pass enough data finish will have to fill the rest of the block with
>> zeros before you can get the hash value. This operation can't be
>> undone. To get a valid result with every call to copy, you'd have to
>> always call finish. This is
>> * inefficient, you calculate intermediate values you don't need at all
>> * you have to copy the hash's state, as you can't continue hashing
>>    after finish has been called
>>
>> and both, the state and the result would have to fit into the one value
>> (called seed for reduce). But then it's still not 100% consistent, as
>> reduce will return a single value, not some struct including internal
>> state.
>
> I'm pretty sure it is possible to pad and finish when a result is  
> required without messing up the internal state.

Without copying it?  AFAICR padding/finishing mutates the state, I mean,  
that's the whole point of it.

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
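
The point about copying can be made concrete with a small sketch, assuming the digest is a plain value-type struct, as `MD5` is in the Phobos module that later shipped. The helper name `peekDigest` is hypothetical. An intermediate result is obtainable, but only by finishing a copy of the state.

```d
import std.digest.md : MD5, md5Of;

// Hypothetical helper: digest of everything fed so far.
// The parameter is taken by value, so finish() pads and finalizes
// a copy while the caller's state stays untouched.
ubyte[16] peekDigest(MD5 snapshot)
{
    return snapshot.finish();
}

void main()
{
    MD5 running;
    running.start();
    running.put(cast(const(ubyte)[]) "abc");

    // Intermediate result, computed on a copy.
    assert(peekDigest(running) == md5Of("abc"));

    // The original state is still usable and unaffected by the peek.
    running.put(cast(const(ubyte)[]) "def");
    assert(running.finish() == md5Of("abcdef"));
}
```

Each peek pays for a state copy plus the padding and final block computation, which is the inefficiency raised earlier in the thread.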
August 20, 2012
std.hash review: Update 2
Changelog:
* moved the package to std.digest:
   std.hash.hash --> std.digest.digest
   std.hash.md   --> std.digest.md
   std.hash.sha  --> std.digest.sha
   std.hash.crc  --> std.digest.crc

* made sure the docs are consistent regarding names (digest vs. hash)


Code: (location changed!)
https://github.com/jpf91/phobos/tree/newHash/std/digest
https://github.com/jpf91/phobos/compare/master...newHash

Docs: (location changed!)
http://dl.dropbox.com/u/24218791/d/phobos/std_digest_digest.html
http://dl.dropbox.com/u/24218791/d/phobos/std_digest_md.html
http://dl.dropbox.com/u/24218791/d/phobos/std_digest_sha.html
http://dl.dropbox.com/u/24218791/d/phobos/std_digest_crc.html
August 29, 2012
Re: The review of std.hash package
All this discussion on the use of auto in the docs made me notice 
something else about the docs I missed.

I like how ranges are documented and think digest could do the 
same. Instead of an ExampleDigest, just write the details under 
isDigest.

I don't see a need for the template constraint example (it is a common D idiom).

This would require changing examples which use ExampleDigest, but 
maybe that should happen anyway since it doesn't exist.

I don't see a reason to change my vote because of this; it's all 
documentation.
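
For context, the template constraint idiom in question looks roughly like the sketch below, assuming `isDigest` and `DigestType` as they exist in the shipped `std.digest`. The function name `digestBytes` is hypothetical and only illustrates a generic function that accepts any type satisfying `isDigest`.

```d
import std.digest : DigestType, isDigest;
import std.digest.crc : CRC32, crc32Of;
import std.digest.md : MD5, md5Of;

// Hypothetical generic helper: works with any type that satisfies isDigest.
DigestType!Hash digestBytes(Hash)(scope const(ubyte)[] data)
if (isDigest!Hash)
{
    Hash hash;
    hash.start();
    hash.put(data);
    return hash.finish();
}

// The constraint accepts real digests and rejects everything else at compile time.
static assert(isDigest!MD5 && isDigest!CRC32);
static assert(!isDigest!int);
static assert(!__traits(compiles, digestBytes!int(null)));

void main()
{
    auto data = cast(const(ubyte)[]) "abc";
    assert(digestBytes!MD5(data) == md5Of("abc"));
    assert(digestBytes!CRC32(data) == crc32Of("abc"));
}
```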