October 25, 2011 Early std.crypto
https://github.com/pszturmaj/phobos/tree/master/std/crypto
This is some early work on std.crypto proposal. Currently only MD5, HMAC and all SHA family functions (excluding SHA0 which is very old, broken and no longer in use). I plan to add other crypto primitives later.
I know about one SHA1 pull request optimized for SSSE3. I think native code must be there to support other non x86 CPUs and SIMD optimization may be added at any time later.
Any opinions are welcome. Especially if such design is good or bad, and what needs to be changed.
Thanks :)
October 25, 2011 Re: Early std.crypto
Posted in reply to Piotr Szturmaj
On Tue, 25 Oct 2011 02:10:49 +0200, Piotr Szturmaj <bncrbme@jadamspam.pl> wrote:
> https://github.com/pszturmaj/phobos/tree/master/std/crypto
>
> This is some early work on std.crypto proposal. Currently only MD5, HMAC and all SHA family functions (excluding SHA0 which is very old, broken and no longer in use). I plan to add other crypto primitives later.
>
> I know about one SHA1 pull request optimized for SSSE3. I think native code must be there to support other non x86 CPUs and SIMD optimization may be added at any time later.
>
> Any opinions are welcome. Especially if such design is good or bad, and what needs to be changed.
>
> Thanks :)
Great to push this a little.
I have to say though that I like the current struct based interface
much better.
struct Hash
{
    // enhanced by some compile time traits
    enum hashLength = 16;
    enum blockLength = 0;
    // three interface functions
    void start();
    void update(const(ubyte)[] data);
    void finish(ref ubyte[hashLength] digest);
}
You wouldn't need the save, restore functions.
Some unnecessary allocations could go away.
Most important instances would have less mutable state.
You could probably parameterize a Merkle-Damgård base with free
functions for the transformation.
A dynamic interface can be obtained by templated instances similar to what std.range does.
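To make the struct-based proposal above concrete, here is a minimal sketch of how such an interface could be consumed generically. Everything in it is an assumption for illustration: `ToySum` is a deliberately non-cryptographic stand-in, and `isHash` and `computeHash` are hypothetical names that appear neither in the proposal nor in Phobos.

```d
/// Toy, NON-cryptographic hash; it only demonstrates the interface shape.
struct ToySum
{
    // compile-time traits
    enum hashLength = 4;  // digest size in bytes
    enum blockLength = 1; // input is consumed byte by byte

    private uint acc;

    void start() { acc = 0; }

    void update(const(ubyte)[] data)
    {
        foreach (b; data)
            acc = acc * 31 + b;
    }

    void finish(ref ubyte[hashLength] digest)
    {
        // store the accumulator in little-endian byte order
        foreach (i; 0 .. hashLength)
            digest[i] = cast(ubyte)(acc >> (8 * i));
    }
}

/// Hypothetical trait: true if H offers start/update/finish and hashLength.
enum isHash(H) = is(typeof({
    H h;
    ubyte[H.hashLength] d;
    h.start();
    h.update((const(ubyte)[]).init);
    h.finish(d);
}));

/// Hypothetical one-shot helper; the digest lives on the stack, no 'new'.
ubyte[H.hashLength] computeHash(H)(const(ubyte)[] data)
if (isHash!H)
{
    H h;
    ubyte[H.hashLength] digest;
    h.start();
    h.update(data);
    h.finish(digest);
    return digest;
}

unittest
{
    static assert(isHash!ToySum);
    ubyte[4] expected = [33, 0, 0, 0]; // (0 * 31 + 1) * 31 + 2 == 33
    assert(computeHash!ToySum([1, 2]) == expected);
}
```

Because `hashLength` is an enum, the digest size is part of the type and the result can be returned as a fixed-size array. That is the trade-off against runtime-sized hashes such as the CubeHash case raised later in the thread.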
October 25, 2011 Re: Early std.crypto
Posted in reply to Martin Nowak
Martin Nowak wrote:
> I have to say though that I like the current struct based interface
> much better.
>
> struct Hash
> {
> // enhanced by some compile time traits
> enum hashLength = 16;
> enum blockLength = 0;
The reason why hash and block length are runtime variables is that some hash functions are parametrized with variables of great amplitude, for example CubeHash may have any number of rounds, and any size of block and hash output.
> // three interface functions
> void start();
> void update(const(ubyte)[] data);
> void finish(ref ubyte[hashLength] digest);
> }
There, it is:
reset();
put();
finish();
The put() function makes hash implementation an OutputRange.
> You wouldn't need the save, restore functions.
They're not needed. They only serve as speed optimization when hashing many messages which have the same beginning block. This is used in HMAC, which is:
HMAC(func, key, message) = func(key ^ opad, func(key ^ ipad, message));
When func supports saving the IV, the first parts are precomputed; when it does not, HMAC resorts to full hashing. This optimization is also mentioned in the HMAC spec.
> Some unnecessary allocations could go away.
> Most important instances would have less mutable state.
Could you specify which ones, please?
> You could probably parameterize a Merkle Damgård base with free
> functions for the transformation.
What would be the difference from current class parametrization?
> A dynamic interface can be obtained by templated instances similar to
> what std.range does.
Could you elaborate? I don't know exactly what you mean. Function templates?
Thanks a lot!
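The HMAC precomputation described above can be sketched as follows. This is only an illustration under stated assumptions: it uses the `MD5` struct from present-day `std.digest.md` rather than the classes in the proposed std.crypto, and `HmacMd5` and `mac` are hypothetical names. Because `MD5` is a value type, copying a context stands in for saving the intermediate state.

```d
import std.digest.md : MD5;

/// Hypothetical HMAC-MD5 that absorbs the padded key only once.
struct HmacMd5
{
    enum blockLength = 64; // MD5 block size in bytes

    private MD5 inner; // state after absorbing key ^ ipad
    private MD5 outer; // state after absorbing key ^ opad

    this(const(ubyte)[] key)
    {
        ubyte[blockLength] k = 0;
        if (key.length > blockLength)
        {
            // keys longer than one block are hashed first
            MD5 h;
            h.start();
            h.put(key);
            const d = h.finish();
            k[0 .. d.length] = d[];
        }
        else
            k[0 .. key.length] = key[];

        ubyte[blockLength] ipad, opad;
        foreach (i; 0 .. blockLength)
        {
            ipad[i] = cast(ubyte)(k[i] ^ 0x36);
            opad[i] = cast(ubyte)(k[i] ^ 0x5c);
        }
        inner.start();
        inner.put(ipad[]);
        outer.start();
        outer.put(opad[]);
    }

    /// Each call starts from copies of the precomputed contexts.
    ubyte[16] mac(const(ubyte)[] message)
    {
        MD5 ctx = inner; // plain copy restores the saved state
        ctx.put(message);
        const innerDigest = ctx.finish();
        MD5 result = outer;
        result.put(innerDigest[]);
        return result.finish();
    }
}

unittest
{
    import std.ascii : LetterCase;
    import std.digest : toHexString;
    import std.string : representation;

    // HMAC-MD5 test case 2 from RFC 2202
    auto h = HmacMd5("Jefe".representation);
    auto tag = h.mac("what do ya want for nothing?".representation);
    assert(toHexString!(LetterCase.lower)(tag)
        == "750c783e6ab0b503eaa86e310a5db738");
}
```

The copy moves the whole context, not only the chaining value, so it trades a few extra bytes per message for a simpler API.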
October 25, 2011 Re: Early std.crypto
Posted in reply to Piotr Szturmaj
On Tue, 25 Oct 2011 09:43:48 +0200, Piotr Szturmaj <bncrbme@jadamspam.pl> wrote:
> Martin Nowak wrote:
>> I have to say though that I like the current struct based interface
>> much better.
>>
>> struct Hash
>> {
>> // enhanced by some compile time traits
>> enum hashLength = 16;
>> enum blockLength = 0;
>
> The reason why hash and block length are runtime variables is that some hash functions are parametrized with variables of great amplitude, for example CubeHash may have any number of rounds, and any size of block and hash output.
>
>> // three interface functions
>> void start();
>> void update(const(ubyte)[] data);
>> void finish(ref ubyte[hashLength] digest);
>> }
>
> There, it is:
>
> reset();
> put();
> finish();
>
Reset does two different things depending on the internal state. Not so good.
> The put() function makes hash implementation an OutputRange.
>
>> You wouldn't need the save, restore functions.
>
> They're not needed. They only serve as speed optimization when hashing many messages which have the same beginning block. This is used in HMAC, which is:
>
> HMAC(func, key, message) = func(key ^ opad, func(key ^ ipad, message));
>
> when func supports saving the IV, the first parts are precomputed, when not HMAC resorts to full hashing. This optimization is also mentioned in HMAC spec.
>
If hash contexts were value types you could simply do:
auto saved = hash_ctx;
Or alternatively one could add a 'save()' function to an isSaveableHash(H) concept.
>> Some unnecessary allocations could go away.
>> Most important instances would have less mutable state.
>
> Could you specify which ones, please?
>
Basically every 'new' in std.hash.crypto.base but especially the ones in hash(T) and hashToHex(T).
>> You could probably parameterize a Merkle Damgård base with free
>> functions for the transformation.
>
> What would be the difference from current class parametrization?
>
Just wanted to point out a specific alternative if code reuse is a concern. If not using classes you need a way to inject the transformation, which could be done like this:
alias MerkleDamgard!(uint, 5, 80, 16, 20, sha1Transform) SHA1;
>> A dynamic interface can be obtained by templated instances similar to
>> what std.range does.
>
> Could you elaborate? I don't know exactly what you mean. Function templates?
>
http://www.digitalmars.com/d/2.0/phobos/std_range.html#InputRange
DynamicAllocatorTemplate at https://github.com/dsimcha/TempAlloc/blob/master/std/allocators/allocator.d
These are to support cases where you either want a stable ABI or have a template firewall for scalability issues (e.g. could be sensible for the HMAC implementation although not really necessary).
> Thanks a lot!
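The idea of injecting the transformation as a free function can be sketched with an alias template parameter. This is a toy under stated assumptions: the post does not say what the `MerkleDamgard` parameters mean, so `BlockHash`, its parameter list, and `toyTransform` are all hypothetical. The padding is simplified and nothing here is secure.

```d
/// Hypothetical block-buffering skeleton; `transform` is injected at compile time.
struct BlockHash(Word, size_t stateWords, size_t blockBytes, alias transform)
{
    private Word[stateWords] state;
    private ubyte[blockBytes] buffer;
    private size_t filled;

    void start()
    {
        state[] = 0;
        filled = 0;
    }

    void put(const(ubyte)[] data)
    {
        foreach (b; data)
        {
            buffer[filled++] = b;
            if (filled == blockBytes)
            {
                transform(state, buffer); // one full block is ready
                filled = 0;
            }
        }
    }

    /// Toy padding: zero-fill the last block. Real designs also encode the length.
    Word[stateWords] finish()
    {
        buffer[filled .. $] = 0;
        transform(state, buffer);
        return state;
    }
}

/// Toy compression function (not SHA-1, not secure).
void toyTransform(ref uint[2] state, ref const ubyte[4] block)
{
    foreach (i, b; block)
        state[i % 2] = state[i % 2] * 31 + b;
}

alias ToyHash = BlockHash!(uint, 2, 4, toyTransform);

unittest
{
    ToyHash oneShot, chunked;
    oneShot.start();
    oneShot.put([1, 2, 3, 4, 5]);
    chunked.start();
    chunked.put([1, 2]);
    chunked.put([3, 4, 5]);
    // buffering makes the result independent of how the input is split
    assert(oneShot.finish() == chunked.finish());
}
```

The skeleton stays a plain struct, so `auto saved = ctx;` copies the full context, and the transform call is resolved at compile time without virtual dispatch.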
October 25, 2011 Re: Early std.crypto
Posted in reply to Piotr Szturmaj
On 10/24/2011 5:10 PM, Piotr Szturmaj wrote:
> https://github.com/pszturmaj/phobos/tree/master/std/crypto
>
> This is some early work on std.crypto proposal. Currently only MD5, HMAC and all SHA family functions (excluding SHA0 which is very old, broken and no longer in use). I plan to add other crypto primitives later.
>
> I know about one SHA1 pull request optimized for SSSE3. I think native code must be there to support other non x86 CPUs and SIMD optimization may be added at any time later.
>
> Any opinions are welcome. Especially if such design is good or bad, and what needs to be changed.
>
> Thanks :)
A key element to a lot of crypto code is speed. I really don't think we want to re-invent all the optimizations on all the platforms. To that end, I really suggest that we stick to wrapping existing implementations, like openssl. While I hate the openssl apis, I do respect the continual effort that various companies invest in optimizing the code.
My 2 cents,
Brad
October 25, 2011 Re: Early std.crypto
Posted in reply to Brad Roberts
Brad Roberts wrote:
> On 10/24/2011 5:10 PM, Piotr Szturmaj wrote:
>> https://github.com/pszturmaj/phobos/tree/master/std/crypto
>>
>> This is some early work on std.crypto proposal. Currently only MD5, HMAC and all SHA family functions (excluding SHA0
>> which is very old, broken and no longer in use). I plan to add other crypto primitives later.
>>
>> I know about one SHA1 pull request optimized for SSSE3. I think native code must be there to support other non x86 CPUs
>> and SIMD optimization may be added at any time later.
>>
>> Any opinions are welcome. Especially if such design is good or bad, and what needs to be changed.
>>
>> Thanks :)
>
> A key element to a lot of crypto code is speed. I really don't think we want to re-invent all the optimizations on all
> the platforms. To that end, I really suggest that we stick to wrapping existing implementations, like openssl. While I
> hate the openssl apis, I do respect the continual effort that various companies invest in optimizing the code.
You are of course right about speed but there are some reasons for having our own code _if_ we want std.crypto:
1. Phobos independence
2. Non D friendly API of openssl
3. No need to link with openssl to compute a simple hash.
4. Licensing
We also have some options for speed improvement, while retaining our API:
1. Wrap openssl _on demand_. This is transparent to the user; the API doesn't change.
2. Much (if not all) of the openssl asm code may be obtained under the CRYPTOGAMS license (BSD). But reading http://www.openssl.org/~appro/cryptogams/ suggests that the author might be willing to license it under Boost.
3. Adapt Crypto++ asm code, which is public domain (x86/64 only).
The first one should be easy, and the user would have a choice between openssl wrapped within std.crypto and direct access to etc.c.openssl.
October 25, 2011 Re: Early std.crypto
Posted in reply to Piotr Szturmaj
On 10/24/2011 5:10 PM, Piotr Szturmaj wrote:
> https://github.com/pszturmaj/phobos/tree/master/std/crypto
>
> This is some early work on std.crypto proposal. Currently only MD5, HMAC and all
> SHA family functions (excluding SHA0 which is very old, broken and no longer in
> use). I plan to add other crypto primitives later.
>
> I know about one SHA1 pull request optimized for SSSE3. I think native code must
> be there to support other non x86 CPUs and SIMD optimization may be added at any
> time later.
>
> Any opinions are welcome. Especially if such design is good or bad, and what
> needs to be changed.
Thanks for championing this.
The input to the functions should be a range, not an array (although an array is a range).
In general, for Phobos, all arbitrary input data should be in the form of ranges, and all arbitrary output data should present itself as a range. This facilitates the idea of:
range => algorithm => range
So, for example, I want to encrypt and then zip a file and send the output to a socket:
file => encrypt => compress => socket
All the components here will just "snap" together. With the existing design of crypto, I'd have to read the file into an array, then pass the array to encrypt, etc.
Think of it like the filter concept in Unix that has been so successful.
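The "range => algorithm => range" idea above can be sketched with lazy ranges. This is a toy under stated assumptions: `xorWith` stands in for a real cipher and `checksum` for a real hash, both names are hypothetical, and neither is secure. The point is only that each stage accepts any input range of bytes, so the stages compose without an intermediate array.

```d
import std.algorithm.iteration : map;
import std.range.primitives : ElementType, isInputRange;

/// Toy "cipher": lazily XORs every byte with a fixed key.
auto xorWith(R)(R input, ubyte key)
if (isInputRange!R && is(ElementType!R : ubyte))
{
    return input.map!(b => cast(ubyte)(b ^ key));
}

/// Toy checksum that consumes any input range of bytes.
uint checksum(R)(R input)
if (isInputRange!R && is(ElementType!R : ubyte))
{
    uint sum = 0;
    foreach (b; input)
        sum = sum * 31 + b;
    return sum;
}

unittest
{
    import std.algorithm.comparison : equal;

    ubyte[] plain = [1, 2, 3, 250];
    // source => encrypt => decrypt, evaluated lazily element by element
    assert(plain.xorWith(0x5A).xorWith(0x5A).equal(plain));
    // the same consumer accepts an array or a lazy pipeline
    assert(plain.xorWith(0).checksum() == plain.checksum());
}
```

An array still works as input because an array is itself a range, and a file or socket can join the same pipeline once it is exposed as a range.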
October 25, 2011 Re: Early std.crypto
Posted in reply to Martin Nowak
Martin Nowak wrote:
> On Tue, 25 Oct 2011 09:43:48 +0200, Piotr Szturmaj
> <bncrbme@jadamspam.pl> wrote:
>
>> Martin Nowak wrote:
>>> I have to say though that I like the current struct based interface
>>> much better.
>>>
>>> struct Hash
>>> {
>>> // enhanced by some compile time traits
>>> enum hashLength = 16;
>>> enum blockLength = 0;
>>
>> The reason why hash and block length are runtime variables is that
>> some hash functions are parametrized with variables of great
>> amplitude, for example CubeHash may have any number of rounds, and any
>> size of block and hash output.
>>
>>> // three interface functions
>>> void start();
>>> void update(const(ubyte)[] data);
>>> void finish(ref ubyte[hashLength] digest);
>>> }
>>
>> There, it is:
>>
>> reset();
>> put();
>> finish();
>>
> Reset does two different things depending on the internal state. Not so
> good.
I think that's negligible, but it may be "unbranched" easily.
>> The put() function makes hash implementation an OutputRange.
>>
>>> You wouldn't need the save, restore functions.
>>
>> They're not needed. They only serve as speed optimization when hashing
>> many messages which have the same beginning block. This is used in
>> HMAC, which is:
>>
>> HMAC(func, key, message) = func(key ^ opad, func(key ^ ipad, message));
>>
>> when func supports saving the IV, the first parts are precomputed,
>> when not HMAC resorts to full hashing. This optimization is also
>> mentioned in HMAC spec.
>>
> If hash contexts were value types you could simply do:
> auto saved = hash_ctx;
> Or alternatively one could add a 'save()' function to an isSaveableHash(H)
> concept.
Yes, I thought about that, but the current way is faster because only the initialization vector is saved, and the API name emphasises it. A general save() on SHA512 would need to copy the IV and 80 ulongs (internal state).
>>> Some unnecessary allocations could go away.
>>> Most important instances would have less mutable state.
>>
>> Could you specify which ones, please?
>>
> Basically every 'new' in std.hash.crypto.base but especially the ones in
> hash(T) and hashToHex(T).
Yes, this will be fixed.
>>> You could probably parameterize a Merkle Damgård base with free
>>> functions for the transformation.
>>
>> What would be the difference from current class parametrization?
>>
> Just wanted to point out a specific alternative if code reuse is a
> concern.
> If not using classes you need a way to inject the transformation, which
> could be done like this:
> alias MerkleDamgard!(uint, 5, 80, 16, 20, sha1Transform) SHA1;
Either way this function must be written, either as a free function or as a class method. I don't think anyone would use the transformation function directly.
>>> A dynamic interface can be obtained by templated instances similar to
>>> what std.range does.
>>
>> Could you elaborate? I don't know exactly what you mean. Function
>> templates?
>>
> http://www.digitalmars.com/d/2.0/phobos/std_range.html#InputRange
> DynamicAllocatorTemplate at
> https://github.com/dsimcha/TempAlloc/blob/master/std/allocators/allocator.d
>
> These are to support cases where you either want a stable ABI or
> have a template firewall for scalability issues (e.g. could be sensible
> for the HMAC implementation although not really necessary).
>
I will look into it. Thanks!
October 25, 2011 Re: Early std.crypto
Posted in reply to Walter Bright
Walter Bright wrote:
> On 10/24/2011 5:10 PM, Piotr Szturmaj wrote:
>> https://github.com/pszturmaj/phobos/tree/master/std/crypto
>>
>> This is some early work on std.crypto proposal. Currently only MD5,
>> HMAC and all
>> SHA family functions (excluding SHA0 which is very old, broken and no
>> longer in
>> use). I plan to add other crypto primitives later.
>>
>> I know about one SHA1 pull request optimized for SSSE3. I think native
>> code must
>> be there to support other non x86 CPUs and SIMD optimization may be
>> added at any
>> time later.
>>
>> Any opinions are welcome. Especially if such design is good or bad,
>> and what
>> needs to be changed.
>
>
> Thanks for championing this.
>
> The input to the functions should be a range, not an array (although an
> array is a range).
>
> In general, for Phobos, all arbitrary input data should be in the form
> of ranges, and all arbitrary output data should present itself as a
> range. This facilitates the idea of:
>
> range => algorithm => range
>
> So, for example, I want to encrypt and then zip a file and send the
> output to a socket:
>
> file => encrypt => compress => socket
>
> All the components here will just "snap" together. With the existing
> design of crypto, I'd have to read the file into an array, then pass the
> array to encrypt, etc.
>
> Think of it like the filter concept in Unix that has been so successful.
>
I share your opinion. I was thinking about such a filter concept for std.crypto.cipher (TBD), but I will also try to convert the current hash function code to ranges.
Thanks for pointing that out.
October 25, 2011 Re: Early std.crypto
Posted in reply to Piotr Szturmaj
On 10/25/2011 3:40 PM, Piotr Szturmaj wrote:
> I share your opinion. I was thinking about such filter concept for
> std.crypto.cipher (TBD), but I will also try to convert current hash function
> code to ranges.
>
> Thanks for pointing that out.
Andrei and I have pretty much failed at articulating this vision for Phobos. We need to get our act together.
Copyright © 1999-2021 by the D Language Foundation