October 26, 2011
On Tue, 25 Oct 2011 23:45:58 -0700, Jonathan M Davis wrote:

> On Wednesday, October 26, 2011 06:28:52 Steve Teale wrote:
>> > An easy test is that if the interface takes a T[] as input, consider a range instead. Ditto for output. If an interface takes a File as input, it's a red flag that something is wrong.
>> 
>> I have a question about ranges that occurred to me when I was composing a MySQL result set into a random access range.
>> 
>> To do that I have to provide the save capability defined by
>> 
>> R r1;
>> R r2 = r1.save;
>> 
>> Would it harm the use of ranges as elements of operation sequences if the type of the entity that got saved was not the same as the original range provider type. Something like:
>> 
>> R r1;
>> S r2 = r1.save;
>> r1.restore(r2);
>> 
>> For entities with a good deal of state, that implement a range, storing the whole state, and more significantly, restoring it, may be non- trivial, but saving and restoring the state of the range may be quite a lightweight affair.
> 
> It's likely to cause issues if the type returned by save differs from the original type. From your description, it sounds like the range is over something, and it's that something which you don't want to copy. If that's the case, then the range just doesn't contain that data. Rather, it's a view into the data, so saving it just save that view.

Well, yes Jonathan, that's just what I'm getting at. I want to save just those things that are pertinent to the range/view, not the whole panoply of the result set representation or whatever other object is providing the data that the range is a view of.

I could do it by saving an instance of the object representing the result set containing only the relevant data, but that would be misleading for the user, because it would not match the object it was cloned from in any respect other than representing a range.

Should not the requirement for the thing provided by save be only that it should provide the same view as that provided by the source range at the time of the save, and the same range 'interface'.

Is there an accepted term for the way ranges are defined, i.e. as an entity that satisfies some template evaluating roughly to a bool?

Steve

Steve
October 26, 2011
On Wednesday, October 26, 2011 03:17 Steve Teale wrote:
> On Tue, 25 Oct 2011 23:45:58 -0700, Jonathan M Davis wrote:
> > On Wednesday, October 26, 2011 06:28:52 Steve Teale wrote:
> >> > An easy test is that if the interface takes a T[] as input, consider a range instead. Ditto for output. If an interface takes a File as input, it's a red flag that something is wrong.
> >> 
> >> I have a question about ranges that occurred to me when I was composing a MySQL result set into a random access range.
> >> 
> >> To do that I have to provide the save capability defined by
> >> 
> >> R r1;
> >> R r2 = r1.save;
> >> 
> >> Would it harm the use of ranges as elements of operation sequences if the type of the entity that got saved was not the same as the original range provider type. Something like:
> >> 
> >> R r1;
> >> S r2 = r1.save;
> >> r1.restore(r2);
> >> 
> >> For entities with a good deal of state, that implement a range, storing the whole state, and more significantly, restoring it, may be non- trivial, but saving and restoring the state of the range may be quite a lightweight affair.
> > 
> > It's likely to cause issues if the type returned by save differs from the original type. From your description, it sounds like the range is over something, and it's that something which you don't want to copy. If that's the case, then the range just doesn't contain that data. Rather, it's a view into the data, so saving it just save that view.
> 
> Well, yes Jonathan, that's just what I'm getting at. I want to save just those things that are pertinent to the range/view, not the whole panoply of the result set representation or whatever other object is providing the data that the range is a view of.
> 
> I could do it by saving an instance of the object representing the result set containing only the relevant data, but that would be misleading for the user, because it would not match the object it was cloned from in any respect other than representing a range.
> 
> Should not the requirement for the thing provided by save be only that it should provide the same view as that provided by the source range at the time of the save, and the same range 'interface'.
> 
> Is there an accepted term for the way ranges are defined, i.e. as an entity that satisfies some template evaluating roughly to a bool?

Ranges are defined per templates in std.range. isForwardRange, isInputRange, isRandomAccessRange, etc. And it looks like isForwardRange specifically checks that the return type of save is the same type as the range itself. I was thinking that it didn't necessarily require that but that you'd have definite issues having save return a different type, because it's generally assumed that it's the same type and code will reflect that. However, it looks like it's actually required.

What you should probably do is have the result set be an object holding the data but not be a range itself and then overload opSlice to give you a range over that data (as would be done with a container). Then the range isn't in a position where it's trying to own the data, and it's clear to the programmer where the data is stored.

- Jonathan M Davis
October 26, 2011
On Wed, 26 Oct 2011 13:01:14 -0400, Jonathan M Davis wrote:

> On Wednesday, October 26, 2011 03:17 Steve Teale wrote:
>> On Tue, 25 Oct 2011 23:45:58 -0700, Jonathan M Davis wrote:
>> > On Wednesday, October 26, 2011 06:28:52 Steve Teale wrote:
>> >> > An easy test is that if the interface takes a T[] as input, consider a range instead. Ditto for output. If an interface takes a File as input, it's a red flag that something is wrong.
>> >> 
>> >> I have a question about ranges that occurred to me when I was composing a MySQL result set into a random access range.
>> >> 
>> >> To do that I have to provide the save capability defined by
>> >> 
>> >> R r1;
>> >> R r2 = r1.save;
>> >> 
>> >> Would it harm the use of ranges as elements of operation sequences if the type of the entity that got saved was not the same as the original range provider type. Something like:
>> >> 
>> >> R r1;
>> >> S r2 = r1.save;
>> >> r1.restore(r2);
>> >> 
>> >> For entities with a good deal of state, that implement a range, storing the whole state, and more significantly, restoring it, may be non- trivial, but saving and restoring the state of the range may be quite a lightweight affair.
>> > 
>> > It's likely to cause issues if the type returned by save differs from the original type. From your description, it sounds like the range is over something, and it's that something which you don't want to copy. If that's the case, then the range just doesn't contain that data. Rather, it's a view into the data, so saving it just save that view.
>> 
>> Well, yes Jonathan, that's just what I'm getting at. I want to save just those things that are pertinent to the range/view, not the whole panoply of the result set representation or whatever other object is providing the data that the range is a view of.
>> 
>> I could do it by saving an instance of the object representing the result set containing only the relevant data, but that would be misleading for the user, because it would not match the object it was cloned from in any respect other than representing a range.
>> 
>> Should not the requirement for the thing provided by save be only that it should provide the same view as that provided by the source range at the time of the save, and the same range 'interface'.
>> 
>> Is there an accepted term for the way ranges are defined, i.e. as an entity that satisfies some template evaluating roughly to a bool?
> 
> Ranges are defined per templates in std.range. isForwardRange, isInputRange, isRandomAccessRange, etc. And it looks like isForwardRange specifically checks that the return type of save is the same type as the range itself. I was thinking that it didn't necessarily require that but that you'd have definite issues having save return a different type, because it's generally assumed that it's the same type and code will reflect that. However, it looks like it's actually required.

But is it just required by std.range, or is it required at some theoretical level by algorithms that can work with ranges as inputs and outputs?

> What you should probably do is have the result set be an object holding the data but not be a range itself and then overload opSlice to give you a range over that data (as would be done with a container). Then the range isn't in a position where it's trying to own the data, and it's clear to the programmer where the data is stored.
> 

But in my mind, just from a quick reading of the language definition, slices are closely tied to arrays, which as you have already noted, are not the best example of a range.

Are ranges just a library artefact, or are they supported by the language. They appear to be recognized by foreach, so if they are recognized by D, then we should presumably have operator functions opInputRange, opForwardRange, and so on, whatever those terms might mean.

If not, then facilities like std.algorithm should have a warning notice like cigarettes that says "this facility may not be usable with POD".

Steve


October 26, 2011
On Wednesday, October 26, 2011 19:00:02 Steve Teale wrote:
> On Wed, 26 Oct 2011 13:01:14 -0400, Jonathan M Davis wrote:
> > Ranges are defined per templates in std.range. isForwardRange, isInputRange, isRandomAccessRange, etc. And it looks like isForwardRange specifically checks that the return type of save is the same type as the range itself. I was thinking that it didn't necessarily require that but that you'd have definite issues having save return a different type, because it's generally assumed that it's the same type and code will reflect that. However, it looks like it's actually required.
> 
> But is it just required by std.range, or is it required at some theoretical level by algorithms that can work with ranges as inputs and outputs?

It complicates things to allow for the range returned by save not be the same type as the original range. It's probably possible to alter isForwardRange to allow save to return a different type, but I don't know what all of the side effects of such a decision would be. I'm certain however that it would break a fair bit of code.

> > What you should probably do is have the result set be an object holding the data but not be a range itself and then overload opSlice to give you a range over that data (as would be done with a container). Then the range isn't in a position where it's trying to own the data, and it's clear to the programmer where the data is stored.
> 
> But in my mind, just from a quick reading of the language definition, slices are closely tied to arrays, which as you have already noted, are not the best example of a range.
> 
> Are ranges just a library artefact, or are they supported by the language. They appear to be recognized by foreach, so if they are recognized by D, then we should presumably have operator functions opInputRange, opForwardRange, and so on, whatever those terms might mean.

foreach supports the API of an input range: front, empty, and popFront. There's no need for overloaded operators of any kind. If you want to overload an operator for foreach, then use opApply. foreach is the _only_ place in the language that specfically supports ranges.

Arrays are a poor example of ranges because they don't have an obviously associated container. A dynamic array is a range over a block of memory that the runtime maintains. With a full-blown container such as Array or RedBlackTree, it's the container that holds the data, and a range for one of them is a range over that container. People tend to think of arrays as being containers rather than just slices, so it becomes confusing to people.

It's much easier to properly understand ranges if you think of them as being associated with a container like an iterator would be in C++. They don't normally own the elements that they're iterating over. When they do, you get situations such as input ranges where you can't save them (an input stream being a prime example).

> If not, then facilities like std.algorithm should have a warning notice like cigarettes that says "this facility may not be usable with POD".

I'm afraid that I don't understand why a warning would be necessary. std.range lists the functions that each type of range must implement and it has templates for testing whether a particular type implements those functions for a given sort of range. I suspect that you're misunderstanding something fundamental about ranges.

If you haven't read this article yet - http://www.informit.com/articles/printerfriendly.aspx?p=1407357 - I suggest that you do. It doesn't entirely match what std.range does (at least as far as naming goes) but it does explain the core concepts fairly well and as far as I know is the best explanation on ranges that currently exists.

- Jonathan M Davis
October 29, 2011
On 10/24/2011 05:10 PM, Piotr Szturmaj wrote:
> https://github.com/pszturmaj/phobos/tree/master/std/crypto
>
> This is some early work on std.crypto proposal. Currently only MD5, HMAC
> and all SHA family functions (excluding SHA0 which is very old, broken
> and no longer in use). I plan to add other crypto primitives later.
>
> I know about one SHA1 pull request optimized for SSSE3. I think native
> code must be there to support other non x86 CPUs and SIMD optimization
> may be added at any time later.
>
> Any opinions are welcome. Especially if such design is good or bad, and
> what needs to be changed.
>
> Thanks :)

Are you re-implementing the function kernels your self or are you using an existing implementation? Given what I've heard about things like side-channel attacks using processing times to recover keys, I'd rather not see Phobos use anything written by less than the best expert available.
November 04, 2011
bcs wrote:
> Are you re-implementing the function kernels your self or are you using
> an existing implementation? Given what I've heard about things like
> side-channel attacks using processing times to recover keys, I'd rather
> not see Phobos use anything written by less than the best expert available.

Until now, I implemented some hash functions. There are no branching instructions in their transform() routines, so theoretically processing time is independent of the function input.
November 05, 2011
On 11/04/2011 04:27 AM, Piotr Szturmaj wrote:
> bcs wrote:
>> Are you re-implementing the function kernels your self or are you using
>> an existing implementation? Given what I've heard about things like
>> side-channel attacks using processing times to recover keys, I'd rather
>> not see Phobos use anything written by less than the best expert
>> available.
>
> Until now, I implemented some hash functions. There are no branching
> instructions in their transform() routines, so theoretically processing
> time is independent of the function input.

From my very incomplete memory I found the source I was looking for (I googled for "aes interperative dance") here http://www.moserware.com/2009/09/stick-figure-guide-to-advanced.html Look for "Foot-Shooting Prevention Agreement" in one of the images ~20-25% of the way down.

tl;dr; It mentions "cache-based, timing, and other side channel attacks". Unless you can explain to me what those are, in painful detail, I don't think we should trust you to avoid them. Get a good vetted C implementation and wrap it with a nice D API and call it a day.
November 05, 2011
On 11/4/2011 7:52 PM, bcs wrote:
> tl;dr; It mentions "cache-based, timing, and other side channel attacks". Unless
> you can explain to me what those are, in painful detail, I don't think we should
> trust you to avoid them. Get a good vetted C implementation and wrap it with a
> nice D API and call it a day.


You've got a good point. While I'd like to see native D implementations, crypto security is such a big issue we'd probably be better off with your suggestion.
November 20, 2011
bcs wrote:
> On 11/04/2011 04:27 AM, Piotr Szturmaj wrote:
>> bcs wrote:
>>> Are you re-implementing the function kernels your self or are you using
>>> an existing implementation? Given what I've heard about things like
>>> side-channel attacks using processing times to recover keys, I'd rather
>>> not see Phobos use anything written by less than the best expert
>>> available.
>>
>> Until now, I implemented some hash functions. There are no branching
>> instructions in their transform() routines, so theoretically processing
>> time is independent of the function input.
>
>  From my very incomplete memory I found the source I was looking for (I
> googled for "aes interperative dance") here
> http://www.moserware.com/2009/09/stick-figure-guide-to-advanced.html
> Look for "Foot-Shooting Prevention Agreement" in one of the images
> ~20-25% of the way down.
>
> tl;dr; It mentions "cache-based, timing, and other side channel
> attacks". Unless you can explain to me what those are, in painful
> detail, I don't think we should trust you to avoid them. Get a good
> vetted C implementation and wrap it with a nice D API and call it a day.

Sorry for late answer.

I didn't implement AES yet, but it can be implemented without lookup tables (countermeasure to cache-timing attacks).

Branch-free code without secret-data dependent memory accesses is free from cache-timing vurnelabilities on most architectures. The remaining are architectures where instruction execution times (cycles) may depend on data (f.i. multiply by 0 may take 1 cycle, and >2 cycles when multiplying by other numbers). I don't know any concrete examples but I realize such cycle-varying instructions _may_ exist. Of course such instructions must be avoided in secure code.

The most hard to avoid side channel attack is Power Analysis. On almost all CPUs instruction power drain depends on the operands. Again multiplying by 0 consume less power than multiplying by other numbers and that may be distinguished on the oscilloscope. There are some countermeasures including operand blinding, but not all crypto algorithms support that.

I only implemented some hashing functions: MD5, SHA1/224/256/384/512 and Tiger1/2. MD5 and SHA have code that is branch-free and does not use any data dependent lookups so their execution time should be constant, making timing attacks harmless.

Unfortunately Tiger function does lookups based on the source bytes and is vurnelable to cache-timing attacks. This may be considered insecure and removed (openssl does not support it either) as making it secure may be not worth it.

The only problem that remain is the power analysis which require physical access to the computer. We need to ask ourselfs to what degree we must secure our code. I'm going to argue that no single implementation is secure to power analysis on all architectures. I think we must stick to cache/branch timing level and forget about power analysis as its scope is beyond D's specification. We simply can't avoid most power analysis attacks because most of the countermeasures belong to the hardware instead of the software level.
November 22, 2011
On 11/20/2011 08:10 AM, Piotr Szturmaj wrote:
> bcs wrote:
>> On 11/04/2011 04:27 AM, Piotr Szturmaj wrote:
>>> bcs wrote:
>>>> Are you re-implementing the function kernels your self or are you using
>>>> an existing implementation? Given what I've heard about things like
>>>> side-channel attacks using processing times to recover keys, I'd rather
>>>> not see Phobos use anything written by less than the best expert
>>>> available.
>>>
>>> Until now, I implemented some hash functions. There are no branching
>>> instructions in their transform() routines, so theoretically processing
>>> time is independent of the function input.
>>
>> From my very incomplete memory I found the source I was looking for (I
>> googled for "aes interperative dance") here
>> http://www.moserware.com/2009/09/stick-figure-guide-to-advanced.html
>> Look for "Foot-Shooting Prevention Agreement" in one of the images
>> ~20-25% of the way down.
>>
>> tl;dr; It mentions "cache-based, timing, and other side channel
>> attacks". Unless you can explain to me what those are, in painful
>> detail, I don't think we should trust you to avoid them. Get a good
>> vetted C implementation and wrap it with a nice D API and call it a day.
>
> Sorry for late answer.

Np, but I still have a number of concerns:

What is the advantage to implementing the kernels of any of these functions in D? Will the code be faster? Smaller? More secure? More maintainable? (On the other hand, the value of doing the API code in D goes with no debate.)

How many people in the D community have the experience and know-how to review the security of an implementation? If there are less than 2 or 3 people who can do that, can we afford to include native kernels? We can't have just one and if we have only two and one leaves for some reason the code becomes un-maintainable for lack of a reviewer. *I* wouldn't be comfortable with less than about 4-5.

>
> I didn't implement AES yet, but it can be implemented without lookup
> tables (countermeasure to cache-timing attacks).
>
> Branch-free code without secret-data dependent memory accesses is free
> from cache-timing vurnelabilities on most architectures. The remaining
> are architectures where instruction execution times (cycles) may depend
> on data (f.i. multiply by 0 may take 1 cycle, and >2 cycles when
> multiplying by other numbers). I don't know any concrete examples but I
> realize such cycle-varying instructions _may_ exist. Of course such
> instructions must be avoided in secure code.
>
> The most hard to avoid side channel attack is Power Analysis. On almost
> all CPUs instruction power drain depends on the operands. Again
> multiplying by 0 consume less power than multiplying by other numbers
> and that may be distinguished on the oscilloscope. There are some
> countermeasures including operand blinding, but not all crypto
> algorithms support that.
>
> I only implemented some hashing functions: MD5, SHA1/224/256/384/512 and
> Tiger1/2. MD5 and SHA have code that is branch-free and does not use any
> data dependent lookups so their execution time should be constant,
> making timing attacks harmless.
>
> Unfortunately Tiger function does lookups based on the source bytes and
> is vurnelable to cache-timing attacks. This may be considered insecure
> and removed (openssl does not support it either) as making it secure may
> be not worth it.
>
> The only problem that remain is the power analysis which require
> physical access to the computer. We need to ask ourselfs to what degree
> we must secure our code. I'm going to argue that no single
> implementation is secure to power analysis on all architectures. I think
> we must stick to cache/branch timing level and forget about power
> analysis as its scope is beyond D's specification. We simply can't avoid
> most power analysis attacks because most of the countermeasures belong
> to the hardware instead of the software level.