May 04, 2014
On 04/05/14 21:26, Joseph Rushton Wakeling via Digitalmars-d wrote:
> On 04/05/14 20:56, Nick Sabalausky via Digitalmars-d wrote:
>> So I think it's probably safe to figure this is a uniform distribution unless
>> some expert chimes in and says otherwise.
>>
>> Thanks for the help.
>
> You're very welcome.  Keep me posted on how things go with your implementation! :-)

I should also be thanking you -- this reminded me that I needed to make some key changes to my std.random2 proposal, including this stricter isUniformRNG template:
https://github.com/WebDrake/std.random2/commit/a071d8d182eb28516198bb292a0b45f86f4425fe

How does this look to people?  And would there be interest in seeing this land in the existing std.random ... ?
May 07, 2014
On 5/4/2014 3:26 PM, Joseph Rushton Wakeling via Digitalmars-d wrote:
> On 04/05/14 20:56, Nick Sabalausky via Digitalmars-d wrote:
>> So I think it's probably safe to figure this is a uniform distribution
>> unless
>> some expert chimes in and says otherwise.
>>
>> Thanks for the help.
>
> You're very welcome.  Keep me posted on how things go with your
> implementation! :-)
>

It's now implemented and working in DAuth's HEAD:

https://github.com/Abscissa/DAuth/blob/master/src/dauth/hashdrbg.d

Usage is pretty simple:

----------------------------
import std.algorithm : take;
import std.array : array;
import dauth.sha;
import dauth.hashdrbg;

HashDRBG!uint rand;

// The above is equivalent to:
// HashDRBG!(uint, SHA512, "DAuth") rand;

// Now use rand just like any other RNG in std.random:
uint[] values1 = rand.take(7).array();

// The algorithm specifically supports generating
// arbitrary lengths at once, so can also do:
HashDRBGStream!uint randStream;

ubyte[] values2;
values2.length = 42;

randStream.read(values2);
----------------------------

Next steps are to support Hash_DRBG's optional "additional input" feature, and to see about Phobos-ifying it and making a pull request.

I'll take a look at your proposed std.random too, sorry I haven't had a chance to yet.

May 07, 2014
On 5/7/2014 4:06 PM, Nick Sabalausky wrote:
>
> // The algorithm specifically supports generating
> // arbitrary lengths at once, so can also do:
> HashDRBGStream!uint randStream;
>

Oops, that's supposed to be:

HashDRBGStream!() randStream;
//aka:
// HashDRBGStream!(SHA512, "D Crypto RNG") randStream;

The stream version isn't a range and only supports filling a provided ubyte[] buffer. So no element type.

May 07, 2014
On Wednesday, 7 May 2014 at 20:11:13 UTC, Nick Sabalausky wrote:
> HashDRBGStream!() randStream;
> //aka:
> // HashDRBGStream!(SHA512, "D Crypto RNG") randStream;
>
> The stream version isn't a range and only supports filling a provided ubyte[] buffer. So no element type.

Shouldn't that take an ubyte output range instead?

David
May 07, 2014
On Wednesday, 7 May 2014 at 21:01:20 UTC, David Nadlinger wrote:
> Shouldn't that take an ubyte output range instead?

Erm, scrap that, wasn't thinking.

David
May 08, 2014
On 5/7/2014 5:09 PM, David Nadlinger wrote:
> On Wednesday, 7 May 2014 at 21:01:20 UTC, David Nadlinger wrote:
>> Shouldn't that take an ubyte output range instead?
>
> Erm, scrap that, wasn't thinking.
>

Actually, maybe it should, even if only as an option. Unless I'm the one not thinking now...?

May 08, 2014
On 5/4/2014 4:08 PM, Joseph Rushton Wakeling via Digitalmars-d wrote:
>
> I should also be thanking you -- this reminded me that I needed to make
> some key changes to my std.random2 proposal, including this stricter
> isUniformRNG template:
> https://github.com/WebDrake/std.random2/commit/a071d8d182eb28516198bb292a0b45f86f4425fe
>
>
> How does this look to people?  And would there be interest in seeing
> this land in the existing std.random ... ?

Some good looking stuff in there. The separation is a nice improvement, and having a "shuffle" is something we could certainly use.

About avoiding problems from accidentally duplicating internal state, I'm not sure that changing to classes is really necessary for that. I addressed it in HashDRBG by simply making the internal state static. (Oops, or at least I meant to...fixing now...). The various InputRange instantiations (for the different possible elemement types) all draw from the same source (HashDRBGStream), so this should work out pretty well. I'd imagine the existing std.random algos could do something similar.

May 08, 2014
On 08/05/14 19:18, Nick Sabalausky via Digitalmars-d wrote:
> Some good looking stuff in there. The separation is a nice improvement, and
> having a "shuffle" is something we could certainly use.

Thanks! :-)

There is already a shuffle function in the existing std.random, but it's called randomShuffle:
http://dlang.org/phobos/std_random.html#.randomShuffle

> About avoiding problems from accidentally duplicating internal state, I'm not
> sure that changing to classes is really necessary for that. I addressed it in
> HashDRBG by simply making the internal state static. (Oops, or at least I meant
> to...fixing now...). The various InputRange instantiations (for the different
> possible elemement types) all draw from the same source (HashDRBGStream), so
> this should work out pretty well. I'd imagine the existing std.random algos
> could do something similar.

That seems a problematic fix for me -- doesn't it mean that there can only ever be one instance of any individual RNG?  While many applications wouldn't notice the difference, it's a pretty major breach of encapsulation (and purity).

On a related note -- I like your concept of a random stream.  I think it may be useful to adapt as part of the way in which std.random2.device works.  I need to compare to other Phobos stream code -- isn't Steven Schweighoffer working on up-to-date stream functionality as part of std.io ... ?
May 09, 2014
On 5/8/2014 5:29 PM, Joseph Rushton Wakeling via Digitalmars-d wrote:
> On 08/05/14 19:18, Nick Sabalausky via Digitalmars-d wrote:
>> Some good looking stuff in there. The separation is a nice
>> improvement, and
>> having a "shuffle" is something we could certainly use.
>
> Thanks! :-)
>
> There is already a shuffle function in the existing std.random, but it's
> called randomShuffle:
> http://dlang.org/phobos/std_random.html#.randomShuffle
>

How'd I either miss or forget that?! ;)

>> About avoiding problems from accidentally duplicating internal state,
>> I'm not
>> sure that changing to classes is really necessary for that. I
>> addressed it in
>> HashDRBG by simply making the internal state static. (Oops, or at
>> least I meant
>> to...fixing now...). The various InputRange instantiations (for the
>> different
>> possible elemement types) all draw from the same source
>> (HashDRBGStream), so
>> this should work out pretty well. I'd imagine the existing std.random
>> algos
>> could do something similar.
>
> That seems a problematic fix for me -- doesn't it mean that there can
> only ever be one instance of any individual RNG?

There can technically be multiple "instances", but yea, they're all effectively tied together. However, I'm leaning towards the belief that's "correct behavior" for a RNG. It's *definitely* correct for a crypto RNG - you certainly wouldn't want two crypto RNGs ever generating the same sequence, not even deliberately. That would defeat the whole point of a crypto RNG.

As for ordinary non-crypto RNGs, I honestly can't imaging any purpose for reliably generating the same values other than "playing back" a previous sequence. But if that's what you want, then IMO it's better to record the sequence of emitted values, or even record/replay the higher-level decisions which the randomness was used to influence.

For randomness, record/replay is just less fiddly and error-prone than reliably regenerating values from an algorithm that intentionally tries to imitate non-predictability. One slip-up while regenerating the sequence (example: trying to replay from the same seed on a RNG that has since had a bug fixed, or on a different architecture which the RNG wasn't well-tested on) and then the inaccuracies just cascade. Plus this way you can swap in a non-deterministic RNG and everything will still work. Or more easily skip back/ahead.

I'm just not seeing a legitimate use-case for multiple states of the same RNG engine (at least on the same thread) that wouldn't be better served by different approach.

> While many
> applications wouldn't notice the difference, it's a pretty major breach
> of encapsulation (and purity).

AIUI an RNG is impure by definition. (Well, unless you consider the internal state to be part of the input, but I'd say *that* view breaks encapsulation.)

I'm also unconvinced that it breaks encapsulation anyway. If anything, I think it's better encapsulated since separate instances *won't* behave like they're under "quantum entanglement". It forces separate instances to stay separate.

It *is* a fairly strange situation though. Unlike most encapsulation units, the whole point of an RNG is to be *inconsistent* rather than consistent. So that kinda turns everything on its ear.

>
> On a related note -- I like your concept of a random stream.  I think it
> may be useful to adapt as part of the way in which std.random2.device
> works.  I need to compare to other Phobos stream code -- isn't Steven
> Schweighoffer working on up-to-date stream functionality as part of
> std.io ... ?

Yea, when I wrote it I figured on proposing an optional stream-like interface for std.random. The Hash_DRBG algorithm is inherently based around the idea that each generated number can be arbitrary-length (not only that - generating ex. two 8-byte values with Hash_DRGB is fundamentally different from generating eight 2-byte values, and produces different results). And the OS-specific entropy generators are basically streams, too (on both Win and Posix, you request any number of bytes of randomness).

So that was just a natural way to implement it. But it did strike me as being generally useful, so I'm glad that you agree.

You're right though, the current state of Phobos streams is kind of a stumbling block. Naturally it'd be best to use the new stream interface, but since that isn't in place, that's an issue. I don't know what the state of the rewrite is. Been awhile since I've heard anything about it.

I figure, at the very least, we could just make the stream interfaces for RNGs private. Then we can open it up whenever we finally do have a new std.stream to conform to.

Or we could do an OutputRange instead...but the recent discussion on those makes me a bit hesitant to do so. And I definitely don't wanna perform the contortions to invert it into a non-coroutine-like InputRange.

Hmm, yea, I'm more and more liking the "keep it private for now, and then see what happens" approach...unfortunately... :/

May 09, 2014
On Friday, 9 May 2014 at 00:43:10 UTC, Nick Sabalausky wrote:
> On 5/8/2014 5:29 PM, Joseph Rushton Wakeling via Digitalmars-d wrote:
>> That seems a problematic fix for me -- doesn't it mean that there can
>> only ever be one instance of any individual RNG?
>
> There can technically be multiple "instances", but yea, they're all effectively tied together. However, I'm leaning towards the belief that's "correct behavior" for a RNG. It's *definitely* correct for a crypto RNG - you certainly wouldn't want two crypto RNGs ever generating the same sequence, not even deliberately. That would defeat the whole point of a crypto RNG.
>
> As for ordinary non-crypto RNGs, I honestly can't imaging any purpose for reliably generating the same values other than "playing back" a previous sequence. But if that's what you want, then IMO it's better to record the sequence of emitted values, or even record/replay the higher-level decisions which the randomness was used to influence.
>
> For randomness, record/replay is just less fiddly and error-prone than reliably regenerating values from an algorithm that intentionally tries to imitate non-predictability. One slip-up while regenerating the sequence (example: trying to replay from the same seed on a RNG that has since had a bug fixed, or on a different architecture which the RNG wasn't well-tested on) and then the inaccuracies just cascade. Plus this way you can swap in a non-deterministic RNG and everything will still work. Or more easily skip back/ahead.
>
> I'm just not seeing a legitimate use-case for multiple states of the same RNG engine (at least on the same thread) that wouldn't be better served by different approach.

I strongly disagree here. If a user has explicitly requested a PRNG, they should be able to rely on its most basic property, being deterministic.