March 20, 2014
On Thursday, 20 March 2014 at 01:32:41 UTC, Chris Williams wrote:
> Any chance that you could describe them? I was about to resume porting the dcrypt library into Phobos, and had intended to flip the classes into structs, to match what the rest of the library was doing.

I think there's a good case for a std.random2.crypto module that contains RNGs that are specifically suitable for cryptography.  That said I think the bar here has to be set VERY high, which is why I didn't even begin working on it yet.  It has been argued by some that where crypto in Phobos is concerned, we shouldn't take community contributions but we should hire security experts to write the functionality for us.

Anyway, let's keep in touch about this and discuss how we could support one another's efforts.

About the issues with value-type RNGs (as monarch_dodra says, it's not structs vs. classes per se, as you can implement reference types via structs; it's just more finnicky to do so), probably the best starting point is to read through the various bugs that have been reported as a result of this:
https://d.puremagic.com/issues/show_bug.cgi?id=7067
https://d.puremagic.com/issues/show_bug.cgi?id=7936
https://d.puremagic.com/issues/show_bug.cgi?id=8247
https://d.puremagic.com/issues/show_bug.cgi?id=10322

Although some of these are marked as fixed, the fixes are pretty unpleasant and are workarounds rather than solutions of the underlying problem.  It may look like only a few issues, but the implications are nasty.  We had extensive discussions about this over the last year:
http://forum.dlang.org/thread/mailman.259.1357667544.22503.digitalmars-d@puremagic.com
http://forum.dlang.org/thread/mailman.1017.1370879340.13711.digitalmars-d@puremagic.com
http://forum.dlang.org/thread/mailman.1157.1371497540.13711.digitalmars-d@puremagic.com
http://forum.dlang.org/thread/mailman.1209.1371565034.13711.digitalmars-d@puremagic.com
http://forum.dlang.org/thread/mailman.443.1377369357.1719.digitalmars-d@puremagic.com
http://forum.dlang.org/thread/5218FD04.8040404@webdrake.net

The bottom line is that implementing your RNGs as classes automatically gets you out of the worst of these traps by giving you reference semantics from the get-go.  Whether there are other problems that arise from this that make you prefer another design is a question you'll have to answer for yourself -- someone may yet come up with an objection that shows my current design is a Very Bad Idea ;-)

Anyway, the example with rndGen.take(10).writeln that monarch_dodra gave is probably the best argument one can make.  Imagine a cryptographic application where you're generating (supposedly) two different sets of random data, and because of an unintended value-type copy like this they turn out to be identical.  Insecure much? :-)
March 20, 2014
On Thursday, 20 March 2014 at 08:30:09 UTC, ponce wrote:
> Related: please consider using parts of SimpleRNG the excellent work of John D. Cook which provides many random distributions in a compact and documented way.
>
> https://github.com/p0nce/gfm/blob/master/math/gfm/math/simplerng.d (here a port)

Good call, I'll take a close look at that.  Can you provide me with a link to the original project too?  (Yes, I can just Google it, I'm being lazy:-)
March 20, 2014
On Thursday, 20 March 2014 at 21:16:27 UTC, Joseph Rushton
Wakeling wrote:
> I think there's a good case for a std.random2.crypto module that contains RNGs that are specifically suitable for cryptography.  That said I think the bar here has to be set VERY high, which is why I didn't even begin working on it yet.  It has been argued by some that where crypto in Phobos is concerned, we shouldn't take community contributions but we should hire security experts to write the functionality for us.

To be certain that the implementation doesn't have any security
holes?
March 20, 2014
On Thursday, 20 March 2014 at 08:51:08 UTC, monarch_dodra wrote:
> Agreed.

There is consensus it seems.  I will make the fix ;-)

> I think there is 0 doubt that reference semantics is the way to go. An advantage of using class is that it is still *possible* to place them on the stack with Scoped, or with some future language mechanic. On the other hand, if we implement as reference structs, then that's that.

I suppose the one concern I have is whether these reference-type RNGs might generate unpleasant unintended effects with other range objects in Phobos.  One thing that I really must do now that the basic design is in place is to systematically go through all the different ways in which these ranges could interact with deterministic ranges, and whether there are any issues to address.

> Furthermore, even in terms of performance, I think a heap allocated PRNG will still flat-out beat the value based one, if only because of the size of the damn thing.

I don't know if you or anyone else has run the simple benchmark programs I created, but my impression was that for the RNGs and other functions here there is no significant speed difference between the std.random2 class implementations and their std.random struct predecessors.  Where there _is_ a difference it seems more likely to be down to algorithm rather than class/struct or heap/stack.

For example, my new Mersenne Twister is slightly slower, but probably because it's carrying extra parameters compared to that of std.random.  On the other hand, generating random numbers by foreach'ing over uniform() calls does not seem to have any speed difference with popFrontN()'ing over a Uniform Distribution.

> That said, being able to allocate them on the malloc heap, and not the GC heap, would be (IMO) also a valid design.
>
> A simple and dumb design might be to still implement them with value semantic but:
> 1. Disable postblit.
> 2. Make .save() return a "Random*"
> This would mean
> 1. No dangers of accidental copy.
> 2. Range* is a ForwardRange.
> 3. Trivially allows GC/malloc/stack allocation.
> With good aliases ("alias Random = RadomImpl*;"), and a "make!" template we could make the "default useage" transparent to this mechanism yet make it easy to get our hands under the hood.

One strict objection here: .save returning a Random* would mean that this kind of unittest will fail, no?

    auto rng1 = someRandomGenType;
    auto rng2 = rng1.save;
    rng1.popFrontN(10);
    rng2.popFrontN(10);
    assert(rng1.front == rng2.front);

More generally, I think that, while I don't object to doing complicated stuff behind the scenes to get things simple and easy for the user, the problem I have with the above is that it really seems to require so much effort to create something which comes naturally with the current std.random2 design.

> I didn't check the code yet, but a "middle ground" could be to make all constructors private, and disable T.init. Then, we force construction through a make! template.
>
> This might not be what's most convenient, but it would allow us to potentially change the design at a later stage, without breaking user code.

The idea of making constructors private and forcing the user to use the convenience functions is a very interesting one.  As long as they provide an adequate interface to completely control all implementation parameters, it could provide a way to have significant leeway in controlling exactly how RNG instances are initialized.

On the other hand it really feels obnoxious to cut users off from being able to use objects directly :-(

>> Do you have a simple but very fast function that generates uniforms in [0.0, 1.0]? :-)
>
> AFAIK, the allocation issue is only for ranges? "uniform" is just a function, I don't think it affected by the issue. Even if you are operating on a "passed range", either ranges are reference semantics, and you take by value, or they are value semantic, and you take by ref. Either way, you have to pay for the indirection.

I think the issue here is just that it's possible to implement a really fast high-quality algorithm for uniformly-distributed floating point numbers in [0, 1).  That has all sorts of uses not just for Phobos users but also internally in e.g. random distributions (for example, it'll give a significant speed boost to NormalDistribution).
March 20, 2014
On Thursday, 20 March 2014 at 21:42:13 UTC, Chris Williams wrote:
> To be certain that the implementation doesn't have any security
> holes?

Yes.  Of course, in the current climate one might fear that
they'd be the ones introducing them ... :-)
March 20, 2014
On Thursday, 20 March 2014 at 18:43:49 UTC, Chris Williams wrote:
> That's only completely true if structs are referred to by pointer. ref parameters/returns aren't quite sufficient to keep a struct acting as a reference for all purposes.

As far as I can tell, you're thinking of _passing_ struct parameters, and here, indeed, passing by ref is sufficient.

The problem comes when you want to _store_ them.  It's not safe to just store a pointer, because the (value type) struct that's being pointed to might go out of scope and be deleted.

However, you can make structs behave like reference types behave like reference types, simply by making them contain (safe) references to the actual data they contain.  E.g. (stupidly simple example):

    struct Foo
    {
      private:
        int *_a;

      public:
        this(int val)
        {
            _a = new int;
            *_a = val;
        }

        ref int a() @property
        {
            return *_a;
        }
    }

    unittest
    {
        auto foo1 = Foo(23);
        auto foo2 = foo1;

        foo2.a = 4;

        writeln(foo1.a);
    }

Most of the discussion over RNGs in the last year is about whether we need to take a (more sophisticated) variant of this kind of approach to implement reference-type semantics for RNGs, or whether we should do something different.  std.random2 is ... something different ;-)
March 21, 2014
On Thursday, 20 March 2014 at 21:17:33 UTC, Joseph Rushton Wakeling wrote:
> On Thursday, 20 March 2014 at 08:30:09 UTC, ponce wrote:
>> Related: please consider using parts of SimpleRNG the excellent work of John D. Cook which provides many random distributions in a compact and documented way.
>>
>> https://github.com/p0nce/gfm/blob/master/math/gfm/math/simplerng.d (here a port)
>
> Good call, I'll take a close look at that.  Can you provide me with a link to the original project too?  (Yes, I can just Google it, I'm being lazy:-)

http://www.johndcook.com/SimpleRNG.h
http://www.johndcook.com/SimpleRNG.cpp

You will find that there is no license information.
But the author intended this as public domain, he will confirm if send an e-mail.
March 21, 2014
On Friday, 21 March 2014 at 16:01:28 UTC, ponce wrote:
> On Thursday, 20 March 2014 at 21:17:33 UTC, Joseph Rushton Wakeling wrote:
>> On Thursday, 20 March 2014 at 08:30:09 UTC, ponce wrote:
>>> Related: please consider using parts of SimpleRNG the excellent work of John D. Cook which provides many random distributions in a compact and documented way.
>>>
>>> https://github.com/p0nce/gfm/blob/master/math/gfm/math/simplerng.d (here a port)
>>
>> Good call, I'll take a close look at that.  Can you provide me with a link to the original project too?  (Yes, I can just Google it, I'm being lazy:-)
>
> http://www.johndcook.com/SimpleRNG.h
> http://www.johndcook.com/SimpleRNG.cpp
>
> You will find that there is no license information.
> But the author intended this as public domain, he will confirm if send an e-mail.

Hey he uses MWC algorithm! (but not the improved CMWC)

March 21, 2014
On Thursday, 20 March 2014 at 00:39:43 UTC, bearophile wrote:
> It's the best chance to improve naming, so do not throw it away for nothing:
> https://d.puremagic.com/issues/show_bug.cgi?id=9106

I think the following patch should fix that for you:
https://github.com/WebDrake/std.random2/commit/fb5429de77b3c1f7fe3968fd0bd92209c9021f31

I've also made shuffle composable as per your request.  Looks good? :-)
March 21, 2014
Joseph Rushton Wakeling:

> I think the following patch should fix that for you:
> https://github.com/WebDrake/std.random2/commit/fb5429de77b3c1f7fe3968fd0bd92209c9021f31
>
> I've also made shuffle composable as per your request.  Looks good? :-)

Seems good. Onward! :-)

Bye,
bearophile