June 18, 2013
On 06/18/2013 03:38 PM, H. S. Teoh wrote:
> Well, .save *is* used in a few places; the few that I know of include cartesianProduct (at least one of the ranges need to be iterated over repeatedly in order to compute the product), and find(), I believe, where the needle needs to be at least a forward range so that multiple candidate subranges can be checked repeatedly. And joiner forwards .save to its wrapper range, though it doesn't actually call it as part of its operation. There may be a few other places where .save is used.

Try:

    auto gen = new MtClass19937(1001);
    auto t1 = gen.take(2);
    auto t2 = gen.take(2);
    writeln(cartesianProduct(t1, t2));

    auto t3 = gen.take(2);
    auto t4 = gen.take(2);
    writeln(cartesianProduct(t3, t4));

The results are interesting.  t1 and t2 are evaluated to be equal to one another, but it looks like the original source range is consumed, as t3 and t4 are equal to one another but different from t1 and t2.

Now try:

    auto gen = new MtClass19937(1001);
    auto t1 = gen.take(2);
    auto t2 = gen.take(2);
    writeln(cartesianProduct(t1, iota(2)));
    writeln(cartesianProduct(t1, iota(2)));
    writeln(cartesianProduct(t1, iota(2)));
    writeln(cartesianProduct(t1, iota(2)));

... identical results all round.  The first argument is not consumed, cartesianProduct must iterate only over a saved copy.

NOW try:

    auto gen = new MtClass19937(1001);
    auto t1 = gen.take(2);
    auto t2 = gen.take(2);
    writeln(cartesianProduct(iota(2), t1));
    writeln(cartesianProduct(iota(2), t1));
    writeln(cartesianProduct(iota(2), t2));
    writeln(cartesianProduct(iota(2), t2));

... and the results keep changing.  The first argument to cartesianProduct is not consumed, the second is.

I feel very uncomfortable about the idea of cases like this where the consumption of the source of randomness is so unpredictable.
June 18, 2013
On 06/18/2013 04:26 PM, Joseph Rushton Wakeling wrote:
> ... and the results keep changing.  The first argument to cartesianProduct is not consumed, the second is.

I guess it's down to this part of cartesianProduct:

    static if (isInfinite!R1 && isInfinite!R2)
    {
        ...
    }
    else static if (isInputRange!R2 && isForwardRange!R1 && !isInfinite!R1)
    {
        return joiner(map!((ElementType!R2 a) => zip(range1.save, repeat(a)))
                          (range2));
    }
    else static if (isInputRange!R1 && isForwardRange!R2 && !isInfinite!R2)
    {
        return joiner(map!((ElementType!R1 a) => zip(repeat(a), range2.save))
                          (range1));
    }

In the case where both R1 and R2 are finite forward ranges, cartesianProduct will save range1 and not range2.

I accept that this may not be an area where random input is expected, but it still feels "ouch" and is the kind of unexpected correlation problem I had anticipated.

If we remove the .save property from MtClass19937 then the problem vanishes, but it is a shame to have to lose .save for pseudo-random number generators.
June 18, 2013
On Tue, Jun 18, 2013 at 04:50:07PM +0200, monarch_dodra wrote:
> On Tuesday, 18 June 2013 at 14:18:19 UTC, H. S. Teoh wrote:
> >[...]
> >
> >Just make RNGs passed by reference, and the above problem will not happen.
> >
> >
> >T
> 
> Well... it's more than *just* pass-by-reference, its actual store-by-reference (eg "reference semantic").

Good point.


> In the case of "take", take store a copy of the passed range, so a pass-by-reference wouldn't really change much.

If the RNG were a reference type like a class, then the wrapper range Take would simply have another reference to the class instance, so the problem is avoided. But if it were only passed by reference but is a value type (e.g. modify take() to have a ref argument), then the problem will still persist, since the wrapper ctor will copy the RNG instance.


> I'm not sure if that's what you meant, but just in case. We don't want any ambiguity on the words we are using.

Thanks!


T

-- 
Государство делает вид, что платит нам зарплату, а мы делаем вид, что работаем.
June 18, 2013
On 6/18/13 11:11 AM, monarch_dodra wrote:
> On Tuesday, 18 June 2013 at 14:39:38 UTC, Andrei Alexandrescu wrote:
>> On 6/18/13 10:16 AM, H. S. Teoh wrote:
>>> I say again that RNGs being passed by value is a major BUG. The above
>>> situation is a prime example of this problem. We *need* to make RNGs
>>> passed by reference. For situations where you *want* to duplicate a
>>> pseudo random sequence, an explicit method should be provided to clone
>>> the RNG. Duplication of RNGs should never be implicit.
>>
>> Occasionally copying a RNG's state is useful, but I agree most of the
>> time you want to just take a reference to it.
>>
>> I think a good way toward a solution is
>> http://d.puremagic.com/issues/show_bug.cgi?id=10404.
>>
>>
>> Andrei
>
> That sounds nice, but is it possible? The only way to truly forward
> everything is via an alias this.

Nonono, that must use introspection and mixins, like WhiteHole and BlackHole do.

Andrei


June 18, 2013
On Tuesday, 18 June 2013 at 16:11:06 UTC, Andrei Alexandrescu wrote:
> On 6/18/13 11:11 AM, monarch_dodra wrote:
>> On Tuesday, 18 June 2013 at 14:39:38 UTC, Andrei Alexandrescu wrote:
>>> On 6/18/13 10:16 AM, H. S. Teoh wrote:
>>>> I say again that RNGs being passed by value is a major BUG. The above
>>>> situation is a prime example of this problem. We *need* to make RNGs
>>>> passed by reference. For situations where you *want* to duplicate a
>>>> pseudo random sequence, an explicit method should be provided to clone
>>>> the RNG. Duplication of RNGs should never be implicit.
>>>
>>> Occasionally copying a RNG's state is useful, but I agree most of the
>>> time you want to just take a reference to it.
>>>
>>> I think a good way toward a solution is
>>> http://d.puremagic.com/issues/show_bug.cgi?id=10404.
>>>
>>>
>>> Andrei
>>
>> That sounds nice, but is it possible? The only way to truly forward
>> everything is via an alias this.
>
> Nonono, that must use introspection and mixins, like WhiteHole and BlackHole do.
>
> Andrei

Hum... just looked at the code. I was hoping there was a way to do this without pulling out the heavy artillery...
June 18, 2013
On Tuesday, June 18, 2013 10:39:38 Andrei Alexandrescu wrote:
> On 6/18/13 10:16 AM, H. S. Teoh wrote:
> > I say again that RNGs being passed by value is a major BUG. The above situation is a prime example of this problem. We *need* to make RNGs passed by reference. For situations where you *want* to duplicate a pseudo random sequence, an explicit method should be provided to clone the RNG. Duplication of RNGs should never be implicit.
> 
> Occasionally copying a RNG's state is useful, but I agree most of the time you want to just take a reference to it.
> 
> I think a good way toward a solution is http://d.puremagic.com/issues/show_bug.cgi?id=10404.

That may be a good way to go about it, but the default such as what rndGen returns really should be a reference type, and only ranges whose state can actually be copied such that continuing to iterate over them gives the same values should be forward ranges IMHO. Using Class!T might be able to make it so that we can avoid needing a std.random2, but we'll probably still need some minor code breakage in there somewhere.

- Jonathan M Davis
June 18, 2013
On 6/18/13 2:11 PM, Jonathan M Davis wrote:
> On Tuesday, June 18, 2013 10:39:38 Andrei Alexandrescu wrote:
>> On 6/18/13 10:16 AM, H. S. Teoh wrote:
>>> I say again that RNGs being passed by value is a major BUG. The above
>>> situation is a prime example of this problem. We *need* to make RNGs
>>> passed by reference. For situations where you *want* to duplicate a
>>> pseudo random sequence, an explicit method should be provided to clone
>>> the RNG. Duplication of RNGs should never be implicit.
>>
>> Occasionally copying a RNG's state is useful, but I agree most of the
>> time you want to just take a reference to it.
>>
>> I think a good way toward a solution is
>> http://d.puremagic.com/issues/show_bug.cgi?id=10404.
>
> That may be a good way to go about it, but the default such as what rndGen
> returns really should be a reference type, and only ranges whose state can
> actually be copied such that continuing to iterate over them gives the same
> values should be forward ranges IMHO. Using Class!T might be able to make it
> so that we can avoid needing a std.random2, but we'll probably still need some
> minor code breakage in there somewhere.
>
> - Jonathan M Davis

How about defining rndGenRef and putting rndGen on the deprecation path?

Andrei
June 18, 2013
On Tuesday, 18 June 2013 at 19:21:23 UTC, Andrei Alexandrescu wrote:
> On 6/18/13 2:11 PM, Jonathan M Davis wrote:
>> On Tuesday, June 18, 2013 10:39:38 Andrei Alexandrescu wrote:
>>> On 6/18/13 10:16 AM, H. S. Teoh wrote:
>>>> I say again that RNGs being passed by value is a major BUG. The above
>>>> situation is a prime example of this problem. We *need* to make RNGs
>>>> passed by reference. For situations where you *want* to duplicate a
>>>> pseudo random sequence, an explicit method should be provided to clone
>>>> the RNG. Duplication of RNGs should never be implicit.
>>>
>>> Occasionally copying a RNG's state is useful, but I agree most of the
>>> time you want to just take a reference to it.
>>>
>>> I think a good way toward a solution is
>>> http://d.puremagic.com/issues/show_bug.cgi?id=10404.
>>
>> That may be a good way to go about it, but the default such as what rndGen
>> returns really should be a reference type, and only ranges whose state can
>> actually be copied such that continuing to iterate over them gives the same
>> values should be forward ranges IMHO. Using Class!T might be able to make it
>> so that we can avoid needing a std.random2, but we'll probably still need some
>> minor code breakage in there somewhere.
>>
>> - Jonathan M Davis
>
> How about defining rndGenRef and putting rndGen on the deprecation path?
>
> Andrei

Even with 10404, I still think we'd need a new module. An adaptor helps, but these ranges really need to be ref by default (amongst other changes). We'd probably have to end up with MersenneRef, as well as pretty much everything else...

Also: Class!PRNG.save would not actually save, it would merely return a copy of its PRNG payload, and not a new class instance. This could be problematic (for starters, it wouldn't qualify for forward range...)
June 18, 2013
On Tuesday, June 18, 2013 15:21:23 Andrei Alexandrescu wrote:
> How about defining rndGenRef and putting rndGen on the deprecation path?

We could, though that's a bit ugly as names go IMHO. We could almost get away with simply changing what rndGen returns, but unfortunately, it returns Random rather than auto, so if anyone typed it explicitly, we'd break their code - unless we changed Random to be aliased to Class!Mt19937 instead of Mt19937, but then anything using Random explictly without rndGen would break. My _guess_ would be that we could get away with just changing rndGen to return something else and break almost no one, but unfortunately we can't know that for sure. We probably should have had rndGen return auto rather than Random, but we have more experience with ranges now than when std.random was written.

Regardless, I think that we should be sure of what we want to do with the module as a whole before we start making tweaks like that, since if we decide that we really do want std.random2, there's not much point in tweaking std.random, and std.random2.rndGen could return what we want. If we do stick with std.random though, I think that we should come up with a better name than rndGenRef (maybe rndRange or genRnd or simply random).

- Jonathan M Davis
June 18, 2013
On 06/18/2013 08:35 PM, Jonathan M Davis wrote:
> Regardless, I think that we should be sure of what we want to do with the module as a whole before we start making tweaks like that, since if we decide that we really do want std.random2, there's not much point in tweaking std.random, and std.random2.rndGen could return what we want. If we do stick with std.random though, I think that we should come up with a better name than rndGenRef (maybe rndRange or genRnd or simply random).

If you're having a std.random2 then I don't see a reason to change the name -- keep it as rndGen.  In _most_ cases of downstream code it won't break, so it smooths the upgrade path.  I presume using std.random and std.random2 together would be considered an error, so is there really any risk of a clash?

Also, assuming we do go with std.random2 rather than an upgrade to std.random, it may be better to go with structs-carrying-a-pointer-to-a-payload rather than with classes; that in turn might allow for less breakage in users' code.

For example, if the Random alias is switched from a struct to a class, then users would have to go through their code tweaking stuff like this:

     - auto gen = Random(unpredictableSeed);
     + auto gen = new Random(unpredictableSeed);

Whereas if RNGs are still structs on the outside (with reference to payload inside) then the original line should still work.