Thread overview
[Issue 7067] std.random.RandomSample and RandomCover are poorly designed
Feb 28, 2015
Martin Nowak
Mar 02, 2015
Martin Nowak
Mar 13, 2015
Ivan Kazmenko
Aug 14, 2018
Nathan S.
Dec 17, 2022
Iain Buclaw
February 28, 2015
https://issues.dlang.org/show_bug.cgi?id=7067

Martin Nowak <code@dawg.eu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |code@dawg.eu

--- Comment #19 from Martin Nowak <code@dawg.eu> ---
We had a talk about this during the 2nd D meetup in Berlin.
One remaining question was how to deal with memory management when using a
reference type RNG.

I came up with a different solution and less invasive solution.

We simply disable implicit copying of RNGs (@disable this(this)) and ask people
to use explicitly use .save when copying is indeed intended.

We can even deprecate postblit like so.

    deprecated("Please use .save to explicitly copy RNGs.")
    this(this) {}

Internally we'd need to change all copying to use std.algorithm.move.

If reference semantic is wanted people can use either
    new Random(unpredictableSeed)
or
    refCounted(rnd)
    refCounted(Random(unpredictableSeed))
or (unsafe)
    refRange(&rnd)
.

This should also work for combined random generators like.

new Exponential(Random(unpredictableSeed))

refCounted(Exponential(Random(unpredictableSeed)))

auto rng = Exponential(Random(unpredictableSeed));
refRange(&rng);

--
February 28, 2015
https://issues.dlang.org/show_bug.cgi?id=7067

--- Comment #20 from Joseph Rushton Wakeling <joseph.wakeling@webdrake.net> ---
(In reply to Martin Nowak from comment #19)
> We had a talk about this during the 2nd D meetup in Berlin.
> One remaining question was how to deal with memory management when using a
> reference type RNG.
> 
> I came up with a different solution and less invasive solution.
> 
> We simply disable implicit copying of RNGs (@disable this(this)) and ask
> people to use explicitly use .save when copying is indeed intended.
> 
> We can even deprecate postblit like so.
> 
>     deprecated("Please use .save to explicitly copy RNGs.")
>     this(this) {}
> 
> Internally we'd need to change all copying to use std.algorithm.move.

Interesting thought.  We'd have to take care that phobos-side things don't get immediately "fixed" by people just using .save inside RandomCover and RandomSample constructors.


> If reference semantic is wanted people can use either
>     new Random(unpredictableSeed)
> or
>     refCounted(rnd)
>     refCounted(Random(unpredictableSeed))
> or (unsafe)
>     refRange(&rnd)
> .
> 
> This should also work for combined random generators like.
> 
> new Exponential(Random(unpredictableSeed))
> 
> refCounted(Exponential(Random(unpredictableSeed)))
> 
> auto rng = Exponential(Random(unpredictableSeed));
> refRange(&rng);

I guess what I don't like about this solution is that it requires the user to take responsibility for generating the RNG (or wrapper-of-RNG) as a reference type, when that ought to be the default behaviour.  But OTOH we could take advantage of the fact that RNG structs are templated, and rework their convenience aliases, e.g.:

    alias Mt19937 = RefCounted!(MersenneTwisterEngine!(uint, 32LU, 624LU,
397LU, 31LU, 2567483615u, 11LU, 7LU, 2636928640u, 15LU, 4022730752u, 18LU))

... which would still leave the underlying struct available for use by someone who really wants to use it directly in that form.

--
March 02, 2015
https://issues.dlang.org/show_bug.cgi?id=7067

--- Comment #21 from Martin Nowak <code@dawg.eu> ---
(In reply to Joseph Rushton Wakeling from comment #20)
> I guess what I don't like about this solution is that it requires the user to take responsibility for generating the RNG (or wrapper-of-RNG) as a reference type, when that ought to be the default behaviour.  But OTOH we could take advantage of the fact that RNG structs are templated, and rework their convenience aliases, e.g.:
> 
>     alias Mt19937 = RefCounted!(MersenneTwisterEngine!(uint, 32LU, 624LU,
> 397LU, 31LU, 2567483615u, 11LU, 7LU, 2636928640u, 15LU, 4022730752u, 18LU))
> 
> ... which would still leave the underlying struct available for use by someone who really wants to use it directly in that form.

I don't think that sharing rng state among multiple consumers is a good idea.
Of course it looks like you want random numbers and shouldn't care about the
order, but often fixed seeds are used to get a reproducible pseudo-random
range.
When you share such a RNG as in
    auto rng = Random(fixedSeed);
    assert(!equal(randomCover(a, rng), randomCover(a, rng)));
the sequences depends on the implementation of equal and randomCover.
Would be cleaner to use 2 RNGs with independent state here, so the fact that a
shared state requires to be explicit is a good thing. Also we'd set a bad
precedent by making one of std's ranges a ref type. Ref ranges have a lot of
subtleties and aren't that well tested with std.algorithm.

--
March 03, 2015
https://issues.dlang.org/show_bug.cgi?id=7067

--- Comment #22 from jens.k.mueller@gmx.de ---
We should try out Martin's idea.
I'll do it but I'd like to write some tests first. Joseph you mentioned several
"suprizes" with the current design. I'd like to create some tests for those.
Can you give some code for the flaws in the design?

I'm also looking at http://en.cppreference.com/w/cpp/numeric/random
But adding more generators (including abstractions for non-deterministic) and
distributions can be done after this issue has been addressed.

--
March 12, 2015
https://issues.dlang.org/show_bug.cgi?id=7067

--- Comment #23 from Joseph Rushton Wakeling <joseph.wakeling@webdrake.net> ---
@Martin @Jens: sorry for radio silence on this.  It's a busy period, and I recently moved to a new apartment where I still don't have home internet.

> I don't think that sharing rng state among multiple consumers is a good idea. Of course it looks like you want random numbers and shouldn't care about the order, but often fixed seeds are used to get a reproducible pseudo-random range.
>
> When you share such a RNG as in
>    auto rng = Random(fixedSeed);
>    assert(!equal(randomCover(a, rng), randomCover(a, rng)));
> the sequences depends on the implementation of equal and randomCover.

Yes, a case like this obviously creates complications.  But I don't think subtleties like this should prevent a user from instantiating one RNG instance and using it with multiple consumers.  The "default" case (where no RNG instance is passed) already employs a common RNG instance between different consumers; I'd rather have consistency of behaviour whether an RNG is explicitly provided or not.

Of course, we can advise associating different RNG instances with different consumers (or making sure that a consumer does its work before associating an RNG with a new consumer), but that's different from compelling the user to follow this pattern.

Note also that one-RNG-per-consumer really doesn't scale to e.g. a simulation where you are generating thousands or millions of random samples.  The risk of the different RNG seeds generating correlated sequences seems greater than the risks associated with using one RNG underlying all sample instances.

> Also we'd set a bad precedent by making one of std's ranges a ref type. Ref ranges have a lot of subtleties and aren't that well tested with std.algorithm.

I agree there are a lot of subtleties, but my feeling is that we need to embrace and explore those in order to identify what the best way forward is for RNGs.

--
March 13, 2015
https://issues.dlang.org/show_bug.cgi?id=7067

Ivan Kazmenko <gassa@mail.ru> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |gassa@mail.ru

--
October 15, 2016
https://issues.dlang.org/show_bug.cgi?id=7067

Andrei Alexandrescu <andrei@erdani.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |bootcamp

--
August 14, 2018
https://issues.dlang.org/show_bug.cgi?id=7067

Nathan S. <n8sh.secondary@hotmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://issues.dlang.org/sh
                   |                            |ow_bug.cgi?id=19168

--
December 17, 2022
https://issues.dlang.org/show_bug.cgi?id=7067

Iain Buclaw <ibuclaw@gdcproject.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P2                          |P3

--