Jump to page: 1 2
Thread overview
[Issue 7067] New: std.random.RandomSample and RandomCover are poorly designed
Dec 05, 2011
Vladimir Panteleev
Dec 06, 2011
Vladimir Panteleev
Dec 08, 2011
Bernard Helyer
Jun 15, 2012
Jonathan M Davis
Jun 19, 2012
Dmitry Olshansky
December 05, 2011
http://d.puremagic.com/issues/show_bug.cgi?id=7067

           Summary: std.random.RandomSample and RandomCover are poorly
                    designed
           Product: D
           Version: D2
          Platform: Other
        OS/Version: Windows
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Phobos
        AssignedTo: nobody@puremagic.com
        ReportedBy: thecybershadow@gmail.com
                CC: andrei@metalanguage.com


--- Comment #0 from Vladimir Panteleev <thecybershadow@gmail.com> 2011-12-05 04:00:22 PST ---
The following tests will always fail:

    int[] a = [ 0, 1, 2, 3, 4, 5, 6, 7, 8 ];
    assert(!equal(randomCover(a, rndGen()), randomCover(a, rndGen())));
    assert(!equal(randomSample(a, 5, rndGen()), randomSample(a, 5, rndGen())));

The reason why these tests will fail is that both functions take the RNG by value. Not only is this unintuitive, this is also a security liability - someone depending on these functions to generate random identifiers can be surprised that two successive calls generate the same ID.

I strongly suggest that RNG types are disallowed from being implicitly copied. Even though pseudo-random number generators shouldn't be used for security purposes, they're still likely to be used in such contexts - especially considering lack of better sources of random data in Phobos.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
December 05, 2011
http://d.puremagic.com/issues/show_bug.cgi?id=7067


bearophile_hugs@eml.cc changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bearophile_hugs@eml.cc


--- Comment #1 from bearophile_hugs@eml.cc 2011-12-05 04:09:15 PST ---
See also issue 6593

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
December 06, 2011
http://d.puremagic.com/issues/show_bug.cgi?id=7067



--- Comment #2 from Andrei Alexandrescu <andrei@metalanguage.com> 2011-12-05 21:28:32 PST ---
(In reply to comment #0)
> The following tests will always fail:
> 
>     int[] a = [ 0, 1, 2, 3, 4, 5, 6, 7, 8 ];
>     assert(!equal(randomCover(a, rndGen()), randomCover(a, rndGen())));
>     assert(!equal(randomSample(a, 5, rndGen()), randomSample(a, 5, rndGen())));

Good point.

> The reason why these tests will fail is that both functions take the RNG by value.
>
> Not only is this unintuitive, this is also a security liability -
> someone depending on these functions to generate random identifiers can be
> surprised that two successive calls generate the same ID.

I think we can safely eliminate this argument from the discussion.

> I strongly suggest that RNG types are disallowed from being implicitly copied. Even though pseudo-random number generators shouldn't be used for security purposes, they're still likely to be used in such contexts - especially considering lack of better sources of random data in Phobos.

The problem with taking a random generator by reference is that it then needs to be escaped. So people would be quite surprised to see that:

auto gen = rndGen;
return randomSample(a, 5, gen);

has undefined behavior.

One way or another we need to solve this, e.g. by creating a wrapper with reference semantics over generators. Ideas are welcome.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
December 06, 2011
http://d.puremagic.com/issues/show_bug.cgi?id=7067



--- Comment #3 from bearophile_hugs@eml.cc 2011-12-05 23:53:48 PST ---
(In reply to comment #2)

> The problem with taking a random generator by reference is that it then needs to be escaped. So people would be quite surprised to see that:
> 
> auto gen = rndGen;
> return randomSample(a, 5, gen);
> 
> has undefined behavior.
> 
> One way or another we need to solve this, e.g. by creating a wrapper with reference semantics over generators. Ideas are welcome.

Turn random generators into final classes?

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
December 06, 2011
http://d.puremagic.com/issues/show_bug.cgi?id=7067



--- Comment #4 from Andrei Alexandrescu <andrei@metalanguage.com> 2011-12-06 07:53:28 PST ---
> Turn random generators into final classes?

We have backward compatibility to worry about.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
December 06, 2011
http://d.puremagic.com/issues/show_bug.cgi?id=7067



--- Comment #5 from Vladimir Panteleev <thecybershadow@gmail.com> 2011-12-06 08:04:49 PST ---
The disadvantages of breaking backwards compatibility need to be considered on a case-by-case basis. I think that turning RNGs into reference types has the potential to be a relatively low-impact change, while also having a good chance of revealing broken code.

The typical usage of std.random does not involve using the RNG types directly, and rather using the implicit thread-local RNG.

An example of affected code:

Random r;
// ... use r

I think that generally allowing such code is a mistake, because it's not clear that the RNG is not seeded.

auto r = Random(42);

We can implement this for backwards compatibility using static opCall (which shall be scheduled for deprecation).

The biggest problem is intentional usage of value semantics (it would transparently turn into reference semantics). Perhaps there's something we could do with the help of opAssign?

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
December 06, 2011
http://d.puremagic.com/issues/show_bug.cgi?id=7067



--- Comment #6 from bearophile_hugs@eml.cc 2011-12-06 09:49:20 PST ---
(In reply to comment #5)

> The biggest problem is intentional usage of value semantics (it would transparently turn into reference semantics).

I suggest to ignore such cases, they are probably uncommon, and add a warning note to the changelog.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
December 07, 2011
http://d.puremagic.com/issues/show_bug.cgi?id=7067


Alex Rønne Petersen <xtzgzorex@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |xtzgzorex@gmail.com


--- Comment #7 from Alex Rønne Petersen <xtzgzorex@gmail.com> 2011-12-07 04:21:37 PST ---
(In reply to comment #4)
> > Turn random generators into final classes?
> 
> We have backward compatibility to worry about.

Sometimes breaking changes can be a good thing. It seems to me like with the current design, it's more likely that people are Doing It Wrong than the opposite. Keeping backwards compatibility also means allowing people to continue doing the same mistakes.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
December 08, 2011
http://d.puremagic.com/issues/show_bug.cgi?id=7067


Bernard Helyer <blood.of.life@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |blood.of.life@gmail.com


--- Comment #8 from Bernard Helyer <blood.of.life@gmail.com> 2011-12-08 00:02:29 PST ---
To add an anecdote to this, when I first tried passing Random instances around, I was surprised to find they weren't reference types; it wasn't obvious to me at all.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
June 15, 2012
http://d.puremagic.com/issues/show_bug.cgi?id=7067


Jonathan M Davis <jmdavisProg@gmx.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jmdavisProg@gmx.com


--- Comment #9 from Jonathan M Davis <jmdavisProg@gmx.com> 2012-06-15 01:02:22 PDT ---
We can make the random number generator ranges reference types without breaking code simply by creating inner structs for them which hold their member variables and sit on the heap. The only code which would break because of this would be code which relied on them being value types, which (as discussed) is almost certainly a bug.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
« First   ‹ Prev
1 2