Mir Random [WIP] (page 4)

On 11/23/2016 01:16 AM, Ilya Yaroshenko wrote: >> The std.range is good example of degradation, it has a lot of API and >> implementation bugs. >> > > EDIT: std.range -> std.random Oh, that narrows it down. Thx. -- Andrei

On 11/23/2016 05:27 AM, Joseph Rushton Wakeling wrote: > > Alternative solution: functionality that expects the full unsigned > integer word (UIntType) to be filled with random bits, should validate > that the min/max values of the generator correspond to UIntType.min and > UIntType.max. > > That introduces less breaking change, creates less divergence with the > C++11 standard, and preserves more flexibility for the future. Good idea - static assert is your friend. -- Andrei

November 23, 2016

Re: Mir Random [WIP]

Posted by Andrei Alexandrescu
in reply to Joseph Rushton Wakeling

Permalink

Andrei Alexandrescu

Posted in reply to Joseph Rushton Wakeling

Permalink

On 11/23/2016 05:47 AM, Joseph Rushton Wakeling wrote:
> On Wednesday, 23 November 2016 at 01:34:23 UTC, Andrei Alexandrescu wrote:
>> I'm unclear on what that statistically unsafe default behavior is - my
>> understanding is it has to do with RNGs being inadvertently copied. It
>> would be great to formalize that in a well-explained issue.
>
> I'll see if I can write that up in depth some time soon.  TBH though I
> think the problem is less about RNGs and more about stuff like
> RandomSample and RandomCover (and, in future, random distributions that
> have internal state, like a struct implementing a normal distribution
> using the Ziggurat algorithm internally).
>
> It's not so difficult to stop RNG state being copied inadvertently, but
> when you have ranges wrapping ranges wrapping ranges, each containing
> their own extra state that cannot be copied by value, things get a bit
> more complicated.

Well we need to get to the bottom of this if we're to make progress. Otherwise it's copypasta with little changes followed by disappointment all the way down. -- Andrei

On Wednesday, 23 November 2016 at 13:35:33 UTC, Ilya Yaroshenko wrote: > 1. Current default RNG (Mt19937) has not state for the latest value. That's a detail of your implementation, though, and it's not true for lots of other pseudo-RNGs. > 2. The structure is allocated on stack and compilers can recognise loop patterns and eliminate addition memory movements for many cases. Fair enough. I'm still not sure, though, why you would object to providing public methods for pseudo-RNGs to (i) update the internal state and (ii) access the most recently generated variate, which the `opCall` would then wrap.

On Wednesday, 23 November 2016 at 13:44:00 UTC, Andrei Alexandrescu wrote: > Well we need to get to the bottom of this if we're to make progress. Otherwise it's copypasta with little changes followed by disappointment all the way down. -- Andrei Would you be up for a direct discussion of this some time -- maybe next week once I've had time to re-review the fine details of my concerns? I can provide a summary of issues as a starting point for that discussion, if it would be useful. I suggest this because (i) I'm interested in your insight and (ii) even if we end up disagreeing on concerns and solutions, it would be good to make sure that we're on the same page in terms of understanding each other's point of view.

On 11/23/2016 06:14 AM, Ilya Yaroshenko wrote: > I think that Range API is bad and useless overengineering for RNGs. So it either "takes 1 minute" to change the interface from opCall to ranges (i.e. they're virtually the same thing), or it's the above. There's precious little overengineering that can be done in 1 minute :o). I see you did that in a dozen lines in RandomRangeAdaptor. I understand you believe the range interface is unnecessary or overkill for random number generators. I may even agree for certain cases. However, there are a few arguments you may want to consider: * By virtue of working with D, everybody know what a range is. Presenting the RNGs that way instantly evokes a host of knowledge, understanding, and idioms. * D users (or at least a fraction) and several algorithms understand the notion of an infinite range and make salient decisions based on that. A range interface automatically taps into that. Andrei

On 11/23/2016 08:26 AM, Joseph Rushton Wakeling wrote: > On Wednesday, 23 November 2016 at 13:03:04 UTC, Ilya Yaroshenko wrote: >> Added RandomRangeAdaptor for URBGs: >> https://github.com/libmir/mir-random/blob/master/source/random/algorithm.d >> > > This has exactly the problem I identified above, though: you're > unnecessarily cacheing the latest variate rather than just using the RNG > state directly. Not the biggest deal in the world, but avoidable if you > allow a separation between updating RNG state and accessing it. Well I do see another small problem with https://github.com/libmir/mir-random/blob/master/source/random/algorithm.d#L19. It creates the first value eagerly upon construction. That's just a bit awkward. It's not the first time I'm seeing this issue, and to be honest am a bit bummed about it. It's a consequence of all ranges needing to buffer at least one element. -- Andrei

On Wednesday, 23 November 2016 at 13:56:27 UTC, Andrei Alexandrescu wrote: > Well I do see another small problem with https://github.com/libmir/mir-random/blob/master/source/random/algorithm.d#L19. It creates the first value eagerly upon construction. That's just a bit awkward. It's not the first time I'm seeing this issue, and to be honest am a bit bummed about it. It's a consequence of all ranges needing to buffer at least one element. -- Andrei Yea, this touches on one of my more general concerns related to ranges, randomness and laziness. Part of the trouble is we don't have a really good general design for how to implement _truly_ lazy ranges, where the value of `front` is not determined until the point where you first access it. This has particular relevance to e.g. RandomSample and RandomCover.

On Wednesday, 23 November 2016 at 10:57:04 UTC, Joseph Rushton Wakeling wrote: > struct MyRNG > { > void popFront() { /* update internal state */ } > > UIntType front() @property { return /* whatever part of internal state */; } > > UIntType opCall() > { > this.popFront(); > return this.front; > } > } xorshift128+ returns a temporary value computed during state update, so it can't efficiently separate font from popFront. Also this API is prone to reuse some values and discard others, call popFront after front.

On Wednesday, 23 November 2016 at 01:34:23 UTC, Andrei Alexandrescu wrote: > We should add a reference RNG that transforms any other RNG into an input range that shares the actual RNG. Didn't we have a generic Ref!T wrapper in std.typecons that would work this way? Can't find it now.

Forums