Jump to page: 1 2 3
Thread overview
Dconf 2015 talks...
Jan 25, 2016
Era Scarecrow
Jan 25, 2016
Dicebot
Jan 25, 2016
Era Scarecrow
Jan 25, 2016
Jonathan M Davis
Jan 25, 2016
Era Scarecrow
Jan 25, 2016
Jonathan M Davis
Jan 25, 2016
Era Scarecrow
Jan 25, 2016
Era Scarecrow
Feb 08, 2016
Era Scarecrow
Feb 08, 2016
John Colvin
January 25, 2016
 Finally getting around to watching all the talks, all good stuff :) Figured I'd make a thread talking about things I came across and perhaps get into a discussion on them in detail.

 Anyways I'm watching the talk with Joseph Wakeling involving RNG and I wonder, is that still a problem or not?

 I ask since the simplest and most correct thing I can think of is to use the heap for a referenced state, and dup would then duplicate the state properly while otherwise the RNG problems go away that most of the talk went on about.

 It's possible I've been out of the loop and this has already been solved. Not sure yet, I haven't looked closely at any of the sources or the language updates.

 So... time for some rusty D prototyping.

[code]
struct RNG {
  int* state;

  this(int seed) {
    state = new int;
    *state = seed;
  }

  //if you provide a pointer, we don't rely on the heap/gc at all...
  //maybe a small stack based allocator could be useful for this?
  this(int* seed) { state = seed; }

  //dup instead of save, so it's not a forward range
  RNG dup() { return RNG(*state); }

  //popfront, front & empty as expected
}
[/code]

January 25, 2016
Main problem is with both making it @safe and avoid unforced allocations (i.e. allowing shared state to be kept on stack of `main`).
January 25, 2016
On Monday, 25 January 2016 at 13:05:46 UTC, Era Scarecrow wrote:
>  Finally getting around to watching all the talks, all good stuff :) Figured I'd make a thread talking about things I came across and perhaps get into a discussion on them in detail.
>
>  Anyways I'm watching the talk with Joseph Wakeling involving RNG and I wonder, is that still a problem or not?

It's still an issue, at least partly because amid all other things in life, I haven't had much opportunity to sit down and focus on it :-(

>  I ask since the simplest and most correct thing I can think of is to use the heap for a referenced state, and dup would then duplicate the state properly while otherwise the RNG problems go away that most of the talk went on about.
>
>  It's possible I've been out of the loop and this has already been solved. Not sure yet, I haven't looked closely at any of the sources or the language updates.
>
>  So... time for some rusty D prototyping.
>
> [code]
> struct RNG {
>   int* state;
>
>   this(int seed) {
>     state = new int;
>     *state = seed;
>   }
>
>   //if you provide a pointer, we don't rely on the heap/gc at all...
>   //maybe a small stack based allocator could be useful for this?
>   this(int* seed) { state = seed; }
>
>   //dup instead of save, so it's not a forward range
>   RNG dup() { return RNG(*state); }
>
>   //popfront, front & empty as expected
> }
> [/code]

The major problems are not so much with RNGs per se (note that your approach can actually be much more nicely implemented as a final class).  It's all the functionality that _wraps_ RNGs, such as random algorithms like randomCover and randomSample, or a D equivalent of C++'s random distributions.  These need to also be reference types (for their internal state as well as the RNG they wrap), and here it gets very tricky to avoid nasty situations with lots of allocations and frees (when ideally, you'd like stuff like this to be stack only).

Don't even ask about how complicated a `.dup` method gets when you start trying to extend to such entities. :-(
January 25, 2016
On Monday, 25 January 2016 at 14:31:12 UTC, Joseph Rushton Wakeling wrote:
> It's all the functionality that _wraps_ RNGs, such as random algorithms like randomCover and randomSample, or a D equivalent of C++'s random distributions.  These need to also be reference types (for their internal state as well as the RNG they wrap), and here it gets very tricky to avoid nasty situations with lots of allocations and frees (when ideally, you'd like stuff like this to be stack only).

 Hmm i wonder... If recognizes it as infinite, could it avoid treating them as forward ranges? As a struct it still wouldn't work, but as a class/reference type it would work then...
January 25, 2016
On Monday, 25 January 2016 at 15:38:45 UTC, Era Scarecrow wrote:
>  Hmm i wonder... If recognizes it as infinite, could it avoid treating them as forward ranges? As a struct it still wouldn't work, but as a class/reference type it would work then...

They shouldn't be forward ranges anyway, because if their content is randomly generated then it's not legit for them to implement the .save property.  The whole implementation of stuff in std.random via forward ranges is a massive design mistake.

Implementing the random algorithms/other wrappers as a class is problematic because then you get into the hassle of potentially having to new/free a lot of individual heap objects deep in the inner loop of your program.  I already tried this in hap.random, and came to the conclusion that it wasn't a valid approach.
January 25, 2016
BTW, I apologize for the rather terse replies so far; busy day :-)  I'll try and write out a more complete summary of the problem at some point in the near future (though it might have to wait 'til the weekend).

Thanks for the interest in contributing to solving this problem :-)
January 25, 2016
On Monday, 25 January 2016 at 17:19:05 UTC, Joseph Rushton Wakeling wrote:
> On Monday, 25 January 2016 at 15:38:45 UTC, Era Scarecrow wrote:
>>  Hmm i wonder... If recognizes it as infinite, could it avoid treating them as forward ranges? As a struct it still wouldn't work, but as a class/reference type it would work then...
>
> They shouldn't be forward ranges anyway, because if their content is randomly generated then it's not legit for them to implement the .save property.  The whole implementation of stuff in std.random via forward ranges is a massive design mistake.

As long as the numbers are pseudo-random, then in theory, there's no problem with a range of random numbers being a forward range. That could be useful upon occasion (similar to how it can be useful that the same seed results in the same sequence of numbers). The problem is that if they're a forward range, then you tend to get code that accidentally ends up reusing numbers in the sequence and that definitely is a problem. So ultimately, they probably should be input ranges rather than forward ranges, but I think that it's more of an issue with human error than with what makes sense from a technical perspective.

Regardless, non-pseudo random number generators obviously can't be forward ranges though, since their state isn't savable or repeatable.

- Jonathan M Davis

January 25, 2016
On Monday, 25 January 2016 at 17:19:05 UTC, Joseph Rushton Wakeling wrote:
> Implementing the random algorithms/other wrappers as a class is problematic because then you get into the hassle of potentially having to new/free a lot of individual heap objects deep in the inner loop of your program.  I already tried this in hap.random, and came to the conclusion that it wasn't a valid approach.

 What about an alternative allocator? Specifically I'm thinking in C's equivalent which is alloca (allocates directly on the stack with the rest of the variables). If the constructor is a function then that won't work; but if it's inlined then it should work.

 I suppose the alternate is an option to skip/throw away some numbers that should've been consumed (assuming you want to keep using the same seed), or seeding each per use.
January 25, 2016
On Monday, 25 January 2016 at 18:40:59 UTC, Jonathan M Davis wrote:
> As long as the numbers are pseudo-random, then in theory, there's no problem with a range of random numbers being a forward range.

I thought that too, for a long time, but I came to the conclusion it's not the case.

Here's the essence of the problem: a forward range is a promise to the code that uses it, "I am a deterministic sequence."  But the whole point of pseudo-random sequences is that you're not supposed to be able to tell them from actual randomness.

If you _tell_ downstream code "I am deterministic", that code may make assumptions about how it can use you, that are valid for "normal" deterministic sequences, but aren't valid for what are supposedly random sequences.

Consider the typical handling of forward ranges:

    auto foo (R) (R range)
        if (isForwardRange!R)
    {
        auto rcopy = range.save;

        // do something with rcopy
    }

That's fine if you're dealing with something whose behaviour is meant to be deterministic, but if you call this with a pseudo-random sequence, than every time you call it, you will get the same result.  It won't matter if what you pass is a reference-type -- the .save (which is correct behaviour for handling a forward range) means you're just repeatedly using a copy of the random sequence.

> That could be useful upon occasion (similar to how it can be useful that the same seed results in the same sequence of numbers).

Right, but the point is that re-use of a pseudo-random sequence should happen at programmer request only, not under the hood in library code they call -- which is what unfortunately happens if you implement RNGs and other random sequences as forward ranges :-(


> The problem is that if they're a forward range, then you tend to get code that accidentally ends up reusing numbers in the sequence and that definitely is a problem. So ultimately, they probably should be input ranges rather than forward ranges, but I think that it's more of an issue with human error than with what makes sense from a technical perspective.

Again, I thought that too, but I came to the conclusion that I'd been wrong.  It's not about fallible humans, it's about the promise a forward range makes.

Superficially, it looks like that promise is: "You can use my deterministic nature if you need to."  But actually, I think it is really something stronger: "You can use my deterministic nature freely and nothing will go wrong in doing so."

That's a much stronger promise than can be allowed for a pseudo-RNG, and I think it's borne out in the way in which e.g. phobos functionality makes use of forward ranges.

> Regardless, non-pseudo random number generators obviously can't be forward ranges though, since their state isn't savable or repeatable.

Right -- non-PRNGs must be input ranges by design.  I came to the conclusion that pseudo-RNGs need to be input ranges, but that implement an alternative to .save: a .dup method whose promise is, "You can duplicate the state and hence behaviour of this range, but you can't make any assumptions that this is a safe or sane thing to do."
January 25, 2016
On Monday, 25 January 2016 at 20:26:12 UTC, Era Scarecrow wrote:
>  What about an alternative allocator? Specifically I'm thinking in C's equivalent which is alloca (allocates directly on the stack with the rest of the variables). If the constructor is a function then that won't work; but if it's inlined then it should work.

I have been wondering about how allocators could help to deal with these problems.  Could you put forward a minimal example of how you would see it working?

>  I suppose the alternate is an option to skip/throw away some numbers that should've been consumed (assuming you want to keep using the same seed), or seeding each per use.

I'm not sure I follow what you mean here or why you think this would work?  Could you give a concrete example?
« First   ‹ Prev
1 2 3