Thread overview
January 02, 2014 | Is continuously seeding a random number generator performance intensive?
D provides a set of random number generators in std.random. I am writing an application which would create a 2D map of noise. To do this though, I'll have to calculate the same random numbers over and over again. (I cannot store them; that'd take a horrible amount of RAM.) Is it good to re-seed a generator for every coordinate, and will this be performance intensive? Is there maybe a way to easily implement Generator.at(uint x) in D?
January 02, 2014 | Re: Is continuously seeding a random number generator performance intensive?
Posted in reply to Jeroen Bollen | On Thursday, 2 January 2014 at 20:38:10 UTC, Jeroen Bollen wrote:
> D provides a set of random number generators in std.random. I am writing an application which would create a 2D map of noise. To do this though, I'll have to calculate the same random numbers over and over again. (I cannot store them; that'd take a horrible amount of RAM.)
AFAIK, the costliest part of seeding a generator is generating the "random seed" itself. Once you have the seed, seeding shouldn't be too expensive, though that depends on the particular generator. In any case, performance-wise it shouldn't cost much (if you use the same seed). That said, it sounds like what you actually want is to "save" the range. PRNGs have the "save" primitive for exactly this; just use that.
> Is it good to re-seed a generator for every coordinate, and will this be performance intensive? Is there maybe a way to easily implement Generator.at(uint x) in D?
*This* comment is confusing me. What do you mean by "re-seed"? You mean a random seed? Once seeded, you shouldn't have to re-seed a PRNG: it'll generate random numbers forever. Or do you mean "re-seed" in the sense of "reset"? Because if you do that, you'll get the same numbers for every coordinate...?
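As a minimal sketch of that "save" primitive (the seed value here is an arbitrary example, not anything from std.random itself):

```d
import std.random : Mt19937;

void main()
{
    auto rng = Mt19937(42);      // any fixed seed gives a deterministic stream
    auto checkpoint = rng.save;  // snapshot of the full generator state

    immutable a = rng.front; rng.popFront();
    immutable b = rng.front; rng.popFront();

    // Restoring the checkpoint replays the exact same numbers,
    // without paying any re-seeding cost.
    rng = checkpoint;
    assert(rng.front == a); rng.popFront();
    assert(rng.front == b);
}
```

Since Phobos generators are structs with value semantics, `save` is essentially a cheap copy of the state.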
January 03, 2014 | Re: Is continuously seeding a random number generator performance intensive?
Posted in reply to Jeroen Bollen | On Thursday, 2 January 2014 at 20:38:10 UTC, Jeroen Bollen wrote:
> D provides a set of Random Number Generators in std.random. I am writing an application which would create a 2D map of noise. To do this though, I'll have to calculate the same random numbers over and over again. (I cannot store them, that'd take a horrible amount of RAM. )
>
> Is it good to re-seed a generator for every coordinate, and will this be performance intensive? Is there maybe a way to easily implement Generator.at(uint x) in D?
I believe you fail to understand how RNGs work.
You supply a seed (a value) and the generator produces a deterministic sequence from that value whose elements are pseudo-random relative to each other.
If you re-seed the generator every time, you are doing nothing but wasting cycles: each new element will still be random, but no more random than simply taking the next element of a single seeded sequence.
e.g.,
seed(k);
for(i = 1..10)
print(rnd(i));
and
for(i = 1..10)
{
seed(time);
print(rnd(i));
}
will both produce random sequences of numbers (and random sequences of numbers are "identically random").
The nice thing about the first case is that you can save the seed one time and reproduce the exact same sequence... which would save you memory. In the second case you would have to record every seed to recover the sequence.
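In runnable D rather than pseudocode, the first case amounts to this (seed value arbitrary): two generators given the same seed stay in lockstep, so one stored seed regenerates the whole stream.

```d
import std.random : Mt19937, uniform;

void main()
{
    // Same seed, same sequence: storing the single seed value is
    // enough to reproduce every number later.
    auto g1 = Mt19937(1234);
    auto g2 = Mt19937(1234);
    foreach (i; 0 .. 10)
        assert(uniform(0, 256, g1) == uniform(0, 256, g2));
}
```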
January 03, 2014 | Re: Is continuously seeding a random number generator performance intensive?
Posted in reply to Frustrated | On Thursday, 2 January 2014 at 22:01:01 UTC, monarch_dodra wrote:
> *This* comment is confusing me. What do you mean by "re-seed"? You mean a random seed? Once seeded, you shouldn't have to re-seed a PRNG: it'll generate random numbers forever. Or do you mean "re-seed" in the sense of "reset"? Because if you do that, you'll get the same numbers for every coordinate...?
And... On Friday, 3 January 2014 at 01:01:21 UTC, Frustrated wrote:
> If you re-seed the generator every time, you are doing nothing but wasting cycles: each new element will still be random, but no more random than simply taking the next element of a single seeded sequence.
>
> ...snip..
>
> The nice thing about the first case is that you can save the seed one time and reproduce the exact same sequence... which would save you memory. In the second case you would have to record every seed to recover the sequence.
I think you both misunderstand what he wants to do. He's generating a 2D noise map, apparently of arbitrarily large size. Let's say his 2D map is 4 billion x 4 billion elements and each element needs a range of 0-255 (so, 1 byte). Obviously, storing such a large noise map in memory is not feasible. But if you can instead take an x and y coordinate and regenerate the information when it becomes necessary, then storing all of those positions is unnecessary. Instead of storing 16,000,000,000,000,000,000 bytes (completely infeasible), you could store the bounds of the map (8 bytes) and generate just the part of the map you need at any one moment (perhaps a ~2000x2000 portion, or 4MB at a time, depending on what he's doing).
So, it sounds like the OP is using the x and y coords as a seed to generate a single number, and he was curious whether it costs too much to reseed like this for every point.
To the OP: FWIW, I'm under the impression that this is a fairly common strategy, but usually when I've seen it used, more than one number is generated at a time. You can still do that in this case: for example, divide x by 10 and generate 10 elements (column-wise) in the noise map each time, which reduces the number of reseeds by a factor of 10. Some effort might be wasted, but as long as you need a decent chunk of the noise map at any one point in time, this should work out pretty well in practice. My biggest concern with this approach is that you must take care that your seeding with x and y coordinates does not cause repeated elements to occur. For instance, using `Random rng = Random(x + y);` will _not_ work properly (unless you want strange mirroring down the diagonal of your noise map). There are numerous landmines to avoid when doing something like this. Some approaches may not be perfect, but depending on what you're doing, they may be sufficient.
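One possible way around the `x + y` symmetry is sketched below. The helper name `coordSeed` is hypothetical, and the mixing constants are arbitrary values borrowed from common integer-hash recipes; the point is only that the coordinates are combined asymmetrically before seeding.

```d
import std.random : Mt19937, uniform;

// Hypothetical helper: mixes (x, y) into one seed so that (x, y) and
// (y, x) do not collide the way plain x + y forces them to.
uint coordSeed(uint x, uint y)
{
    uint h = x * 0x9E3779B1u + y;  // asymmetric: swapping x and y changes h
    h ^= h >> 16;                  // avalanche steps (each is bijective)
    h *= 0x85EBCA6Bu;
    h ^= h >> 13;
    return h;
}

// Regenerate the noise value for one point on demand, as discussed above.
ubyte noiseAt(uint x, uint y)
{
    auto rng = Mt19937(coordSeed(x, y));
    return cast(ubyte) uniform(0, 256, rng);
}

void main()
{
    // Mirrored coordinates now get distinct seeds: no forced diagonal.
    assert(coordSeed(3, 7) != coordSeed(7, 3));
}
```

This is only a sketch; whether the statistical quality is good enough depends on the noise algorithm being layered on top.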
January 03, 2014 | Re: Is continuously seeding a random number generator performance intensive?
Posted in reply to Chris Cain | On Friday, 3 January 2014 at 01:43:09 UTC, Chris Cain wrote:
> So, it sounds like the OP is using the x and y coords as a seed to generate a single number, and he was curious whether it costs too much to reseed like this for every point.
>
> FWIW, I'm under the impression that this is a fairly common strategy, but usually when I've seen it used, more than one number is generated at a time.
I *thought* he wanted to do something like that, but was surprised by the fact that he wanted to reseed *per* element...
> You can still do that in this case: for example, divide x by 10 and generate 10 elements (column-wise) in the noise map each time, which reduces the number of reseeds by a factor of 10. Some effort might be wasted, but as long as you need a decent chunk of the noise map at any one point in time, this should work out pretty well in practice.
>
> My biggest concern with this approach is that you must take care that your seeding with x and y coordinates does not cause repeated elements to occur. For instance, using `Random rng = Random(x + y);` will _not_ work properly (unless you want strange mirroring down the diagonal of your noise map). There are numerous landmines to avoid when doing something like this. Some approaches may not be perfect, but depending on what you're doing, they may be sufficient.
The approach I'd take here is to eagerly generate a "uint[X / 1000][Y / 1000]" grid that holds randomly generated numbers, to be used as seeds for the individual 1000x1000 blocks of the noise map. I don't know how good that is in terms of correlation, though...
Or, even better, create a single generator of type "Gen" and store a "Gen[X / 1000][Y / 1000]": i.e. the generator, with a "checkpoint" every 1_000_000 elements. This should reduce correlation down to zero.
AFAIK, all generators above the linear congruential generator have a period above 10^12, so this approach should be fine.
The only issue with such an approach might be the size of the PRNGs themselves: Xorshift, for example, is only a few bytes big, so that's fine. But a Mersenne Twister carries roughly 2.5KB of state (624 32-bit words), which is a hefty amount to chug along.
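A rough sketch of the checkpoint-grid idea, using Xorshift for its small state. The block size and the seed-mixing constants below are illustrative guesses, not anything prescribed in the thread:

```d
import std.random : Xorshift;

enum blockSize = 1000;  // one generator per 1000x1000 block

// Derive a small, cheap generator for the block at (bx, by).
// The constants are arbitrary; the +1 merely avoids a zero seed,
// which xorshift-style generators can be picky about.
Xorshift blockRng(uint bx, uint by)
{
    return Xorshift(bx * 2654435761u + by * 40503u + 1);
}

void main()
{
    // A block can be dropped and later regenerated bit-for-bit,
    // so only the seeds (or saved states) need to be kept around.
    auto a = blockRng(2, 3);
    auto b = blockRng(2, 3);
    foreach (i; 0 .. 100)
    {
        assert(a.front == b.front);
        a.popFront();
        b.popFront();
    }
}
```

Storing seeds (4 bytes each) instead of whole generators sidesteps the state-size concern, at the cost of re-running the seeding logic per block.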
January 03, 2014 | Re: Is continuously seeding a random number generator performance intensive?
Posted in reply to monarch_dodra | On Friday, 3 January 2014 at 10:06:27 UTC, monarch_dodra wrote:
> On Friday, 3 January 2014 at 01:43:09 UTC, Chris Cain wrote:
>> So, it sounds like the OP is using the x and y coords for a seed to generate a single number and he was curious to whether it costs too much to reseed like this for every point.
>>
>> FWIW, I'm under the impression that this is a fairly common strategy, but usually when I've seen this used more than one number is generated at a time.
>
> I *thought* he wanted to do something like that, but was surprised by the fact he wanted to reseed *per* element...
>
>> You can still do this, in this case. For example, divide x by 10 and generate 10 elements (column wise) in the noise map each time and it reduces the number of reseeds by a factor of 10. Some effort might be wasted, but as long as you need a decent chunk of the noise map at any one point in time, this should work out pretty well in practice.
>>
>> My biggest concern with this approach is that you must take care that your usage of seeding with x and y coordinates does not cause repeated elements to occur. For instance, using `Random rng = Random(x + y);` will _not_ work properly (unless you want strange mirroring down the diagonal in your noise map). There's numerous landmines to avoid when doing something like this. Some approaches may not be perfect, but depending on what you're doing they may be sufficient, however.
>
> The approach I'd take here is to eagerly generate a "uint[X / 1000][Y / 1000]" grid, that holds randomly generated numbers, to be used as seeds for individual 1000*1000 blocks of the noise map. I don't know how good that is though in terms of correlation...?
>
> Or, even better, to create a single generator of type "Gen", and store a "Gen[X / 1000][Y / 1000]": EG: the generator, with a "checkpoint" every 1_000_000 elements. This should reduce correlation down to 0.
>
> AFAIK, all generators above the linear congruential generator have a period above 10^12, so this approach should be fine. The only issue with such an approach might be the size of the PRNGs themselves: Xorshift, for example, is only a few bytes big, so that's fine. But a Mersenne Twister carries roughly 2.5KB of state, which is a hefty amount to chug along.
I already considered this, but the problem is, I need to smooth the noise, and to do that I need all the surrounding 'checkpoints' too. This means it'll have to load in 5 at a time.
January 03, 2014 | Re: Is continuously seeding a random number generator performance intensive?
Posted in reply to Frustrated | On Friday, 3 January 2014 at 01:01:21 UTC, Frustrated wrote:
> On Thursday, 2 January 2014 at 20:38:10 UTC, Jeroen Bollen wrote:
> [...]
> e.g.,
>
> seed(k);
> for(i = 1..10)
> print(rnd(i));
>
> and
>
> for(i = 1..10)
> {
> seed(time);
> print(rnd(i));
> }
>
> will both produce random sequences of numbers(and random sequences of numbers are "identically random".
> [...]
The second example is error-prone: if the "time" variable doesn't change between cycles, it's not random at all.
January 03, 2014 | Re: Is continuously seeding a random number generator performance intensive?
Posted in reply to Jeroen Bollen | On Friday, 3 January 2014 at 13:30:09 UTC, Jeroen Bollen wrote:
> I already considered this, but the problem is, I need to smoothen the noise, and to do that I need all surrounding 'checkpoints' too. This means that it'll have to load in 5 at a time.
I don't see that as a problem. Just because you can load sub-regions individually doesn't mean you are limited to loading only one at a time.
January 03, 2014 | Re: Is continuously seeding a random number generator performance intensive?
Posted in reply to monarch_dodra | On Friday, 3 January 2014 at 13:42:19 UTC, monarch_dodra wrote:
> On Friday, 3 January 2014 at 13:30:09 UTC, Jeroen Bollen wrote:
>> I already considered this, but the problem is, I need to smoothen the noise, and to do that I need all surrounding 'checkpoints' too. This means that it'll have to load in 5 at a time.
>
> I don't see that as a problem. Just because you can load sub-regions at once, doesn't mean you are limited to only loading one at a time.
That still means that out of 5 million pixels loaded, only 1,004,000 will be used. I guess I can recycle them though.
January 03, 2014 | Re: Is continuously seeding a random number generator performance intensive?
Posted in reply to Jeroen Bollen | On Friday, 3 January 2014 at 17:41:48 UTC, Jeroen Bollen wrote:
> On Friday, 3 January 2014 at 13:42:19 UTC, monarch_dodra wrote:
>> On Friday, 3 January 2014 at 13:30:09 UTC, Jeroen Bollen wrote:
>>> I already considered this, but the problem is, I need to smoothen the noise, and to do that I need all surrounding 'checkpoints' too. This means that it'll have to load in 5 at a time.
>>
>> I don't see that as a problem. Just because you can load sub-regions at once, doesn't mean you are limited to only loading one at a time.
>
> That still means that out of 5 million pixels loaded, only 1 million 4 thousand will be used. I guess I can recycle them though.
You could also do a "subdivision" approach:
First, a table that contains seeds for 1Kx1K blocks... however, each of those seeds is itself used to generate 10,000 new seeds, one per 10x10 sub-block (for example).
This way, you can first load your big 1Kx1K block, and then 400 10x10 blocks. Seems reasonable to me.
I don't know exactly how big your data is, so your mileage may vary. Depending on your algorithm, you may have to adapt the numbers, or even the number of subdivisions.
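A sketch of that two-level scheme in D; the helper name, constants, and the linear skip-ahead are all illustrative, not a prescribed design:

```d
import std.random : Mt19937;

// Coarse level: each 1Kx1K block has one seed. Fine level: the coarse
// stream hands out one seed per 10x10 sub-block, indexed 0 .. 9_999.
uint subBlockSeed(uint coarseSeed, uint subIndex)
{
    auto coarse = Mt19937(coarseSeed);
    foreach (i; 0 .. subIndex)   // linear skip to the sub-block's slot;
        coarse.popFront();       // fine for 10_000 slots, not for millions
    return coarse.front;
}

void main()
{
    // The same (block seed, sub-index) pair always yields the same
    // sub-seed, so any 10x10 patch can be regenerated independently.
    assert(subBlockSeed(99, 5) == subBlockSeed(99, 5));
}
```

If the linear skip becomes a bottleneck, the sub-seeds for a whole block can be generated once, in order, and cached while that block is in view.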
Copyright © 1999-2021 by the D Language Foundation