[OT] Algorithm question (page 2)

May 01, 2017

Re: [OT] Algorithm question

Posted by Ola Fosheim Grøstad
in reply to Ola Fosheim Grøstad

Permalink

Ola Fosheim Grøstad

Posted in reply to Ola Fosheim Grøstad

Permalink

On Monday, 1 May 2017 at 08:44:25 UTC, Ola Fosheim Grøstad wrote:
> Does it exist? Yes, because you can build permutation tables for it, but you'll have to find the closed form for it that is efficient... Can you do that without selecting a specific N? It is easy for N=2 and N=3 at least.
>
> E.g.
>
> For array length 2 you get the 2 permutations: 0 1, 1 0
> Then you just select one of them at random (you know the number of permutations)
>
> So if it exists you'll should probably do a search for permutations. "How do I efficiently generate permutation number N"
>
> Of course for smaller N you could generate a big array of bytes and compress it by having arrays of slices into it (as many permutations will share index sequences).

Keep also in mind that you can split the original array in two, so it might be sufficient to  have the permutations for various 2^M sizes if those are easier to generate (my guess, maybe through some xor magic?)

E.g. if N = 21 then it can be split into 16 + (4 + 1)

So you can draw like this:

1. randomly select group_16 over group_5 based with 16/21 probability)

2. if group_5 selected: ramdomly select group_4 over group 1 on 4/5 probability

etc.

That way you only need log(N) permutation sequences to keep track of. Which for all practical purposes are going to stay pretty low (<20?)

Right?

On Monday, 1 May 2017 at 10:51:46 UTC, Ola Fosheim Grøstad wrote: > Keep also in mind that you can split the original array in two, so it might be sufficient to have the permutations for various 2^M sizes if those are easier to generate (my guess, maybe through some xor magic?) > > E.g. if N = 21 then it can be split into 16 + (4 + 1) > > So you can draw like this: > > 1. randomly select group_16 over group_5 based with 16/21 probability) > > 2. if group_5 selected: ramdomly select group_4 over group 1 on 4/5 probability > > etc. > > That way you only need log(N) permutation sequences to keep track of. Which for all practical purposes are going to stay pretty low (<20?) > > Right? But if you don't use a hardware random generator and are happy with more sloppy pseudo-randomness then you would be better of ditching the whole concept of permutations and replace it with some cylic random generators intead using the same technique I just outlined. E.g. find a set of cyclic random generators and break down N into a sum of these cycle-sizes, then just keep track of the seed for each. If they are 2^N then you could use xor to get more randomness between runs. Also in the algorithms above you need to change the probabilities each time you take away one index from a group (you don't want to draw from an empty group).

On Monday, 1 May 2017 at 11:08:56 UTC, Ola Fosheim Grøstad wrote: > E.g. find a set of cyclic random generators and break down N into a sum of these cycle-sizes, then just keep track of the seed for each. If they are 2^N then you could use xor to get more randomness between runs. > > Also in the algorithms above you need to change the probabilities each time you take away one index from a group (you don't want to draw from an empty group). Well, actually, just select the single cyclic generator that is larger or equal to N, then just redraw if it returns a value >= N. Duh! Sloppy thinking. I would think you should be able to find some prime sized ones that will get the next index with an insignificant number of redraws. But permutations is the way to go if you want scientific quality.

On Monday, 1 May 2017 at 04:15:35 UTC, H. S. Teoh wrote: > I'm been thinking about the following problem, and thought I'd pick the brains of the bright people around these parts... > > [...] Since most real world problems would require selecting elements more than once it may be far more efficient to sort by P(x)(filter) then simply select a random element. e.g., a = A.filter(x->P(x)) // Creates a new set a where only the elements of A that satisfy P(x) are added ... e = a.random; this is O(1) if you only have to filter once(you can create a container that always "sorts" on P(x) so to speak.. like a sorted dictionary).

On Monday, 1 May 2017 at 11:15:28 UTC, Ola Fosheim Grøstad wrote: > But permutations is the way to go if you want scientific quality. I take that back: «Polynomial pseudo-random number generator via cyclic phase» http://www.sciencedirect.com/science/article/pii/S0378475409001463 Might be what you are looking for and if not it probably references other papers that fits the bill.

Ah, found a sensible search term: «pseudorandom permutation» http://cse.iitkgp.ac.in/~debdeep/courses_iitkgp/FCrypto/slides/LubyRackoff.pdf If you go down this path, please share links. Useful stuff actually.

On Monday, 1 May 2017 at 04:15:35 UTC, H. S. Teoh wrote: > 1) "Random shooting": > > auto r = ... /* elements of A */ > for (;;) { > auto i = uniform(0, r.length); > if (P(r[i])) return r[i]; > } > > Advantages: If a large percentage of elements in A satisfy P(x), then > the loop will terminate within a small number of iterations; if the > majority of elements satisfy P(x), this will terminate well below n > iterations. > > Disadvantages: If only a small percentage of elements satisfy P(x), > then this loop could take arbitrarily long to terminate, and it will > not terminate at all if no elements satisfy P(x). > If you find an element that doesn't satisfy P(x) move it on top of array (swap!) and then try again with uniform(1, r.length - 1) and so on. It terminates in this way. Andrea

On Monday, 1 May 2017 at 11:57:32 UTC, Ola Fosheim Grøstad wrote: > Ah, found a sensible search term: «pseudorandom permutation» > > http://cse.iitkgp.ac.in/~debdeep/courses_iitkgp/FCrypto/slides/LubyRackoff.pdf > > If you go down this path, please share links. Useful stuff actually. A readable paper for most people: http://www.pcg-random.org/pdf/toms-oneill-pcg-family-v1.02.pdf «some users would prefer a much stronger property analogous to uniformity: that over the full period of the generator, every possible k-tuple will occur, and it will occur the same number of times.» PCG claims to have arbitrary cycle sizes. I haven't looked at it, but it is available as C/C++.

On Monday, 1 May 2017 at 12:31:48 UTC, Andrea Fontana wrote: > If you find an element that doesn't satisfy P(x) move it on top of array (swap!) and then try again with uniform(1, r.length - 1) and so on. Whoops i mean uniform(1, r.length) of course :)

On Monday, 1 May 2017 at 12:38:32 UTC, Andrea Fontana wrote: > On Monday, 1 May 2017 at 12:31:48 UTC, Andrea Fontana wrote: >> If you find an element that doesn't satisfy P(x) move it on top of array (swap!) and then try again with uniform(1, r.length - 1) and so on. > > > Whoops i mean uniform(1, r.length) of course :) I guess you meant to the end, but if mutating the array is ok, then you don't have to swap. Just move in the last element and chop 1 off the length. Another cache-friendly mutation solution (assuming PRNG is cheap) is to have two phases: Phase 1: linear walkover the array and select elements with P(a[i])==true with probability 1/N the overwrite the ones with P(a[i]) = false. when done truncate the array. Phase 2: regular random selection from the array were all elements satisfy P. The OP underspecified the requirements... There is a gazillion variations... :-P

Forums