Thread overview
[phobos] fastDice
Sep 11, 2010
David Simcha
Jan 02, 2011
David Simcha
Jan 02, 2011
David Simcha
Jan 03, 2011
Masahiro Nakagawa
Jan 03, 2011
Masahiro Nakagawa
September 11, 2010
  I've written a few functions that generate random numbers from an
arbitrary discrete distribution in O(log N) time, where N is the number
of possible values, using SortedRange.lowerBound().  It's similar to
dice() except that in exchange for O(N) auxiliary space and upfront
initialization cost you get O(log N) generation.  I've attached the
code, which is fairly simple.  Should this go in std.random, or is
needing this O(log N) performance on dice() niche enough that this
belongs in my dstats lib instead?
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: fastDice.d
URL: <http://lists.puremagic.com/pipermail/phobos/attachments/20100911/110a4d41/attachment.ksh>
January 02, 2011
On 9/11/10 3:29 PM, David Simcha wrote:
> I've written a few functions that generate random numbers from an
> arbitrary discrete distribution in O(log N) time, where N is the number
> of possible values, using SortedRange.lowerBound(). It's similar to
> dice() except that in exchange for O(N) auxiliary space and upfront
> initialization cost you get O(log N) generation. I've attached the code,
> which is fairly simple. Should this go in std.random, or is needing this
> O(log N) performance on dice() niche enough that this belongs in my
> dstats lib instead?

I'm not sure. You may want to ask the question on the newsgroup.

BTW, I think std.random is in dire need of a few classic distribution (Gaussian, Poisson, Zipf come to mind). Anyone inclined?


Andrei
January 02, 2011
A decent Gaussian is trivial:

http://en.wikipedia.org/wiki/Box%E2%80%93Muller_transform

Also see my dstats.random lib.  Most of this stuff requires binary attribution because it was translated from the Numpy code, but the Box-Muller Gaussian is arguably not original enough to be copyrightable, i.e. you can't copyright algorithms, Box-Muller is a very well-known algorithm, and there's a very limited number of ways to express the algorithm in code once it's specified.

Poisson is a PITA IIRC.  (I have code to do it, but again it requires binary attribution.)

Where is Zipf used?  I left it out of my Numpy port that I did awhile back.  IIRC it was because Wikipedia didn't provide enough information on it to make my code reasonably testable.

On 1/2/2011 5:59 PM, Andrei Alexandrescu wrote:
> On 9/11/10 3:29 PM, David Simcha wrote:
>> I've written a few functions that generate random numbers from an
>> arbitrary discrete distribution in O(log N) time, where N is the number
>> of possible values, using SortedRange.lowerBound(). It's similar to
>> dice() except that in exchange for O(N) auxiliary space and upfront
>> initialization cost you get O(log N) generation. I've attached the code,
>> which is fairly simple. Should this go in std.random, or is needing this
>> O(log N) performance on dice() niche enough that this belongs in my
>> dstats lib instead?
>
> I'm not sure. You may want to ask the question on the newsgroup.
>
> BTW, I think std.random is in dire need of a few classic distribution (Gaussian, Poisson, Zipf come to mind). Anyone inclined?
>
>
> Andrei
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
>

January 02, 2011
Also, Boost has great random numbers. For now, could you be the leader on putting Gaussians into Phobos?

Andrei

On 1/2/11 5:23 PM, David Simcha wrote:
> A decent Gaussian is trivial:
>
> http://en.wikipedia.org/wiki/Box%E2%80%93Muller_transform
>
> Also see my dstats.random lib. Most of this stuff requires binary attribution because it was translated from the Numpy code, but the Box-Muller Gaussian is arguably not original enough to be copyrightable, i.e. you can't copyright algorithms, Box-Muller is a very well-known algorithm, and there's a very limited number of ways to express the algorithm in code once it's specified.
>
> Poisson is a PITA IIRC. (I have code to do it, but again it requires
> binary attribution.)
>
> Where is Zipf used? I left it out of my Numpy port that I did awhile back. IIRC it was because Wikipedia didn't provide enough information on it to make my code reasonably testable.
>
> On 1/2/2011 5:59 PM, Andrei Alexandrescu wrote:
>> On 9/11/10 3:29 PM, David Simcha wrote:
>>> I've written a few functions that generate random numbers from an
>>> arbitrary discrete distribution in O(log N) time, where N is the number
>>> of possible values, using SortedRange.lowerBound(). It's similar to
>>> dice() except that in exchange for O(N) auxiliary space and upfront
>>> initialization cost you get O(log N) generation. I've attached the code,
>>> which is fairly simple. Should this go in std.random, or is needing this
>>> O(log N) performance on dice() niche enough that this belongs in my
>>> dstats lib instead?
>>
>> I'm not sure. You may want to ask the question on the newsgroup.
>>
>> BTW, I think std.random is in dire need of a few classic distribution (Gaussian, Poisson, Zipf come to mind). Anyone inclined?
>>
>>
>> Andrei
>> _______________________________________________
>> phobos mailing list
>> phobos at puremagic.com
>> http://lists.puremagic.com/mailman/listinfo/phobos
>>
>
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
January 02, 2011
Good idea.  Didn't think of Boost, and obviously licensing issues won't be problematic there.  Will do, since it will probably just be a trivial translation job.

On 1/2/2011 6:36 PM, Andrei Alexandrescu wrote:
> Also, Boost has great random numbers. For now, could you be the leader on putting Gaussians into Phobos?
>
> Andrei
>
> On 1/2/11 5:23 PM, David Simcha wrote:
>> A decent Gaussian is trivial:
>>
>> http://en.wikipedia.org/wiki/Box%E2%80%93Muller_transform
>>
>> Also see my dstats.random lib. Most of this stuff requires binary attribution because it was translated from the Numpy code, but the Box-Muller Gaussian is arguably not original enough to be copyrightable, i.e. you can't copyright algorithms, Box-Muller is a very well-known algorithm, and there's a very limited number of ways to express the algorithm in code once it's specified.
>>
>> Poisson is a PITA IIRC. (I have code to do it, but again it requires
>> binary attribution.)
>>
>> Where is Zipf used? I left it out of my Numpy port that I did awhile back. IIRC it was because Wikipedia didn't provide enough information on it to make my code reasonably testable.
>>
>> On 1/2/2011 5:59 PM, Andrei Alexandrescu wrote:
>>> On 9/11/10 3:29 PM, David Simcha wrote:
>>>> I've written a few functions that generate random numbers from an
>>>> arbitrary discrete distribution in O(log N) time, where N is the
>>>> number
>>>> of possible values, using SortedRange.lowerBound(). It's similar to
>>>> dice() except that in exchange for O(N) auxiliary space and upfront
>>>> initialization cost you get O(log N) generation. I've attached the
>>>> code,
>>>> which is fairly simple. Should this go in std.random, or is needing
>>>> this
>>>> O(log N) performance on dice() niche enough that this belongs in my
>>>> dstats lib instead?
>>>
>>> I'm not sure. You may want to ask the question on the newsgroup.
>>>
>>> BTW, I think std.random is in dire need of a few classic distribution (Gaussian, Poisson, Zipf come to mind). Anyone inclined?
>>>
>>>
>>> Andrei
>>> _______________________________________________
>>> phobos mailing list
>>> phobos at puremagic.com
>>> http://lists.puremagic.com/mailman/listinfo/phobos
>>>
>>
>> _______________________________________________
>> phobos mailing list
>> phobos at puremagic.com
>> http://lists.puremagic.com/mailman/listinfo/phobos
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
>

January 03, 2011
I want more generators.
MinstdRand is lightweight but poor.
MT is good period but very large.

I often use Xorshift(128bit) generator. https://bitbucket.org/repeatedly/scrap/src/0d5acf2d18f4/xorshift.d

Of course, Boost.Random has other various generators.


Masahiro

On Mon, 03 Jan 2011 07:59:29 +0900, Andrei Alexandrescu <andrei at erdani.com> wrote:

> On 9/11/10 3:29 PM, David Simcha wrote:
>> I've written a few functions that generate random numbers from an
>> arbitrary discrete distribution in O(log N) time, where N is the number
>> of possible values, using SortedRange.lowerBound(). It's similar to
>> dice() except that in exchange for O(N) auxiliary space and upfront
>> initialization cost you get O(log N) generation. I've attached the code,
>> which is fairly simple. Should this go in std.random, or is needing this
>> O(log N) performance on dice() niche enough that this belongs in my
>> dstats lib instead?
>
> I'm not sure. You may want to ask the question on the newsgroup.
>
> BTW, I think std.random is in dire need of a few classic distribution (Gaussian, Poisson, Zipf come to mind). Anyone inclined?
>
>
> Andrei
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
January 02, 2011
That looks addable to Phobos immediately. You may want to just add it when you have the time.

Andrei

On 1/2/11 6:37 PM, Masahiro Nakagawa wrote:
> I want more generators.
> MinstdRand is lightweight but poor.
> MT is good period but very large.
>
> I often use Xorshift(128bit) generator. https://bitbucket.org/repeatedly/scrap/src/0d5acf2d18f4/xorshift.d
>
> Of course, Boost.Random has other various generators.
>
>
> Masahiro
>
> On Mon, 03 Jan 2011 07:59:29 +0900, Andrei Alexandrescu <andrei at erdani.com> wrote:
>
>> On 9/11/10 3:29 PM, David Simcha wrote:
>>> I've written a few functions that generate random numbers from an
>>> arbitrary discrete distribution in O(log N) time, where N is the number
>>> of possible values, using SortedRange.lowerBound(). It's similar to
>>> dice() except that in exchange for O(N) auxiliary space and upfront
>>> initialization cost you get O(log N) generation. I've attached the code,
>>> which is fairly simple. Should this go in std.random, or is needing this
>>> O(log N) performance on dice() niche enough that this belongs in my
>>> dstats lib instead?
>>
>> I'm not sure. You may want to ask the question on the newsgroup.
>>
>> BTW, I think std.random is in dire need of a few classic distribution (Gaussian, Poisson, Zipf come to mind). Anyone inclined?
>>
>>
>> Andrei
>> _______________________________________________
>> phobos mailing list
>> phobos at puremagic.com
>> http://lists.puremagic.com/mailman/listinfo/phobos
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
January 03, 2011
OK. I will add in the weekend.


Masahiro

On Mon, 03 Jan 2011 10:11:30 +0900, Andrei Alexandrescu <andrei at erdani.com> wrote:

> That looks addable to Phobos immediately. You may want to just add it when you have the time.
>
> Andrei
>
> On 1/2/11 6:37 PM, Masahiro Nakagawa wrote:
>> I want more generators.
>> MinstdRand is lightweight but poor.
>> MT is good period but very large.
>>
>> I often use Xorshift(128bit) generator. https://bitbucket.org/repeatedly/scrap/src/0d5acf2d18f4/xorshift.d
>>
>> Of course, Boost.Random has other various generators.
>>
>>
>> Masahiro
>>
>> On Mon, 03 Jan 2011 07:59:29 +0900, Andrei Alexandrescu <andrei at erdani.com> wrote:
>>
>>> On 9/11/10 3:29 PM, David Simcha wrote:
>>>> I've written a few functions that generate random numbers from an
>>>> arbitrary discrete distribution in O(log N) time, where N is the
>>>> number
>>>> of possible values, using SortedRange.lowerBound(). It's similar to
>>>> dice() except that in exchange for O(N) auxiliary space and upfront
>>>> initialization cost you get O(log N) generation. I've attached the
>>>> code,
>>>> which is fairly simple. Should this go in std.random, or is needing
>>>> this
>>>> O(log N) performance on dice() niche enough that this belongs in my
>>>> dstats lib instead?
>>>
>>> I'm not sure. You may want to ask the question on the newsgroup.
>>>
>>> BTW, I think std.random is in dire need of a few classic distribution (Gaussian, Poisson, Zipf come to mind). Anyone inclined?
>>>
>>>
>>> Andrei
>>> _______________________________________________
>>> phobos mailing list
>>> phobos at puremagic.com
>>> http://lists.puremagic.com/mailman/listinfo/phobos
>> _______________________________________________
>> phobos mailing list
>> phobos at puremagic.com
>> http://lists.puremagic.com/mailman/listinfo/phobos
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos