August 22, 2016 [GSoC] Mir.random.flex - Generic non-uniform random sampling | ||||
---|---|---|---|---|
| ||||
Hey all,

I am proud to publish a report of my GSoC work as two extensive blog posts, which explain non-uniform random sampling and the mir.random.flex package (part of Mir > 0.16-beta2):

http://blog.mir.dlang.io/random/2016/08/19/intro-to-random-sampling.html
http://blog.mir.dlang.io/random/2016/08/22/transformed-density-rejection-sampling.html

Before I start my personal retrospective, I want to use this opportunity to give a huge thanks and acknowledgement to my two awesome mentors, Ilya Yaroshenko (9il) and Joseph Wakeling (WebDrake).

As I wrote my first line of D code this February, I have learned quite a lot during the last few months. GitHub can list all merged contributions, which might show that I got quite familiar with dlang over time:

https://github.com/search?l=&o=desc&q=author%3Awilzbach+is%3Amerged+user%3Adlang&ref=advsearch&s=comments&type=Issues&utf8=%E2%9C%93

… and with other D repositories:

https://github.com/search?l=D&o=desc&q=author%3Awilzbach+is%3Amerged&ref=searchresults&s=comments&type=Issues&utf8=%E2%9C%93

I am pretty sure you now know me from the NG, GitHub, IRC, Twitter, Bugzilla, DConf16, the DWiki, #d at StackOverflow or /r/d_language, so I will skip a further introduction ;-)

Over the next weeks and months I will continue my work on mir.random, which is supposed to supersede std.random, so in case you aren’t following the Mir project [1, 2], stay tuned!

Best regards,

Seb

[1] https://github.com/libmir/mir
[2] https://twitter.com/libmir |
August 22, 2016 Re: [GSoC] Mir.random.flex - Generic non-uniform random sampling | ||||
---|---|---|---|---|
| ||||
Posted in reply to Seb | On Monday, 22 August 2016 at 15:34:47 UTC, Seb wrote:
> Hey all,
>
> I am proud to publish a report of my GSoC work as two extensive blog posts, which explain non-uniform random sampling and the mir.random.flex package (part of Mir > 0.16-beta2):
>
> http://blog.mir.dlang.io/random/2016/08/19/intro-to-random-sampling.html
> http://blog.mir.dlang.io/random/2016/08/22/transformed-density-rejection-sampling.html
>
>
Thanks for the well-done blog posts, especially the first one.
Does your implementation make any use of CTFE?
|
August 22, 2016 Re: [GSoC] Mir.random.flex - Generic non-uniform random sampling | ||||
---|---|---|---|---|
| ||||
Posted in reply to Seb | On Monday, 22 August 2016 at 15:34:47 UTC, Seb wrote:
> Hey all,
>
> I am proud to publish a report of my GSoC work as two extensive blog posts, which explain non-uniform random sampling and the mir.random.flex package (part of Mir > 0.16-beta2):
>
> http://blog.mir.dlang.io/random/2016/08/19/intro-to-random-sampling.html
> http://blog.mir.dlang.io/random/2016/08/22/transformed-density-rejection-sampling.html

It's really nice to see that GSoC has been such a huge success so far. Everyone has done some really great work.

> Over the next weeks and months I will continue my work on mir.random, which is supposed to supersede std.random, so in case you aren’t following the Mir project [1, 2], stay tuned!
>
> Best regards,
>
> Seb
>
> [1] https://github.com/libmir/mir
> [2] https://twitter.com/libmir

I'm curious, have you come up with a solution to what is probably the biggest problem with std.random, i.e., it uses value types and copying? I remember a lot of discussion about this and it seemed at the time that the only really solid solution was to make all random generators classes, though I think DIP1000 *may* help here. |
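For readers unfamiliar with the copying problem mentioned above, here is a minimal sketch of its effect. It is written in Python purely for brevity and is not D code: `copy.deepcopy` stands in for the implicit copy that happens when a struct-based generator is passed by value, and the helper `take` is a name made up for this illustration.

```python
import copy
import random


def take(rng: random.Random, n: int) -> list[float]:
    """Draw n numbers from rng, advancing whatever generator object it receives."""
    return [rng.random() for _ in range(n)]


rng = random.Random(42)

# Value semantics (emulated): each call advances a private copy of the
# generator, so the caller's rng never moves and both calls see the same state.
by_value_first = take(copy.deepcopy(rng), 3)
by_value_second = take(copy.deepcopy(rng), 3)

# Reference semantics: both calls advance the one shared generator.
by_ref_first = take(rng, 3)
by_ref_second = take(rng, 3)

print(by_value_first == by_value_second)  # the "random" numbers repeat
print(by_ref_first == by_ref_second)      # the stream moves on
```

With emulated value semantics the two draws are identical, which is the silent correlation bug the question refers to; with a shared generator the second draw continues the stream.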
August 23, 2016 Re: [GSoC] Mir.random.flex - Generic non-uniform random sampling | ||||
---|---|---|---|---|
| ||||
Posted in reply to Seb | On Monday, 22 August 2016 at 15:34:47 UTC, Seb wrote:
> Hey all,
>
> I am proud to publish a report of my GSoC work as two extensive blog posts, which explain non-uniform random sampling and the mir.random.flex package (part of Mir > 0.16-beta2):
>
> http://blog.mir.dlang.io/random/2016/08/19/intro-to-random-sampling.html
> http://blog.mir.dlang.io/random/2016/08/22/transformed-density-rejection-sampling.html

Reddit: https://www.reddit.com/r/programming/comments/4z4sp7/an_introduction_to_nonuniform_random_sampling/ |
August 23, 2016 Re: [GSoC] Mir.random.flex - Generic non-uniform random sampling | ||||
---|---|---|---|---|
| ||||
Posted in reply to Meta | On Monday, 22 August 2016 at 18:09:28 UTC, Meta wrote:
> On Monday, 22 August 2016 at 15:34:47 UTC, Seb wrote:
>> Hey all,
>>
>> I am proud to publish a report of my GSoC work as two extensive blog posts, which explain non-uniform random sampling and the mir.random.flex package (part of Mir > 0.16-beta2):
>>
>> http://blog.mir.dlang.io/random/2016/08/19/intro-to-random-sampling.html
>> http://blog.mir.dlang.io/random/2016/08/22/transformed-density-rejection-sampling.html
>
> It's really nice to see that GSoC has been such a huge success so far. Everyone has done some really great work.
>
>
>> Over the next weeks and months I will continue my work on mir.random, which is supposed to supersede std.random, so in case you aren’t following the Mir project [1, 2], stay tuned!
>>
>> Best regards,
>>
>> Seb
>>
>> [1] https://github.com/libmir/mir
>> [2] https://twitter.com/libmir
>
> I'm curious, have you come up with a solution to what is probably the biggest problem with std.random, i.e., it uses value types and copying? I remember a lot of discussion about this and it seemed at the time that the only really solid solution was to make all random generators classes, though I think DIP1000 *may* help here.
This is an API problem, and it will not be fixed. Making D a scripting-like language is bad for science. For example, druntime (Fibers and Mutexes) is useless because it is too high-level and poorly featured at the same time.
The main problem with std.random is that std.random.uniform is broken in the context of non-uniform sampling. The same holds for 99% of uniform algorithms. They simply ignore the fact that, for example, for [0, 1) the exponent and mantissa should be generated separately, with appropriate probabilities for the exponent.
|
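One way to read the exponent/mantissa remark is sketched below. This is an illustration in Python, chosen only for brevity; it is not the code of std.random or mir.random, and the name `uniform01` is made up here. The common shortcut of dividing a 53-bit integer by 2^53 can only return equally spaced multiples of 2^-53, so most of the representable doubles near zero are never produced. The sketch instead chooses the binade [2^e, 2^(e+1)) with probability 2^e (its width) by counting fair coin flips, and then fills the 52 mantissa bits uniformly.

```python
import math
import random


def uniform01(rng: random.Random) -> float:
    """Sample a double from [0, 1), drawing exponent and mantissa separately.

    Assumption of this sketch: the exponent is capped at -1022, so
    subnormal numbers (and exactly 0.0) are never returned.
    """
    # Exponent: -1 with probability 1/2, -2 with probability 1/4, and so on.
    exponent = -1
    while exponent > -1022 and rng.getrandbits(1) == 0:
        exponent -= 1

    # Mantissa: 52 uniformly random fraction bits, giving a value in [1, 2).
    mantissa = rng.getrandbits(52)
    significand = 1.0 + mantissa / 2**52

    # Combine: significand * 2**exponent lies in [2**exponent, 2**(exponent + 1)).
    return math.ldexp(significand, exponent)


rng = random.Random(2016)
print([uniform01(rng) for _ in range(3)])
```

Because every binade keeps its full 52 bits of resolution, small values are as finely resolved as large ones, which matters when the uniform variate is later pushed through a transformation with a steep tail.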
August 23, 2016 Re: [GSoC] Mir.random.flex - Generic non-uniform random sampling | ||||
---|---|---|---|---|
| ||||
Posted in reply to Seb | On Monday, 22 August 2016 at 15:34:47 UTC, Seb wrote:
> I am proud to publish a report of my GSoC work as two extensive blog posts, which explain non-uniform random sampling and the mir.random.flex package (part of Mir > 0.16-beta2):
Fantastic work!
|
August 23, 2016 Re: [GSoC] Mir.random.flex - Generic non-uniform random sampling | ||||
---|---|---|---|---|
| ||||
Posted in reply to Seb | On Monday, 22 August 2016 at 15:34:47 UTC, Seb wrote:
> http://blog.mir.dlang.io/random/2016/08/19/intro-to-random-sampling.html
> http://blog.mir.dlang.io/random/2016/08/22/transformed-density-rejection-sampling.html
Found a typo:
Search for "performance boost performance boost"
|
August 23, 2016 Re: [GSoC] Mir.random.flex - Generic non-uniform random sampling | ||||
---|---|---|---|---|
| ||||
Posted in reply to Ilya Yaroshenko | On Tuesday, 23 August 2016 at 05:40:24 UTC, Ilya Yaroshenko wrote:
> This is an API problem, and it will not be fixed. Making D a scripting-like language is bad for science. For example, druntime (Fibers and Mutexes) is useless because it is too high-level and poorly featured at the same time.

Yes, this is not an issue that is immediately fixable without introducing other issues (e.g. defining everything as a class brings immediate issues related to heap allocation). In the long run it would obviously be nice to address that issue, but it would have been a major blocker to throw that onto Seb's shoulders (as we all recognized quite quickly when we started discussing it). It was (rightly) not the focus of this project.

For this reason, the random distributions introduced in mir are implemented as functors (as is the case with random distributions in C++11 <random>) rather than as ranges.

> The main problem with std.random is that std.random.uniform is broken in the context of non-uniform sampling. The same holds for 99% of uniform algorithms. They simply ignore the fact that, for example, for [0, 1) the exponent and mantissa should be generated separately, with appropriate probabilities for the exponent.

Just as a point of terminology: we should make clear here that this is about sampling from a non-uniform distribution. It shouldn't be confused with "sampling" in the sense of what is done by (say) `RandomSample`. |
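The functor design mentioned above can be sketched as follows. This is a Python illustration of the general pattern, not mir's actual API: the class name `Exponential` and its `rate` parameter are hypothetical, and the exponential distribution with inverse-transform sampling is used only because it is short. The point is that the distribution object stores parameters only and borrows the generator on each call, much as C++11 `<random>` distributions take the engine by reference in `operator()`, so no generator state is ever copied into it.

```python
import math
import random


class Exponential:
    """Callable distribution object: holds parameters, never the generator."""

    def __init__(self, rate: float) -> None:
        if rate <= 0:
            raise ValueError("rate must be positive")
        self.rate = rate

    def __call__(self, rng: random.Random) -> float:
        # Inverse-transform sampling: u is uniform in [0, 1), so
        # -log(1 - u) / rate follows an exponential distribution.
        u = rng.random()
        return -math.log1p(-u) / self.rate


rng = random.Random(1)          # the single, shared source of randomness
dist = Exponential(rate=2.0)    # parameters only
samples = [dist(rng) for _ in range(5)]
print(samples)
```

A range-style design would have to store the generator inside the range, which brings back the question of whether that stored generator is a copy or a reference.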
August 23, 2016 Re: [GSoC] Mir.random.flex - Generic non-uniform random sampling | ||||
---|---|---|---|---|
| ||||
Posted in reply to jmh530 | On Monday, 22 August 2016 at 17:13:10 UTC, jmh530 wrote:
> On Monday, 22 August 2016 at 15:34:47 UTC, Seb wrote:
>> Hey all,
>>
>> I am proud to publish a report of my GSoC work as two extensive blog posts, which explain non-uniform random sampling and the mir.random.flex package (part of Mir > 0.16-beta2):
>>
>> http://blog.mir.dlang.io/random/2016/08/19/intro-to-random-sampling.html
>> http://blog.mir.dlang.io/random/2016/08/22/transformed-density-rejection-sampling.html
>>
>>
>
> Thanks for the well-done blog posts, especially the first one.

I am glad to hear this!

> Does your implementation make any use of CTFE?

If you refer to whether the intervals can be calculated at CT, unfortunately it can't be used due to four main reasons:

- FP-math at CT (it's already hard to deal with at RT, see e.g. my recent complaint [1]) - the problem is that the Flex algorithm is very sensitive to numerical errors and thus an erroneous change at the lowest end (e.g. 10^-15) can lead to totally different numbers with a seeded random engine
- std.container due to pointers (I doubt this can/will be fixed in the near future)
- std.math due to inline assembly and other tricks (this can be fixed and I will submit a couple of PRs soon)
- speed of the CTFE engine (see e.g. [2] for std.regex)

That being said, CTFE is of course used to compute mixins and constants and to specialize functions. Moreover, thanks to all the speed-ups described in the second blog post, constructing the intervals takes about 0.1ms, so for the majority of users it shouldn't even be noticeable, and the tiny minority for whom it is can still manually inline the intervals.

[1] http://forum.dlang.org/post/hjaiavlfkoamenidomsa@forum.dlang.org
[2] http://forum.dlang.org/post/iqcrnokalollrejcabad@forum.dlang.org |
August 23, 2016 Re: [GSoC] Mir.random.flex - Generic non-uniform random sampling | ||||
---|---|---|---|---|
| ||||
Posted in reply to Nordlöw | On Tuesday, 23 August 2016 at 08:10:50 UTC, Nordlöw wrote:
> On Monday, 22 August 2016 at 15:34:47 UTC, Seb wrote:
>> http://blog.mir.dlang.io/random/2016/08/19/intro-to-random-sampling.html
>> http://blog.mir.dlang.io/random/2016/08/22/transformed-density-rejection-sampling.html
>
> Found at typo:
>
> Search for "performance boost performance boost"

Thanks! Fixed.

Btw in case someone is interested, the blog posts are written in Github-flavored Markdown with a couple of custom Jekyll extensions (e.g. the Math formulas are rendered on the server with KaTeX [1]):

https://github.com/libmir/blog/blob/master/_posts/2016-08-19-transformed-density-rejection-sampling.md

[1] https://khan.github.io/KaTeX/ |