November 06, 2012 Re: Normal/Gaussian random number generation for D | ||||
---|---|---|---|---|
| ||||
Posted in reply to jerro | Sorry for delayed response, last week was an unfortunate mix of getting sick and then having a bunch of work-related stuff ... On 10/26/2012 11:03 PM, jerro wrote: >> Well, as I defined the function, the UniformRNG input has a default value of >> rndGen, so if you call it just as normal(mean, sigma); it should work as >> intended. But if there's some factor which means it would be better to define >> a separate function which doesn't receive the RNG as input, I can do that. > > I don't see any downside to this. Which "this" do you mean? My current approach, or the adding of an extra separate function? :-) > I have only been thinking about the Ziggurat algorithm, but you are right, it > does depend on the details of the technique. For Box-Muller (and other engines > that cache samples) it only makes sense to compute the first samples in the > opCall. But for the Ziggurat algorithm, tables that must be computed before you > can start sampling aren't changed during sampling and computing the tables > doesn't require any additional arguments. So it makes the most sense for those > tables to be initialized in the struct's constructor in the struct based API. So we should assume by default then that the struct's constructor should take an RNG as input, to enable it to calculate these first values if it needs to? > In my previous post I was talking about initializing static instances of the > engine used in the normal() function. The advantage of initializing in a static > constructor is that you don't need an additional check every time the normal() > function is called. But because we will also have a struct based API, that will > not require such checks (at least not for all engines), this isn't really that > important. So we can also initialize the global engine instance in a call to > normal(), if this simplifies things. I guess my feeling here is that the values generated by an RNG should depend on when it is called, and not at all on when it is instantiated. i.e. if I do something like auto nrng = Normal!()(0, 1); writeln( uniform(0.0, 1.0) ); writeln( uniform(0.0, 1.0) ); writeln( nrng() ); writeln( nrng() ); I should get the same output as if I do, writeln( uniform(0.0, 1.0) ); writeln( uniform(0.0, 1.0) ); auto nrng = Normal!()(0, 1); writeln( nrng() ); writeln( nrng() ); You can also think that if I change from e.g. auto nrng = Normal!(real, Engine1)(0, 1); writeln( uniform(0.0, 1.0) ); writeln( uniform(0.0, 1.0) ); writeln( nrng() ); writeln( nrng() ); to auto nrng = Normal!(real, Engine2)(0, 1); writeln( uniform(0.0, 1.0) ); writeln( uniform(0.0, 1.0) ); writeln( nrng() ); writeln( nrng() ); ... then I would expect to see different results from the normal RNG but identical results from uniform(). If the constructor of the normal engine calls the RNG, the uniform() results will change, no? >> I'm going to turn that into patches for Phobos which I'll put up on GitHub in >> the next days, so we can pull and push and test/write together as needed. > > Maybe we should also have a separate file in a separate branch for tests. There > will probably be a lot of code needed to test this well and the tests could take > a long time to run, so I don't think they should go into unit test blocks (we > can put some simpler tests in unit test blocks later). It may also be useful to > use external libraries such as dstats for tests. Yes, a random.d test suite probably should be another project. Regardless of tests, let's focus for now on getting the API right for this case of non-uniform-random-number-generator-with-internal-engine, with normal and exponential as the initial cases. We can always label some engines as "use at own risk" in the short term! |
November 06, 2012 Re: Normal/Gaussian random number generation for D | ||||
---|---|---|---|---|
| ||||
Posted in reply to Joseph Rushton Wakeling | >> I don't see any downside to this. > > Which "this" do you mean? My current approach, or the adding of an extra separate function? :-) Your current approach. >> I have only been thinking about the Ziggurat algorithm, but you are right, it >> does depend on the details of the technique. For Box-Muller (and other engines >> that cache samples) it only makes sense to compute the first samples in the >> opCall. But for the Ziggurat algorithm, tables that must be computed before you >> can start sampling aren't changed during sampling and computing the tables >> doesn't require any additional arguments. So it makes the most sense for those >> tables to be initialized in the struct's constructor in the struct based API. > So we should assume by default then that the struct's constructor should take an RNG as input, to enable it to calculate these first values if it needs to? No, I think that if the engine defines initialize() function (with no parameters), it should be called in the constructor of Normal. I don't think the constructor of Normal should take an RNG as input. I think what you currently do in normal.d is fine, I would just add something like this: static if(is(typeof(_engine.initialize()))) _engine.initialize(); to Normal's constructor. Then ZigguratEngine can define initialize() and do its initialization there (it doesn't need a uniform RNG for initialization) and NormalBoxMullerEngine can remain unchanged. >> In my previous post I was talking about initializing static instances of the >> engine used in the normal() function. The advantage of initializing in a static >> constructor is that you don't need an additional check every time the normal() >> function is called. But because we will also have a struct based API, that will >> not require such checks (at least not for all engines), this isn't really that >> important. So we can also initialize the global engine instance in a call to >> normal(), if this simplifies things. > I guess my feeling here is that the values generated by an RNG should depend on when it is called, and not at all on when it is instantiated. > > i.e. if I do something like > > auto nrng = Normal!()(0, 1); > writeln( uniform(0.0, 1.0) ); > writeln( uniform(0.0, 1.0) ); > writeln( nrng() ); > writeln( nrng() ); > > I should get the same output as if I do, > > writeln( uniform(0.0, 1.0) ); > writeln( uniform(0.0, 1.0) ); > auto nrng = Normal!()(0, 1); > writeln( nrng() ); > writeln( nrng() ); > > You can also think that if I change from e.g. > > auto nrng = Normal!(real, Engine1)(0, 1); > writeln( uniform(0.0, 1.0) ); > writeln( uniform(0.0, 1.0) ); > writeln( nrng() ); > writeln( nrng() ); > > to > > auto nrng = Normal!(real, Engine2)(0, 1); > writeln( uniform(0.0, 1.0) ); > writeln( uniform(0.0, 1.0) ); > writeln( nrng() ); > writeln( nrng() ); > > ... then I would expect to see different results from the normal RNG but identical results from uniform(). If the constructor of the normal engine calls the RNG, the uniform() results will change, no? I was only talking about the part of initialization that doesn't use a RNG. I agree that everything that uses a RNG should be done in opCall (or inside a normal() function in the function interface). For Box-Muller, I think the approach you currently use in NormalBoxMullerEngine is the most reasonable one. But a Ziggurat engines needs to compute some tables before it can start generating samples. It doesn't need a RNG to do that and the tables do not change after initialization. I think it's obvious that that initialization that doesn't need a RNG should be done in Normal's constructor for the struct interface. What is not so obvious is where the initialization of static data that doesn't require a RNG should be done for function interface. That initialization can be done in normal() function, or it can be done in a static constructor. If you do it in normal(), you need to do an extra check on each call to normal(). This isn't really a problem as long as the struct's opCall and the version of normal() that takes the engine as a parameter don't do such redundant checks. Then the users that care about the difference in performance can just use one of these interfaces. > Yes, a random.d test suite probably should be another project. > > Regardless of tests, let's focus for now on getting the API right for this case of non-uniform-random-number-generator-with-internal-engine, with normal and exponential as the initial cases. In general I like the API in file normal.d attached to your original post. I think the engines should have an option to do some initialization in Normal's constructor, though. We could achieve that by calling _engine.initialize in Normal's constructors, if such method exists. This method would also need to be called on the static instance of normal engine used in the normal() function. We could add something like this to the first version of normal: static if(is(typeof(engine.initialize()))) { static bool isInitialized; if(!isInitialized) engine.initialize(); } Another option would be to do this: struct GlobalEngine(Engine) { Engine instance; static this() { instance.initialize(); } } And then inside the version of normal that doesn't take an engine as a parameter: alias GlobalEngine!NormalRandomNumberEngine E; return normal(mean, sigma, E.instance, urng); The users would need to construct their own instance of engine to use the function that takes engine as a parameter. So it would make sense to add helper functions for creating engine instances. There's one change that I think would make the API more convenient. Normal struct and the engine don't store an instance of a RNG , so they don't need to take it as a template parameter. We could make opCall methods templates instead. That way the users would never need to explicitly specify the type of the RNG. |
November 06, 2012 Re: Normal/Gaussian random number generation for D | ||||
---|---|---|---|---|
| ||||
Posted in reply to jerro | On 11/06/2012 07:14 PM, jerro wrote: > I was only talking about the part of initialization that doesn't use a RNG. I > agree that everything that uses a RNG should be done in opCall (or inside a > normal() function in the function interface). For Box-Muller, I think the > approach you currently use in NormalBoxMullerEngine is the most reasonable one. > But a Ziggurat engines needs to compute some tables before it can start > generating samples. It doesn't need a RNG to do that and the tables do not > change after initialization. Ahh, OK, clear. Then obviously your suggested approach is correct. > There's one change that I think would make the API more convenient. Normal > struct and the engine don't store an instance of a RNG , so they don't need to > take it as a template parameter. We could make opCall methods templates instead. > That way the users would never need to explicitly specify the type of the RNG. I guess I was concerned about the possibility of a sequence of normal random numbers generated by a mix of different RNGs, but to be honest, if someone really, really wants to do that, I'm not sure I should stop them. :-) |
November 06, 2012 Re: Normal/Gaussian random number generation for D | ||||
---|---|---|---|---|
| ||||
On 11/06/2012 07:39 PM, Joseph Rushton Wakeling wrote:
> Ahh, OK, clear. Then obviously your suggested approach is correct.
... to initialize in the constructor, that is.
|
November 11, 2012 Re: Normal/Gaussian random number generation for D | ||||
---|---|---|---|---|
| ||||
Posted in reply to jerro | On 11/06/2012 07:14 PM, jerro wrote: > In general I like the API in file normal.d attached to your original post. I > think the engines should have an option to do some initialization in Normal's > constructor, though. We could achieve that by calling _engine.initialize in > Normal's constructors, if such method exists. What's wrong with such initialization being in the constructor of the relevant NormalEngine? I think that was your original idea, and I derailed it because of my misunderstanding of what you wanted to initialize. > The users would need to construct their own instance of engine to use the > function that takes engine as a parameter. So it would make sense to add helper > functions for creating engine instances. In general the idea is that the engine should be something hidden; where you need to use it, you just need to pass the name as a template parameter; it should be rare that you really need to manually instantiate your own engine. But we can add such a helper function if you like. What I'm frustrated about is that as-is it's not possible to just have auto nrng = Normal(mean, sigma); ... but you have instead to write, auto nrng = Normal!()(mean, sigma); despite the fact that Normal has default template parameters given. So maybe a helper function normalRNG which returns an instance of Normal would also be helpful. > There's one change that I think would make the API more convenient. Normal > struct and the engine don't store an instance of a RNG , so they don't need to > take it as a template parameter. We could make opCall methods templates instead. > That way the users would never need to explicitly specify the type of the RNG. Good call. I've uploaded a tweaked version based on your comments here: https://github.com/WebDrake/phobos/tree/normal ... so feel free to pull, further revise and add in your Ziggurat implementation :-) I haven't yet written any of the helper functions you suggest, but will do so shortly based on your response to my above remarks. |
November 11, 2012 Re: Normal/Gaussian random number generation for D | ||||
---|---|---|---|---|
| ||||
Posted in reply to Joseph Rushton Wakeling | > What's wrong with such initialization being in the constructor of the relevant NormalEngine? I think that was your original idea, and I derailed it because of my misunderstanding of what you wanted to initialize. The problem is that structs can't have constructors with no parameters. > In general the idea is that the engine should be something hidden; where you need to use it, you just need to pass the name as a template parameter; it should be rare that you really need to manually instantiate your own engine. But we can add such a helper function if you like. I agree that those helper functions are not very important, but on the other hand I don't think adding them costs us anything. But maybe we should take care of the other stuff first. > What I'm frustrated about is that as-is it's not possible to just have > > auto nrng = Normal(mean, sigma); > > ... but you have instead to write, > > auto nrng = Normal!()(mean, sigma); > > despite the fact that Normal has default template parameters given. So maybe a helper function normalRNG which returns an instance of Normal would also be helpful. Yes, I think it would be. > I've uploaded a tweaked version based on your comments here: > https://github.com/WebDrake/phobos/tree/normal > > ... so feel free to pull, further revise and add in your Ziggurat implementation :-) I'll add the Zigggurat implementation, probably tomorrow. |
November 11, 2012 Re: Normal/Gaussian random number generation for D | ||||
---|---|---|---|---|
| ||||
Posted in reply to jerro | On 11/11/2012 05:30 PM, jerro wrote: > The problem is that structs can't have constructors with no parameters. Aaack, I'd completely forgotten that. OK, so we can go with the initialize function as you suggested. I suggest you add it in together with Ziggurat. > I agree that those helper functions are not very important, but on the other > hand I don't think adding them costs us anything. But maybe we should take care > of the other stuff first. Well, they should be trivial enough to write, let's get Ziggurat in and working first and then add them. > I'll add the Zigggurat implementation, probably tomorrow. Excellent! Will look forward to seeing that. |
November 20, 2012 Re: Normal/Gaussian random number generation for D | ||||
---|---|---|---|---|
| ||||
Just to update everyone, Jerro and I have been knocking around a few discussions and code tweaks on GitHub: https://github.com/WebDrake/phobos/pull/1#issuecomment-10319570 If anyone wants to comment or give feedback, please feel free. It's surely useful that we get as many concerns covered and addressed before submitting an official Phobos pull request! |
November 20, 2012 Re: Normal/Gaussian random number generation for D | ||||
---|---|---|---|---|
| ||||
Posted in reply to Joseph Rushton Wakeling | On Tuesday, 20 November 2012 at 15:04:53 UTC, Joseph Rushton Wakeling wrote:
> Just to update everyone, Jerro and I have been knocking around a few discussions and code tweaks on GitHub:
> https://github.com/WebDrake/phobos/pull/1#issuecomment-10319570
>
> If anyone wants to comment or give feedback, please feel free. It's surely useful that we get as many concerns covered and addressed before submitting an official Phobos pull request!
I agree, it's a good idea to ask the community for feedback. I've replied to you in the github thread, but maybe it would be better to continue the discussion here?
|
November 20, 2012 Re: Normal/Gaussian random number generation for D | ||||
---|---|---|---|---|
| ||||
Posted in reply to jerro | On 11/20/2012 07:00 PM, jerro wrote:
> I agree, it's a good idea to ask the community for feedback. I've replied to you
> in the github thread, but maybe it would be better to continue the discussion here?
I replied in the GitHub thread before reading this email! :-P
My feeling is that the core normal RNG code is probably fairly well defined, and won't need too many more tweaks. What I see lacking right now are unittests (so far the ones I've written have just been to test that the type inference rules work correctly) and possibly also some additional necessary type constraints (e.g. it would be good to have something like isNormalRandomNumberEngine, and I don't know if you want to add some extra type checks to some of your Ziggurat-related functions).
I'd particularly appreciate it if community members could give the type checking a once-over to see if it's adequate/appropriate, whether there are any important checks missing and whether in some cases it's actually too strict (e.g. could we reasonably replace the isNumeric checks with isScalar?).
As I wrote on GitHub, I also think it's worth considering if some of the helper functions/structs introduced in these patches could better be moved out to other parts of Phobos for general-purpose use, e.g. your isPowerOfTwo.
Lastly, we need to make sure that this new functionality is well documented. Perhaps a good start here could be for people to take a look at the code and describe what isn't clear or easy to understand just from looking ... ?
Best wishes,
-- Joe
|
Copyright © 1999-2021 by the D Language Foundation