Thread overview
Contribution to cover C++11 functionality
Nov 30, 2016
Ilya Yaroshenko
Nov 30, 2016
Timon Gehr
Nov 30, 2016
jmh530
Nov 30, 2016
Ilya Yaroshenko
[OT] Naming
Nov 30, 2016
Timon Gehr
Dec 01, 2016
Jethro
November 30, 2016
Hi,

Mir Random [1, D] implements 16 of the 20 C++ [2] random number distributions.

The remaining 4 are:

1. piecewise_constant_distribution
2. piecewise_linear_distribution
3. binomial_distribution
4. negative_binomial_distribution

[1] http://docs.random.dlang.io/latest/mir_random_variable.html
[2] http://en.cppreference.com/w/cpp/concept/RandomNumberDistribution
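
For reference, here is a minimal C++11 sketch of how the four distributions listed above are used through the standard `<random>` header. The helper name `draw_missing_four`, the `Samples` struct, and all parameter values are made up for illustration; nothing here reflects how Mir would implement them.

```cpp
#include <random>
#include <vector>

// One variate from each of the four C++11 distributions named in the list.
struct Samples {
    double piecewise_constant;
    double piecewise_linear;
    int binomial;
    int negative_binomial;
};

// Hypothetical helper: all parameters below are arbitrary illustration values.
Samples draw_missing_four(unsigned seed) {
    std::mt19937 gen(seed);

    // Interval boundaries 0, 1, 3 with one weight per interval:
    // the density is constant on [0, 1) and on [1, 3).
    std::vector<double> bounds = {0.0, 1.0, 3.0};
    std::vector<double> interval_weights = {1.0, 2.0};
    std::piecewise_constant_distribution<double> pc(
        bounds.begin(), bounds.end(), interval_weights.begin());

    // Same boundaries, but one weight per boundary point:
    // the density is linearly interpolated between the points.
    std::vector<double> point_weights = {0.0, 1.0, 0.0};
    std::piecewise_linear_distribution<double> pl(
        bounds.begin(), bounds.end(), point_weights.begin());

    // Number of successes in 10 trials with success probability 0.5.
    std::binomial_distribution<int> bin(10, 0.5);

    // Number of failures before the 3rd success, success probability 0.5.
    std::negative_binomial_distribution<int> nbin(3, 0.5);

    return Samples{pc(gen), pl(gen), bin(gen), nbin(gen)};
}
```

Calling `draw_missing_four(42)` returns one variate per distribution; the exact values depend on the standard library implementation, since C++11 fixes the engines' output but not the distributions' algorithms.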

Any takers?

Thanks,
Ilya

November 30, 2016
On 30.11.2016 16:22, Ilya Yaroshenko wrote:
> [...]

Unrelated question: Why are the samplers called 'random variables'?
I'd advise consistently using the naming convention of 'Discrete' and renaming the module to 'mir.random.distributions' or similar.
November 30, 2016
On Wednesday, 30 November 2016 at 20:36:34 UTC, Timon Gehr wrote:
>
> Unrelated question: Why are the samplers called 'random variables'?
> I'd advise consistently using the naming convention of 'Discrete' and renaming the module to 'mir.random.distributions' or similar.

It could also lead to confusion with distributions over vectors or matrices, such as the multivariate normal or Wishart.
November 30, 2016
On Wednesday, 30 November 2016 at 20:36:34 UTC, Timon Gehr wrote:
> On 30.11.2016 16:22, Ilya Yaroshenko wrote:
>> [...]
>
> Unrelated question: Why are the samplers called 'random variables'?
> I'd advise consistently using the naming convention of 'Discrete' and renaming the module to 'mir.random.distributions' or similar.

"random distribution" is like "accidental distribution". "Random variable" is the much more frequently used term in the stats world (stats world != stats packages). It also better describes the functionality the module provides. "Distribution" may be used for the PDF or for the CDF (or the pair). "Probability distribution" and "random variable" look better (IMHO) than "random distribution", which has another meaning in the stats world: a distribution chosen randomly from a class of distributions, for example variance-mean mixtures. --Ilya
November 30, 2016
On 30.11.2016 22:12, Ilya Yaroshenko wrote:
> On Wednesday, 30 November 2016 at 20:36:34 UTC, Timon Gehr wrote:
>> On 30.11.2016 16:22, Ilya Yaroshenko wrote:
>>> [...]
>>
>> Unrelated question: Why are the samplers called 'random variables'?
>> I'd advise consistently using the naming convention of 'Discrete' and
>> renaming the module to 'mir.random.distributions' or similar.
>
> "random distribution" is like "accidental distribution".

I wasn't aware that you want to read your package names like that (such a reading does not seem to be possible for the other mir packages).

> "Random variable" is the much more frequently used term in the stats world
> (stats world != stats packages).

I'm familiar with statistics, and I agree that the terminology used should be 'right'.

> It also better describes the functionality the
> module provides.

Not really. The module provides samplers/generators for random variates drawn from certain probability distributions. But even if the module name stays, the /generators/ shouldn't be called "Variable".

> "Distribution" may be used for the PDF or for the CDF (or the
> pair).

The distribution is the fundamental thing, the PDF/CDF characterize it.
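
To spell that out in standard notation (the symbols $X$, $P_X$, $F_X$ and $f_X$ are introduced here for illustration and do not appear elsewhere in the thread): the distribution of a real-valued random variable is a probability measure, the CDF is that measure evaluated on half-lines, and a PDF, when one exists, is a density whose integral gives the CDF.

```latex
\begin{align}
  P_X(B) &= \Pr(X \in B)
    && \text{distribution of } X \text{, for Borel sets } B \subseteq \mathbb{R} \\
  F_X(x) &= P_X\bigl((-\infty, x]\bigr) = \Pr(X \le x)
    && \text{CDF, which determines } P_X \text{ uniquely} \\
  F_X(x) &= \int_{-\infty}^{x} f_X(t)\,dt
    && \text{PDF } f_X \text{, if } P_X \text{ is absolutely continuous}
\end{align}
```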

December 01, 2016
On Wednesday, 30 November 2016 at 21:12:16 UTC, Ilya Yaroshenko wrote:
> "random distribution" is like "accidental distribution".

Not really.  I would use "randomly chosen distribution" for that.

> "Random variable" is the much more frequently used term in the stats world (stats world != stats packages). It also better describes the functionality the module provides. "Distribution" may be used for the PDF or for the CDF (or the pair). "Probability distribution" and "random variable" look better (IMHO) than "random distribution", which has another meaning in the stats world: a distribution chosen randomly from a class of distributions, for example variance-mean mixtures. --Ilya

"Random variable" is obviously the strict mathematical term, but there are a few reasons why "distribution" might be a better term to use in the API:

  * many users will not be statisticians; "distribution" is likely to be
    a more easily-understood term, while "variable" may confuse some users
    since it may be mixed up with 'variable' as in a program variable;

  * outside of mathematics many researchers use the term "distribution"
    quite casually and readily;

  * the C++11 standard calls these entities distributions, so calling the
    D functionality by similar names allows for easy understanding and
    adaptation.

(Strictly speaking the C++11 standard uses 'distribution' to refer to functors that take a source of uniformly-distributed random bits as input, and use that to generate variates with other statistical properties.)
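
As a rough illustration of that functor pattern, the sketch below defines a hand-written sampler that turns engine output into exponential variates by inverting the CDF, and a helper that works equally with it and with `std::exponential_distribution`. The names `ExponentialSampler` and `sample_mean` are hypothetical, and the class deliberately omits most of what the standard's RandomNumberDistribution requirements ask for (`param_type`, `reset()`, `min()`/`max()`, stream operators).

```cpp
#include <cmath>
#include <random>

// Hypothetical C++11-style "distribution": a functor that stores the
// distribution's parameter and maps uniform engine output to variates.
class ExponentialSampler {
public:
    using result_type = double;

    explicit ExponentialSampler(double rate) : rate_(rate) {}

    template <class Engine>
    result_type operator()(Engine& engine) const {
        // Turn raw engine bits into u in [0, 1), then apply the inverse
        // CDF of the exponential distribution: x = -ln(1 - u) / rate.
        double u = std::generate_canonical<double, 53>(engine);
        return -std::log1p(-u) / rate_;
    }

private:
    double rate_;
};

// Averages n draws from any functor callable as sampler(engine).
template <class Sampler>
double sample_mean(Sampler sampler, unsigned seed, int n) {
    std::mt19937_64 engine(seed);
    double sum = 0.0;
    for (int i = 0; i < n; ++i) {
        sum += sampler(engine);
    }
    return sum / n;
}
```

Both `sample_mean(ExponentialSampler(2.0), 1, 100000)` and `sample_mean(std::exponential_distribution<double>(2.0), 1, 100000)` should come out close to the theoretical mean of 0.5, which is the sense in which the object behaves as a generator of variates rather than as a single random variable.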
December 01, 2016
On Thursday, 1 December 2016 at 13:42:32 UTC, Joseph Rushton Wakeling wrote:
> On Wednesday, 30 November 2016 at 21:12:16 UTC, Ilya Yaroshenko wrote:
>> [...]
>
> Not really.  I would use "randomly chosen distribution" for that.
>
> [...]

There is a problem with `distribution` in that it also has other meanings. `Random variable` is pretty well established.
December 01, 2016
On Thursday, 1 December 2016 at 15:28:28 UTC, Jethro wrote:
> There is a problem with `distribution` in that it also has other meanings.

Yes, but in context, is `random distribution` actually ambiguous?  What might people confuse it with?

> `Random variable` is pretty well established

But is matching the strictest mathematical terminology worth it, compared to matching the terminology of a well-established API standard?