Thread overview | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
|
September 13, 2019 Vitter's algorithm for random sampling | ||||
---|---|---|---|---|
| ||||
https://getkerf.wordpress.com/2016/03/30/the-best-algorithm-no-one-knows-about/ Interesting. I think it has a place in std.random, and also it might help improve some of the existing stuff in there. |
September 14, 2019 Re: Vitter's algorithm for random sampling | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Friday, 13 September 2019 at 21:49:45 UTC, Andrei Alexandrescu wrote: > https://getkerf.wordpress.com/2016/03/30/the-best-algorithm-no-one-knows-about/ > > Interesting. I think it has a place in std.random, and also it might help improve some of the existing stuff in there. tsv-sample in eBay's tsv utilities contains a number of sampling implementations, including several classic algorithms. They use std.random facilities wherever possible. It does not contain an implementation of Vitter's algorithm D, because algorithm D requires knowing the record set size ahead of time. However one of the algorithms, bernoulli skip sampling, does use the "skip" mechanism listed in section 2, para 2 of Vitter's paper. The code documentation contains references for the different algorithms used. Like all the tsv-utils stuff, they are fast. No published benchmarks though. Links: * User documentation: https://github.com/eBay/tsv-utils/blob/master/docs/ToolReference.md#tsv-sample-reference * Source code: https://github.com/eBay/tsv-utils/blob/master/tsv-sample/src/tsv_utils/tsv-sample.d * Code documentation (via Adam's doc generator): https://tsv-utils.dpldocs.info/tsv_utils.tsv_sample.html --Jon |
September 14, 2019 Re: Vitter's algorithm for random sampling | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Friday, 13 September 2019 at 21:49:45 UTC, Andrei Alexandrescu wrote: > https://getkerf.wordpress.com/2016/03/30/the-best-algorithm-no-one-knows-about/ > > Interesting. I think it has a place in std.random, and also it might help improve some of the existing stuff in there. As far as I know, there is one in mir. http://mir.dlang.io/mir_random_algorithm.html#.VitterStrides |
September 14, 2019 Re: Vitter's algorithm for random sampling | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Friday, 13 September 2019 at 21:49:45 UTC, Andrei Alexandrescu wrote:
> https://getkerf.wordpress.com/2016/03/30/the-best-algorithm-no-one-knows-about/
>
> Interesting. I think it has a place in std.random, and also it might help improve some of the existing stuff in there.
Er .... isn't that exactly what IS in std.random? I ask because I was the one who implemented Vitter's algorithm there, way back in 2012 :-)
|
September 16, 2019 Re: Vitter's algorithm for random sampling | ||||
---|---|---|---|---|
| ||||
Posted in reply to Joseph Rushton Wakeling | On Saturday, 14 September 2019 at 10:26:42 UTC, Joseph Rushton Wakeling wrote: > On Friday, 13 September 2019 at 21:49:45 UTC, Andrei Alexandrescu wrote: >> https://getkerf.wordpress.com/2016/03/30/the-best-algorithm-no-one-knows-about/ >> >> Interesting. I think it has a place in std.random, and also it might help improve some of the existing stuff in there. > > Er .... isn't that exactly what IS in std.random? I ask because I was the one who implemented Vitter's algorithm there, way back in 2012 :-) I mean, I'm not saying that I feel unappreciated or anything, but I even blogged about it at the time ;-) http://braingam.es/2012/07/sampling-d/ |
September 16, 2019 Re: Vitter's algorithm for random sampling | ||||
---|---|---|---|---|
| ||||
Posted in reply to Joseph Rushton Wakeling | On Monday, 16 September 2019 at 22:50:56 UTC, Joseph Rushton Wakeling wrote:
> On Saturday, 14 September 2019 at 10:26:42 UTC, Joseph Rushton Wakeling wrote:
>> On Friday, 13 September 2019 at 21:49:45 UTC, Andrei Alexandrescu wrote:
>>> https://getkerf.wordpress.com/2016/03/30/the-best-algorithm-no-one-knows-about/
>>>
>>> Interesting. I think it has a place in std.random, and also it might help improve some of the existing stuff in there.
>>
>> Er .... isn't that exactly what IS in std.random? I ask because I was the one who implemented Vitter's algorithm there, way back in 2012 :-)
>
> I mean, I'm not saying that I feel unappreciated or anything, but I even blogged about it at the time ;-)
> http://braingam.es/2012/07/sampling-d/
It says Andrei even merged them...lol :) It was 7 years ago though. Sometimes I can't even remember what I had for dinner yesterday.
|
September 16, 2019 Re: Vitter's algorithm for random sampling | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan Marler | On Monday, 16 September 2019 at 22:56:15 UTC, Jonathan Marler wrote: > It says Andrei even merged them...lol :) He did. We even had some pre-submission discussions around it :-P https://forum.dlang.org/post/mailman.1737.1334433366.4860.digitalmars-d@puremagic.com |
September 16, 2019 Re: Vitter's algorithm for random sampling | ||||
---|---|---|---|---|
| ||||
Posted in reply to Joseph Rushton Wakeling | On 9/16/19 5:05 PM, Joseph Rushton Wakeling wrote:
> On Monday, 16 September 2019 at 22:56:15 UTC, Jonathan Marler wrote:
>> It says Andrei even merged them...lol :)
>
> He did. We even had some pre-submission discussions around it :-P
> https://forum.dlang.org/post/mailman.1737.1334433366.4860.digitalmars-d@puremagic.com
>
Ouch. Apologies and thanks again for the great work!
|
September 17, 2019 Re: Vitter's algorithm for random sampling | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrei Alexandrescu | On Tuesday, 17 September 2019 at 00:28:32 UTC, Andrei Alexandrescu wrote:
> On 9/16/19 5:05 PM, Joseph Rushton Wakeling wrote:
>> On Monday, 16 September 2019 at 22:56:15 UTC, Jonathan Marler wrote:
>>> It says Andrei even merged them...lol :)
>>
>> He did. We even had some pre-submission discussions around it :-P
>> https://forum.dlang.org/post/mailman.1737.1334433366.4860.digitalmars-d@puremagic.com
>>
>
> Ouch. Apologies and thanks again for the great work!
Oh, no apology needed. I'm not offended at all, just very amused :-)
I _am_ proud of that piece of work, though. It was my first contribution to D, and of all the pieces code I've ever written it's one of the few that I think still unambiguously feels like it provided lasting value. And I think your engagement with me, and help in clarifying the right things to do (together with others here on this forum) was very important both in terms of what I learned from it and also in helping me believe that I was genuinely capable of making worthwhile contributions to high-quality software projects.
So thanks for that. In retrospect it really feels like an important turning-point for me career-wise :-)
|
September 23, 2019 Re: Vitter's algorithm for random sampling | ||||
---|---|---|---|---|
| ||||
Posted in reply to Alex | On Saturday, 14 September 2019 at 06:20:30 UTC, Alex wrote: > On Friday, 13 September 2019 at 21:49:45 UTC, Andrei Alexandrescu wrote: >> https://getkerf.wordpress.com/2016/03/30/the-best-algorithm-no-one-knows-about/ >> >> Interesting. I think it has a place in std.random, and also it might help improve some of the existing stuff in there. > > As far as I know, there is one in mir. > http://mir.dlang.io/mir_random_algorithm.html#.VitterStrides more fresh link http://docs.random.dlang.io/latest/mir_random_algorithm.html |
Copyright © 1999-2021 by the D Language Foundation