Thread overview
Is this a bug? RandomSample claims it needs input range but actually needs forward range
Jun 19, 2013
finalpatch
Jun 19, 2013
Jonathan M Davis
Jun 19, 2013
finalpatch
Jun 19, 2013
Jonathan M Davis
Jun 19, 2013
finalpatch
Jun 19, 2013
finalpatch
June 19, 2013
in the declaration of RandomSample it only checks for isInputRange, but in the implementation it calls the .save() method which is only available in forward ranges.

as a result this code does not compile:

void main()
{
	auto a = File("test.d", "r").byLine();
	auto s = randomSample(a,1,5);
	writeln(s);
}

Is this a bug or I missed something?
June 19, 2013
On Wednesday, June 19, 2013 03:56:43 finalpatch wrote:
> in the declaration of RandomSample it only checks for isInputRange, but in the implementation it calls the .save() method which is only available in forward ranges.
> 
> as a result this code does not compile:
> 
> void main()
> {
> auto a = File("test.d", "r").byLine();
> auto s = randomSample(a,1,5);
> writeln(s);
> }
> 
> Is this a bug or I missed something?

Any function which uses save requires at least a forward range, and if its template constraint requires only an input range, then its template constraint is wrong. _Any_ function whose template constraint does not actually guarantee that anything that passes the constraint will compile with that function is buggy.

- Jonathan M Davis
June 19, 2013
On 06/19/2013 02:56 AM, finalpatch wrote:
> in the declaration of RandomSample it only checks for isInputRange, but in the implementation it calls the .save() method which is only available in forward ranges.

What compiler and version are you using?  This was reported here: http://d.puremagic.com/issues/show_bug.cgi?id=10265

... and fixed in this pull request: https://github.com/D-Programming-Language/phobos/pull/1332
June 19, 2013
On Wednesday, 19 June 2013 at 02:32:46 UTC, Jonathan M Davis wrote:
> On Wednesday, June 19, 2013 03:56:43 finalpatch wrote:
>> in the declaration of RandomSample it only checks for
>> isInputRange, but in the implementation it calls the .save()
>> method which is only available in forward ranges.
>> 
>> as a result this code does not compile:
>> 
>> void main()
>> {
>> auto a = File("test.d", "r").byLine();
>> auto s = randomSample(a,1,5);
>> writeln(s);
>> }
>> 
>> Is this a bug or I missed something?
>
> Any function which uses save requires at least a forward range, and if its
> template constraint requires only an input range, then its template constraint
> is wrong. _Any_ function whose template constraint does not actually guarantee
> that anything that passes the constraint will compile with that function is
> buggy.
>
> - Jonathan M Davis

So you are saying this is a bug in std.random?

June 19, 2013
On Wednesday, 19 June 2013 at 03:46:11 UTC, Joseph Rushton Wakeling wrote:
> On 06/19/2013 02:56 AM, finalpatch wrote:
>> in the declaration of RandomSample it only checks for isInputRange, but in the
>> implementation it calls the .save() method which is only available in forward
>> ranges.
>
> What compiler and version are you using?  This was reported here:
> http://d.puremagic.com/issues/show_bug.cgi?id=10265
>
> ... and fixed in this pull request:
> https://github.com/D-Programming-Language/phobos/pull/1332

I was testing with DMD 2.063
June 19, 2013
On 06/19/2013 04:49 AM, finalpatch wrote:
> I was testing with DMD 2.063

Try the very latest, 2.063.2 ... ?  2.063 was released on 28 May, my fix got committed on 7 June.

June 19, 2013
On 06/19/2013 03:32 AM, Jonathan M Davis wrote:
> Any function which uses save requires at least a forward range, and if its template constraint requires only an input range, then its template constraint is wrong. _Any_ function whose template constraint does not actually guarantee that anything that passes the constraint will compile with that function is buggy.

It was a failure of testing.  Doing a bit of git archaeology, it looks like the save function was introduced first, then the isInputRange template constraints, but as the unittests all involved arrays as sample input, the clash with .save was never spotted.  Since most of the practical use cases for randomSample involve sampling either from an array or from a forward range like iota, it obviously didn't get spotted "in the wild" either.

The annoying thing is that when I first started using randomSample over a year ago, I _must_ have encountered this bug, because I wrote code trying to sample from file.byLine -- the trouble is, at the time, I didn't understand the range concept very well and so didn't realize the compilation error was the sign of a bug, I just rewrote my code to read the file into an array of strings and sample that.  (Kids, let this be a lesson -- always ask when your code won't compile and you don't understand the error message.)

It was just coincidence that last month I was discussing randomSample on D.learn and wanted to present an example of its flexibility, so revisited the idea of sampling from file.byLine and (this time) realized it shouldn't fail to compile.

Incidentally, sampling from .byLine carries a certain cost.  The algorithm used
in RandomSample is designed to sample n items from N in O(n) time and generating
O(n) random variates (i.e. scaling with sample size).  However, that relies on
the fact that one can .popFrontN(s) or .popFrontExactly(s) the input range in
O(1) time.  With .byLine it's O(s) and so the overall complexity of the
algorithm becomes O(N), that is, scaling with input size.
June 19, 2013
On Wednesday, June 19, 2013 05:48:13 finalpatch wrote:
> So you are saying this is a bug in std.random?

Yes.

- Jonathan M Davis
June 19, 2013
Joseph Rushton Wakeling <joseph.wakeling@webdrake.net> writes:

> On 06/19/2013 04:49 AM, finalpatch wrote:
>> I was testing with DMD 2.063
>
> Try the very latest, 2.063.2 ... ?  2.063 was released on 28 May, my fix got committed on 7 June.
>

2.063.2 has the same bug. But good to know it's fixed in git master.

-- 
finalpatch