March 08, 2013
> You expect that output to contain only 5 stars.

Better seen with this code:


import std.stdio, std.range, std.algorithm, std.traits, std.random;

auto distinct(Range)(Range r) if (isInputRange!Range) {
    bool[ForeachType!Range] mySet;

    return r.filter!((k) {
        if (k in mySet)
            return false;
        mySet[k] = true;
        return true;
    });
}

void main() {
    5.iota.map!((_) {
        auto x = uniform(0, 10);
        write("*");
        return x;
    }).distinct.array;
}


Bye,
bearophile
March 08, 2013
On Friday, 8 March 2013 at 22:36:35 UTC, Andrea Fontana wrote:
> On Friday, 8 March 2013 at 19:36:04 UTC, bearophile wrote:
>>> auto firstDistinct(Range)(Range r, in size_t n)
>>> if (isInputRange!Range) {
>>>   bool[ForeachType!Range] mySet;
>>>
>>>   return r.filter!((k) {
>>>       if (k in mySet)
>>>           return false;
>>>       mySet[k] = true;
>>>       return true;
>>>   }).take(n);
>>> }
>>
>> I think the standard library of Ada2012 has bounded containers, so you can use them to stack-allocate a hash set that can contain up to n items (plus some more free cells to keep its load factor low enough).
>>
>> In Rust there are fully safe stack-allocated closures. Combining the two things it's possible to create a function like that firstDistinct() with zero heap allocations, that is probably fast.
>>
>> Bye,
>> bearophile
>
> Something goes wrong by the way. I removed "n" template params and "take(n)" (i can do after distinct() call, isn't it the same?).
>
> Try this (sorted for better reading):
>
> iota(100).map!((x) => uniform(0,10))().distinct().array.sort.writeln;
> iota(100).map!((x) => uniform(0,10))().array.distinct().array.sort.writeln;
>
> output:
> [0, 0, 3, 3, 4, 4, 4, 6, 7, 8]
> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Here the answer:

auto r=iota(100).map!((x) => uniform(0,10))();
writeln(r.front," ", r.front, " ", r.front, " ", r.front);

But i think "front" was "cached", but it seems not...
	
March 08, 2013
On Friday, 8 March 2013 at 23:37:20 UTC, Andrea Fontana wrote:
> On Friday, 8 March 2013 at 22:36:35 UTC, Andrea Fontana wrote:
>> On Friday, 8 March 2013 at 19:36:04 UTC, bearophile wrote:
>>>> auto firstDistinct(Range)(Range r, in size_t n)
>>>> if (isInputRange!Range) {
>>>>  bool[ForeachType!Range] mySet;
>>>>
>>>>  return r.filter!((k) {
>>>>      if (k in mySet)
>>>>          return false;
>>>>      mySet[k] = true;
>>>>      return true;
>>>>  }).take(n);
>>>> }
>>>
>>> I think the standard library of Ada2012 has bounded containers, so you can use them to stack-allocate a hash set that can contain up to n items (plus some more free cells to keep its load factor low enough).
>>>
>>> In Rust there are fully safe stack-allocated closures. Combining the two things it's possible to create a function like that firstDistinct() with zero heap allocations, that is probably fast.
>>>
>>> Bye,
>>> bearophile
>>
>> Something goes wrong by the way. I removed "n" template params and "take(n)" (i can do after distinct() call, isn't it the same?).
>>
>> Try this (sorted for better reading):
>>
>> iota(100).map!((x) => uniform(0,10))().distinct().array.sort.writeln;
>> iota(100).map!((x) => uniform(0,10))().array.distinct().array.sort.writeln;
>>
>> output:
>> [0, 0, 3, 3, 4, 4, 4, 6, 7, 8]
>> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>
> Here the answer:
>
> auto r=iota(100).map!((x) => uniform(0,10))();
> writeln(r.front," ", r.front, " ", r.front, " ", r.front);
>
> But i think "front" was "cached", but it seems not...
> 	

This is my "struct" version that helped me to findout that problem.

auto distinct(Range)(Range rs) if (isInputRange!(Unqual!Range))
{
	return DistinctResult!Range(rs);
}

struct DistinctResult(Range) if (isInputRange!(Unqual!Range))
{
	alias Unqual!Range T;
	
	T range;
	bool[ElementType!T] justSeen;

	@property front() { return range.front; }
	@property empty() { return range.empty; }
	
	void popFront()
	{
		justSeen[range.front] = true;

		while(true)
		{
			range.popFront();
			if (range.empty || range.front !in justSeen) break;
		}
	}

}

March 08, 2013
Andrea Fontana:

> But i think "front" was "cached", but it seems not...	

The caching of map front was recently removed.


More test code:


import std.stdio, std.range, std.algorithm, std.traits, std.random;

void main() {
    5.iota.map!((_) {
        auto x = uniform(0, 10);
        write("*");
        return x;
    }).filter!(_ => true).array;
}


Prints:

**********

Bye,
bearophile
March 08, 2013
Simpler:


import std.stdio, std.range, std.algorithm;

void main() {
    iota(5).map!((x) {
        write("*");
        return x;
    }).filter!(_ => true).array;
}
March 08, 2013
Andrea Fontana:

> Here the answer:
>
> auto r=iota(100).map!((x) => uniform(0,10))();
> writeln(r.front," ", r.front, " ", r.front, " ", r.front);

I think that's not the answer. I think the answer is a bug in filter().

Bye,
bearophile
March 08, 2013
On Friday, 8 March 2013 at 23:51:48 UTC, bearophile wrote:
> Andrea Fontana:
>
>> Here the answer:
>>
>> auto r=iota(100).map!((x) => uniform(0,10))();
>> writeln(r.front," ", r.front, " ", r.front, " ", r.front);
>
> I think that's not the answer. I think the answer is a bug in filter().
>
> Bye,
> bearophile

My struct version never work if map doesn't cache. Every time I call (inside popFront() too!) front() gives different values.

If i do array.distinct() it works fine: an array is filled with random values and they will never changes again.
March 09, 2013
On Friday, 8 March 2013 at 23:44:40 UTC, bearophile wrote:
> Simpler:
>
>
> import std.stdio, std.range, std.algorithm;
>
> void main() {
>     iota(5).map!((x) {
>         write("*");
>         return x;
>     }).filter!(_ => true).array;
> }

Chech this:

5.iota.map!((_) {
		auto x = uniform(0, 1000);
			writeln(" MAP-1 = ", x);
			return x;
	}).filter!(_ => true).map!((x){ writeln(" MAP-2 = ", x); return
x; })().array.writeln;

Here some function calls front() internally (filter? map?) to
check some conditions and triggers random creation again.
March 09, 2013
This is a shortened version of the filter(), it calls pred only n times, but it calls _input.front n*2 times. If _input.front is not deterministic (that means it's not pure, as in the case of distinct filtering function) it gives wrong results:


struct FilterResult(alias pred, Range) {
    Range _input;

    this(Range r) {
        _input = r;
        while (!_input.empty && !pred(_input.front)) {
            _input.popFront();
        }
    }

    @property bool empty() { return _input.empty; }

    void popFront() {
        do {
            _input.popFront();
        } while (!_input.empty && !pred(_input.front));
    }

    @property auto ref front() {
        return _input.front;
    }
}


This isn't exactly a bug of filter(), it's a design decision of not caching _input.front.

Bye,
bearophile
March 09, 2013
On Saturday, 9 March 2013 at 01:00:26 UTC, bearophile wrote:
> This is a shortened version of the filter(), it calls pred only n times, but it calls _input.front n*2 times. If _input.front is not deterministic (that means it's not pure, as in the case of distinct filtering function) it gives wrong results:
>
>
> struct FilterResult(alias pred, Range) {
>     Range _input;
>
>     this(Range r) {
>         _input = r;
>         while (!_input.empty && !pred(_input.front)) {
>             _input.popFront();
>         }
>     }
>
>     @property bool empty() { return _input.empty; }
>
>     void popFront() {
>         do {
>             _input.popFront();
>         } while (!_input.empty && !pred(_input.front));
>     }
>
>     @property auto ref front() {
>         return _input.front;
>     }
> }
>
>
> This isn't exactly a bug of filter(), it's a design decision of not caching _input.front.
>
> Bye,
> bearophile

Not funny btw :) It's not so easy to find out. And it's not easy to write a "cached map" function if you use classes or some pointers as ElementType...