July 10, 2012
On 7/10/12 11:17 AM, Jonathan M Davis wrote:
> On Tuesday, July 10, 2012 10:21:23 Andrei Alexandrescu wrote:
>> On 7/10/12 5:35 AM, Jacob Carlborg wrote:
>>> On 2012-07-10 08:59, Dmitry Olshansky wrote:
>>>> Can you do it in other languages?
>>>
>>> Sure, in Ruby, but that only works on arrays:
>>>
>>> p [5, 3, 5, 6, 8].uniq.map{ |e| e.to_s }.sort
>>
>> This is very inefficient.
>>
>> I agree that if efficiency wasn't a concern for std.algorithm, its API
>> would have been different. As things are, I think std.algorithm strikes
>> a very good balance between efficiency and usability.
>
> The other thing that affects a lot is infinite ranges. The functions with lazy
> results work wonderfully with infinite ranges but would generally result in
> infinite loops if used with functions with eager results.
>
> auto vals = map!"a % 10"(rndGen());
>
> works great, but you'd be forced to use take directly on rndGen pretty much
> all the time you wanted to do something like that if functions like map were
> lazy.
>
> But I suspect that the sort of people who will be complaining about map not
> returning an array are also the sort of people who won't be all that familar
> with operating on infinite lists and at least initially probably won't care.
>
> - Jonathan M Davis

It's also about the unnecessary work done eagerly on finite but long inputs.

Andrei
July 10, 2012
Jacob Carlborg , dans le message (digitalmars.D:171690), a écrit :
>> int[] arr = [ 1, 2, 2, 2, 2, 3, 4, 4, 4, 5 ];
>> assert(equal(uniq(arr), [ 1, 2, 3, 4, 5 ][]));
> 
> How should I know that from the example?


Maybe there should be an example with an unsorted range, and a better explanation:

| auto uniq(...)
|   Iterates unique consecutive elements of a given range (...)
|   Note that equivalent elements are kept if they are not consecutive.
|
| Example:
|   int[] arr = [ 1, 2, 2, 3, 4, 4, 4, 2, 4, 4];
|   assert(equal(uniq(arr), [ 1, 2, 3, 4, 2, 4][]));
July 10, 2012
On 7/10/12 11:11 AM, Christophe Travert wrote:
> If you do not want the heap allocation of the array, you can create a
> one-element range to feed to chain (maybe such a thing could be placed
> in phobos, next to takeOne).
>
> struct OneElementRange(E)
> {
>    E elem;
>    bool passed;
>    @property ref E front() { return elem; }
>    void popFront() { passed = true; }
>    @property bool empty() { return passed; }
>    @property size_t length() { return 1-passed; }
>    //...
> }

Yah, probably we should add something like this:

auto singletonRange(E)(E value)
{
    return repeat(value).takeExactly(1);
}

I don't think it would be considerably less efficient than a handwritten specialization. But then I've been wrong before in assessing efficiency.


Andrei
July 10, 2012
On Tuesday, July 10, 2012 15:21:14 Christophe Travert wrote:
> "Simen Kjaeraas" , dans le message (digitalmars.D:171678), a écrit :
> >> Well, I haven't been able to use a single function from std.algorithm without adding a lot of calls to "array" or "to!(string)". I think the things I'm trying to do seems trivial and quite common. I'm I overrating std.algorithm or does it not fit my needs?
> > 
> > bearophile (who else? :p) has suggested the addition of eager and in-place versions of some ranges, and I think he has a very good point.
> 
> That would have been useful before UFSC.
> Now, writing .array() at the end of an algorithm call is not a pain.
> 
> int[] = [1, 2, 2, 3].uniq().map!toString().array();

It's not like

auto result = array(map!"to!string(a)"(uniq([1, 2, 2, 3])));

is a pain either. It's just ordered differently.

But regardless, it's easy to get an eager result by calling array on a lazy range, so there's no point in adding eager versions. They'd just be duplicating code for no real benefit. Not to mention, if anyone having to call array on ranges all of the time, you should probably rethink how they're using ranges. It's definitely necessary some of the time, but most of the time, you can just operate on the lazy ranges and save yourself unnecessary allocations.

- Jonathan M Davis
July 10, 2012
"Andrei Alexandrescu" <SeeWebsiteForEmail@erdani.org> wrote in message news:jthiba$2cg0$1@digitalmars.com...
> On 7/10/12 11:11 AM, Christophe Travert wrote:
>> If you do not want the heap allocation of the array, you can create a one-element range to feed to chain (maybe such a thing could be placed in phobos, next to takeOne).
>>
>> struct OneElementRange(E)
>> {
>>    E elem;
>>    bool passed;
>>    @property ref E front() { return elem; }
>>    void popFront() { passed = true; }
>>    @property bool empty() { return passed; }
>>    @property size_t length() { return 1-passed; }
>>    //...
>> }
>
> Yah, probably we should add something like this:
>
> auto singletonRange(E)(E value)
> {
>     return repeat(value).takeExactly(1);
> }
>
> I don't think it would be considerably less efficient than a handwritten specialization. But then I've been wrong before in assessing efficiency.
>
>
> Andrei

Could it be extended to accept multiple values? (sort of like chain)
eg.
foreach(x; makeRange(23, 7, 1990)) // NO allocations!
{
    ....
}
I would use this in a lot of places I currently jump through hoops to get a
static array without allocating.


July 10, 2012
Andrei Alexandrescu , dans le message (digitalmars.D:171717), a écrit :
> On 7/10/12 11:11 AM, Christophe Travert wrote:
>> If you do not want the heap allocation of the array, you can create a one-element range to feed to chain (maybe such a thing could be placed in phobos, next to takeOne).
>>
>> struct OneElementRange(E)
>> {
>>    E elem;
>>    bool passed;
>>    @property ref E front() { return elem; }
>>    void popFront() { passed = true; }
>>    @property bool empty() { return passed; }
>>    @property size_t length() { return 1-passed; }
>>    //...
>> }
> 
> Yah, probably we should add something like this:
> 
> auto singletonRange(E)(E value)
> {
>      return repeat(value).takeExactly(1);
> }

It would be much better to use:

auto singletonRange(E)(E value)
{
     return repeat(value).takeOne;
}

as well as:

auto emptyRange(E)(E value)
{
  return repeat(value).takeNone;
}

to have the advantages of takeOne and takeNone over takeExactly.

> I don't think it would be considerably less efficient than a handwritten specialization. But then I've been wrong before in assessing efficiency.

Error message displaying the type of singletonRange(E) will be weird, but that's far from being the first place where it will be. Simplicity and maintainance of phobos seems more important to me. At least until these algorithm get stable, meaning open bug reports on algorithm and range are solved, and new bugs appears rarely. Optimisers should have no trouble inlining calls to Repeat's methods...


July 10, 2012
On 7/10/12 12:01 PM, Christophe Travert wrote:
> Andrei Alexandrescu , dans le message (digitalmars.D:171717), a écrit :
>> auto singletonRange(E)(E value)
>> {
>>       return repeat(value).takeExactly(1);
>> }
>
> It would be much better to use:
>
> auto singletonRange(E)(E value)
> {
>       return repeat(value).takeOne;
> }
>
> as well as:
>
> auto emptyRange(E)(E value)
> {
>    return repeat(value).takeNone;
> }
>
> to have the advantages of takeOne and takeNone over takeExactly.

Ah, such as them being random access. Cool!

That also seems to answer Jonathan's quest about defining emptyRange. Just use takeNone(R.init).


Andrei
July 10, 2012
On 2012-07-10 16:49, Dmitry Olshansky wrote:

> I'm not sure how map can fit there. If anything you need a foreach (and
> you have it;) ) to build a _string_. I'd rather output ',' right there
> inside of loop. Thus avoiding costly join and appending to a new buffer
> on each iteration.

Sure if remove the call to "join" and build a single string from the beginning, then a foreach would be the right approach.

But if you look at the code as it is now the foreach-loop maps an array of "Parameter" to an array of "string".

> Speed and generality. Think removing temporary arrays. And while a lot
> of folks won't every use things other then arrays power users sure as
> hell would.

I have only been able to remove very few temporary arrays.

-- 
/Jacob Carlborg


July 10, 2012
On 2012-07-10 17:11, Christophe Travert wrote:

> What is wrong with foo.chain(["bar"])?

I think it conceptually wrong for what I want to do. I don't know if I misunderstood ranges completely but I'm seeing them as an abstraction over a collection. With most mutable collection you can add/append an element.

> you might try this (untested)
>
>
> string function(Parameter) stringify = (x)
> {
>   return (x.isConst? "const("~x.type~")": x.type)
>      ~ (x.name.any?" "~translateIdentifier(x.name):"");
> }
>
> auto params = parameters
>    .map!stringify()
>    .chain(variadic? []: ["..."])
>    .joiner(", ");
>
> context ~= params;
>
> I am not sure this will be more efficient. joiner may be slowed down by
> the fact that it is called with a chain result, which is slower on
> front. But at leat you save yourself the heap-allocation of the params
> array*.
>
> I would use:
> context ~= parameters.map!stringify().joiner(",  ");
> if (variadic) context ~= ", ...";
>
> To make the best implementation would require to know how the String
> context works.
>
> *Note that here, stringify is not lazy, and thus allocates. It
> could be a chain or a joiner, but I'm not sure the result would really
> be more efficient.

String is a wrapper around str.array.Appender.

-- 
/Jacob Carlborg


July 10, 2012
On 2012-07-10 17:30, Andrei Alexandrescu wrote:

> It would be possible to chain a single element to the end of a range.
> One related thing to do is to define singletonRange(x) that returns a
> range with exactly one element, namely x.

Ok, I saw that suggestion in another post.

> Ranges and std.algorithm obey simple, mathematical realities deriving
> from algorithm and storage topology constraints. For example sort works
> in place so it generally can't work on mapped stuff because there's no
> place to put the sorted contents. Also sort needs a random-access range
> to work with so it can't work e.g. with the result of filter, which
> necessarily yields a non-random-access range. And so on.

Can't "map" and "filter" return a random-access range if that's what they receive?

> Now I understand if you come from a place where there's no concern over
> hidden allocations and extra work for the benefit of convenience, you
> may find std.algorithm less easy to work with. But drawing the
> conclusion that std.algorithm is badly designed or gratuitously
> difficult to use would be a mistake. I opine I can recognize a good vs.
> bad design even when it's mine, and in my opinion std.algorithm is a
> good design and that most of your opposing impressions derive from a
> misunderstanding of its charter.

I don't think I've struggled as much with any other API I've used. In many cases I had to resort to foreach-loops because that was more convenient than using std.algorithm.

-- 
/Jacob Carlborg