April 04, 2012
Questions about the slice operator
I understand the basic use to slice an array but what about these:

foreach(i;0..5)
    dostuff;

That works yet this does not:

foreach(i;parallel(0..5))
    dostuff;

Why not let this work? It'd seem like a natural way of writing a 
parallel loop. For some reason:

foreach(i;[0,1,2,3,4])
    dostuff;

This runs far more slowly than the first example, and only 
matches its speed when parallelized with a function that takes 
~150ms per iteration.

What kind of data is it and how is it behaving? If it can do what 
it does in the first example why not let it do something like 
this:

int[] arr = 0..5; //arr = [0,1,2,3,4]

One final thought- why is the array function required to convert 
a lazy Result to an eager one? If you explicitly set something 
like int[] blah = lazything why not have it silently convert 
itself?
April 04, 2012
Re: Questions about the slice operator
ixid:

> I understand the basic use to slice an array but what about 
> these:
>
> foreach(i;0..5)
>     dostuff;
>
> That works yet this does not:
>
> foreach(i;parallel(0..5))
>     dostuff;
>
> Why not let this work? It'd seem like a natural way of writing 
> a parallel loop.

The design of the D language is a bit of a patchwork; it's not very 
coherent. So the ".." notation defines an iterable interval only 
in a foreach (".." is used for switch cases too, but there it 
includes the closing item).
Generally a "patchwork design" has some clear disadvantages, but 
it often has some less visible advantages too.
Along with other people I have suggested a few times that a..b 
denote a first-class lazy range in D, but Walter was not interested, 
I guess. I'd like this, but using iota(5) is not terrible (but keep 
in mind that iterating on an empty interval gives a different 
outcome from iterating on an empty iota. I have an open bug report 
on this).
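
For the parallel loop from the question, a minimal sketch using iota in place of the ".." syntax could look like the following. The function name squaresInParallel is made up for illustration; parallel and iota are the std.parallelism and std.range functions.

```d
import std.parallelism : parallel;
import std.range : iota;

// Hypothetical helper: fills results[i] with i * i, one iteration per task.
int[] squaresInParallel(int n)
{
    auto results = new int[n];
    // iota(n) is a library range yielding 0, 1, ..., n - 1,
    // so parallel() accepts it where it would reject 0 .. n.
    foreach (i; parallel(iota(n)))
        results[i] = i * i; // each iteration writes its own slot
    return results;
}

void main()
{
    assert(squaresInParallel(5) == [0, 1, 4, 9, 16]);
}
```

Each iteration writes only to its own element, so no locking is needed in this sketch.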


> For some reason:
>
> foreach(i;[0,1,2,3,4])
>     dostuff;
>
> This performs far more slowly than the first example

I don't know why, but maybe the cause is that an array literal 
like that induces a heap allocation. This doesn't happen with the 
lazy 0..5 syntax.


> If it can do what it does in the first example why not let it 
> do something like this:
>
> int[] arr = 0..5; //arr = [0,1,2,3,4]

Because a..b is not a first-class interval, and because laziness 
and ranges were introduced quite late in D rather than at the 
beginning of its design, so lazy constructs are mostly 
library-defined and they don't act like built-ins.


> One final thought- why is the array function required to 
> convert a lazy Result to an eager one? If you explicitly set 
> something like int[] blah = lazything why not have it silently 
> convert itself?

Besides the answers I've already given, generally because implicit 
conversions are often a bad thing. Requiring some syntax to 
denote the lazy->eager conversion is positive, I think. I don't 
know of a language that performs such a conversion implicitly.

Bye,
bearophile
April 04, 2012
Re: Questions about the slice operator
Thank you, very informative as always. =)
April 04, 2012
Re: Questions about the slice operator
On Wednesday, April 04, 2012 03:29:03 ixid wrote:
> I understand the basic use to slice an array but what about these:
> 
> foreach(i;0..5)
>      dostuff;
> 
> That works yet this does not:
> 
> foreach(i;parallel(0..5))
>      dostuff;
>
> Why not let this work? It'd seem like a natural way of writing a
> parallel loop. For some reason:
> 
> foreach(i;[0,1,2,3,4])
>      dostuff;

> This performs far more slowly than the first example and only as
> fast as it when parallelized with a ~150ms function for each
> iteration.

And what would it mean in the case of parallel(0 ..5)? Notice that

foreach(i; 0 .. 5)

and

foreach(i; [0, 1, 2, 3, 4])

mean _completely_ different things. The first one doesn't involve arrays at all. 
It gets lowered to something like

for(int i = 0; i < 5; ++i)

.. is _never_ used for generating an array. It's only ever used for indicating 
a range of values. If you want to generate a range, then use std.range.iota. 
.. wouldn't make sense in the contexts that you're describing. It would have 
to generate something. And if 0 .. 5 generated [0, 1, 2, 3, 4] in the general 
case, then

foreach(i; ident(0 .. 5))

would be just as inefficient as

foreach(i; ident([0, 1, 2, 3, 4]))

even excluding the cost of ident (which presumably just returns the array).

foreach(i; 0 .. 5)

is more efficient only because it has _nothing_ to do with arrays. Generalizing 
the syntax wouldn't help at all, and if it were generalized, it would arguably 
have to be consistent in all of its uses, in which case

foreach(i; 0 .. 5)

would become identical to

foreach(i; [0, 1, 2, 3, 4])

and therefore less efficient. Generalizing .. just doesn't make sense.
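
To make the comparison concrete, here is a small sketch of the four loop forms discussed in this thread. The function names are made up for illustration. All four compute the same sum, but only the last one builds an array.

```d
import std.range : iota;

// foreach over an interval: no array, no range object.
int sumInterval() { int s; foreach (i; 0 .. 5) s += i; return s; }

// Roughly what the interval form is lowered to.
int sumForLoop() { int s; for (int i = 0; i < 5; ++i) s += i; return s; }

// Library range: lazy, usable wherever a range is expected.
int sumIota() { int s; foreach (i; iota(0, 5)) s += i; return s; }

// Array literal: the elements have to exist in memory first.
int sumArray() { int s; foreach (i; [0, 1, 2, 3, 4]) s += i; return s; }

void main()
{
    assert(sumInterval() == 10);
    assert(sumForLoop() == 10);
    assert(sumIota() == 10);
    assert(sumArray() == 10);
}
```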

> One final thought- why is the array function required to convert
> a lazy Result to an eager one? If you explicitly set something
> like int[] blah = lazything why not have it silently convert
> itself?

That would be an incredibly bad idea. Converting from a lazy range to an array 
is expensive. You have to process the entire range and allocate memory for the 
array that you're stuffing it in. Sometimes, you need to do that, but you 
certainly don't want it to happen by accident. If such conversions were 
implicit, you'd get hidden performance hits all over the place if you weren't 
really careful. And in general, D isn't big on implicit conversions anyway. 
They're useful in some cases, but they often cause bugs. So, D allows a lot 
fewer implicit conversions than C++ does, and ranges follow that pattern.
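
As a minimal sketch of the explicit conversion (the function name squares is made up for illustration), std.array.array is the point where the work and the allocation happen:

```d
import std.algorithm : map;
import std.array : array;
import std.range : iota;

// Hypothetical helper: returns the squares of 0 .. n as an array.
int[] squares(int n)
{
    // Lazy: nothing is computed or allocated on this line.
    auto lazySquares = iota(n).map!(x => x * x);

    // int[] eager = lazySquares; // does not compile: no implicit conversion

    // Explicit: walks the whole range and allocates the result.
    return lazySquares.array;
}

void main()
{
    assert(squares(5) == [0, 1, 4, 9, 16]);
}
```

The cost is visible at the call to array rather than hidden in an assignment.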

- Jonathan M Davis
April 04, 2012
Re: Questions about the slice operator
"And what would it mean in the case of parallel(0 ..5)?"

Wouldn't it be a more elegant way of doing pretty much the same 
thing as parallel(iota(0,5))? Iterating over a range and carrying 
out your parallel task with that value.
April 04, 2012
Re: Questions about the slice operator
On Wednesday, April 04, 2012 04:45:43 ixid wrote:
> "And what would it mean in the case of parallel(0 ..5)?"
> 
> Wouldn't it be a more elegant way of doing pretty much the same
> thing as parallel(iota(0,5))? Iterating over a range and carrying
> out your parallel task with that value.

1. ".." would then be doing something very different than it does in all other 
cases.

2. That's moving something into the language which works perfectly well in the 
library, and moving it into the language doesn't really buy us anything.

3. The trend is to move stuff _out_ of the language and into libraries rather 
than into the language. The overall take on it at this point (especially from 
Andrei) is that if it _can_ be done in a library, then it _should_ be done in 
the library. The language is already very powerful and is arguably overly 
complex already. So, the question at this point is very much why it should be 
in the language when it works in the library and _not_ why it's in the library 
when it could be in the language.

I can understand why you'd like to use ".." in more cases than is currently 
allowed, but given the current semantics of "..", it really wouldn't make 
sense to use it in the sort of cases that you'd like to. Even if they're 
conceptually similar, they're semantically _very_ different from the current 
use cases for "..". So, using ".." in place of iota really wouldn't be making 
the language more consistent, even if it might seem so at first glance.

- Jonathan M Davis
April 04, 2012
Re: Questions about the slice operator
Thank you, very interesting to understand a little more about 
what goes on underneath with conceptual vs semantic differences.
April 04, 2012
Re: Questions about the slice operator
On 2012-04-04 04:11, Jonathan M Davis wrote:

> foreach(i; 0 .. 5)
>
> is more efficient only because it has _nothing_ to do with arrays. Generalizing
> the syntax wouldn't help at all, and if it were generalized, it would arguably
> have to be consistent in all of its uses, in which case
>
> foreach(i; 0 .. 5)
>
> would become identical to
>
> foreach(i; [0, 1, 2, 3, 4])
>
> and therefore less efficient. Generalizing .. just doesn't make sense.

Why couldn't the .. syntax be syntactic sugar for some kind of 
library-implemented range type, just as is done with associative arrays?

We could implement a new library type, named "range". Looking something 
like this:

struct range
{
    size_t start;
    size_t end;
    // implement the range interface or opApply
}

range r = 1 .. 5;

The above line would be syntax sugar for:

range r = range(1, 5);

void foo (range r)
{
    foreach (e ; r) {}
}

foo(r);

This could then be taken advantage of in other parts of the language:

class A
{
    int opSlice (range r); // new syntax
    int opSlice (size_t start, size_t end); // old syntax
}

I think this would be completely backwards compatible as well.
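
The library half of this already works without any new syntax. Here is a rough sketch, where Interval is only a stand-in name for the proposed "range" type and sum is a made-up helper:

```d
// Hypothetical stand-in for the proposed "range" struct.
struct Interval
{
    size_t start;
    size_t end;

    // Input range primitives, so foreach accepts the struct directly.
    bool empty() const { return start >= end; }
    size_t front() const { return start; }
    void popFront() { ++start; }
}

// Hypothetical helper: takes the interval as an ordinary value.
size_t sum(Interval r)
{
    size_t total;
    foreach (e; r) // iterates over a copy; the caller's value is untouched
        total += e;
    return total;
}

void main()
{
    auto r = Interval(1, 5); // what "1 .. 5" would be sugar for
    assert(sum(r) == 10);    // 1 + 2 + 3 + 4
    assert(r.start == 1);    // r itself was not consumed
}
```

Only the "1 .. 5" spelling would need compiler support.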

-- 
/Jacob Carlborg
April 04, 2012
Re: Questions about the slice operator
On Wed, 04 Apr 2012 12:06:33 +0200, Jacob Carlborg <doob@me.com> wrote:

> On 2012-04-04 04:11, Jonathan M Davis wrote:
>
>> foreach(i; 0 .. 5)
>>
>> is more efficient only because it has _nothing_ to do with arrays.  
>> Generalizing
>> the syntax wouldn't help at all, and if it were generalized, it would  
>> arguably
>> have to be consistent in all of its uses, in which case
>>
>> foreach(i; 0 .. 5)
>>
>> would become identical to
>>
>> foreach(i; [0, 1, 2, 3, 4])
>>
>> and therefore less efficient. Generalizing .. just doesn't make sense.
>
> Why couldn't the .. syntax be syntax sugar for some kind of library  
> implement range type, just as what is done with associative arrays.
>
> We could implement a new library type, named "range". Looking something  
> like this:
>
> struct range
> {
>      size_t start;
>      size_t end;
>      // implement the range interface or opApply
> }
>
> range r = 1 .. 5;
>
> The above line would be syntax sugar for:
>
> range r = range(1, 5);
>
> void foo (range r)
> {
>      foreach (e ; r) {}
> }
>
> foo(r);
>
> This could then be taken advantage of in other parts of the language:
>
> class A
> {
>      int opSlice (range r); // new syntax
>      int opSlice (size_t start, size_t end); // old syntax
> }
>
> I think this would be completely backwards compatible as well.
>

And what do we do with 3..$?
April 04, 2012
Re: Questions about the slice operator
On 2012-04-04 14:16, Simen Kjærås wrote:

> And what do we do with 3..$?

Hmm, that's a good point. The best I can think of for now is to 
translate that to:

range(3, size_t.max)

Or something like:

struct range
{
    size_t start;
    size_t end;
    bool dollar; // better name is needed
}

range(3, 0, true)
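
A rough sketch of how the flag variant could be consumed. Interval and slice are made-up names, and this is only one possible reading of the idea: the receiver resolves "$" to its own length at the point of use.

```d
// Hypothetical version of the proposed struct: dollar means "end is $".
struct Interval
{
    size_t start;
    size_t end;
    bool dollar;
}

// Hypothetical helper: applies an Interval to an array,
// replacing the flagged end with the array's length.
int[] slice(int[] data, Interval r)
{
    immutable end = r.dollar ? data.length : r.end;
    return data[r.start .. end];
}

void main()
{
    auto data = [10, 20, 30, 40, 50];
    assert(slice(data, Interval(3, 0, true)) == [40, 50]); // like data[3 .. $]
    assert(slice(data, Interval(1, 3)) == [20, 30]);       // like data[1 .. 3]
}
```

The size_t.max variant would work the same way, with the comparison against size_t.max taking the place of the flag.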

-- 
/Jacob Carlborg