Questions about the slice operator
April 04, 2012
I understand the basic use to slice an array but what about these:

foreach(i;0..5)
    dostuff;

That works yet this does not:

foreach(i;parallel(0..5))
    dostuff;

Why not let this work? It'd seem like a natural way of writing a parallel loop. For some reason:

foreach(i;[0,1,2,3,4])
    dostuff;

This performs far more slowly than the first example, and only matches its speed when parallelized with a ~150ms function for each iteration.

What kind of data is it and how is it behaving? If it can do what it does in the first example why not let it do something like this:

int[] arr = 0..5; //arr = [0,1,2,3,4]

One final thought- why is the array function required to convert a lazy Result to an eager one? If you explicitly set something like int[] blah = lazything why not have it silently convert itself?
April 04, 2012
ixid:

> I understand the basic use to slice an array but what about these:
>
> foreach(i;0..5)
>     dostuff;
>
> That works yet this does not:
>
> foreach(i;parallel(0..5))
>     dostuff;
>
> Why not let this work? It'd seem like a natural way of writing a parallel loop.

The design of the D language is a bit of a patchwork; it's not very coherent. So the ".." notation defines an iterable interval only in a foreach (".." is used for switch case ranges too, but there it includes the closing item).
Generally a "patchwork design" has some clear disadvantages, but it often has some less visible advantages too.
Along with other people, I have suggested a few times that a..b denote a first-class lazy range in D, but Walter was not interested, I guess. I'd like this, but using iota(5) is not terrible (keep in mind, though, that iterating on an empty interval gives a different outcome from iterating on an empty iota; I have an open bug report on this).
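
For reference, a minimal sketch of the iota spelling of the parallel loop, assuming std.range.iota and std.parallelism.parallel from Phobos. The helper name squaresInParallel is purely illustrative and not something from this thread:

```d
import std.parallelism : parallel;
import std.range : iota;

// Hypothetical helper: fills result[i] with i * i using a parallel foreach.
int[] squaresInParallel(int n)
{
    auto result = new int[n];
    // iota(n) is a lazy library range, which parallel() accepts.
    // parallel(0 .. n) is rejected because 0 .. n is only valid
    // directly inside a foreach (or a slice expression).
    foreach (i; parallel(iota(n)))
    {
        result[i] = i * i; // each iteration writes only its own slot
    }
    return result;
}

void main()
{
    assert(squaresInParallel(5) == [0, 1, 4, 9, 16]);
}
```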


> For some reason:
>
> foreach(i;[0,1,2,3,4])
>     dostuff;
>
> This performs far more slowly than the first example

I don't know why, but maybe the cause is that an array literal like that induces a heap allocation. This doesn't happen with the lazy 0..5 syntax.


> If it can do what it does in the first example why not let it do something like this:
>
> int[] arr = 0..5; //arr = [0,1,2,3,4]

Because a..b is not a first-class interval. Laziness and ranges were introduced quite late in D rather than being part of its design from the beginning, so lazy constructs are mostly library-defined and don't act like built-ins.


> One final thought- why is the array function required to convert a lazy Result to an eager one? If you explicitly set something like int[] blah = lazything why not have it silently convert itself?

Besides the answers I've already given, generally because implicit conversions are often a bad thing. Requiring some syntax to denote the lazy->eager conversion is a positive, I think. I don't know of a language that performs such a conversion implicitly.

Bye,
bearophile
April 04, 2012
Thank you, very informative as always. =)

April 04, 2012
On Wednesday, April 04, 2012 03:29:03 ixid wrote:
> I understand the basic use to slice an array but what about these:
> 
> foreach(i;0..5)
>      dostuff;
> 
> That works yet this does not:
> 
> foreach(i;parallel(0..5))
>      dostuff;
>
> Why not let this work? It'd seem like a natural way of writing a parallel loop. For some reason:
> 
> foreach(i;[0,1,2,3,4])
>      dostuff;
>
> This performs far more slowly than the first example and only as fast as it when parallelized with a ~150ms function for each iteration.

And what would it mean in the case of parallel(0 ..5)? Notice that

foreach(i; 0 .. 5)

and

foreach(i; [0, 1, 2, 3, 4])

mean _completely_ different things. The first one doesn't involve arrays at all. It gets lowered to something like

for(int i = 0; i < 5; ++i)

.. is _never_ used for generating an array. It's only ever used for indicating a range of values. If you want to generate a range, then use std.range.iota. .. wouldn't make sense in the contexts that you're describing. It would have to generate something. And if 0 .. 5 generated [0, 1, 2, 3, 4] in the general case, then

foreach(i; ident(0 .. 5))

would be just as inefficient as

foreach(i; [0, 1, 2, 3, 4])

even excluding the cost of ident (which presumably just returns the array).

foreach(i; 0 .. 5)

is more efficient only because it has _nothing_ to do with arrays. Generalizing the syntax wouldn't help at all, and if it were generalized, it would arguably have to be consistent in all of its uses, in which case

foreach(i; 0 .. 5)

would become identical to

foreach(i; [0, 1, 2, 3, 4])

and therefore less efficient. Generalizing .. just doesn't make sense.
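
The three spellings discussed above can be put side by side in a rough sketch. The function names are made up for illustration, and the for loop is only an approximation of what the compiler generates, not its exact lowering:

```d
import std.range : iota;

// Built-in foreach range statement: no array or range object is created.
int sumForeach(int n)
{
    int sum = 0;
    foreach (i; 0 .. n)
        sum += i;
    return sum;
}

// Roughly what the loop above is lowered to.
int sumFor(int n)
{
    int sum = 0;
    for (int i = 0; i < n; ++i)
        sum += i;
    return sum;
}

// Library alternative: iota builds a lazy range value that can be
// stored in a variable or passed to other functions.
int sumIota(int n)
{
    int sum = 0;
    foreach (i; iota(0, n))
        sum += i;
    return sum;
}

void main()
{
    assert(sumForeach(5) == 10);
    assert(sumFor(5) == 10);
    assert(sumIota(5) == 10);
}
```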

> One final thought- why is the array function required to convert a lazy Result to an eager one? If you explicitly set something like int[] blah = lazything why not have it silently convert itself?

That would be an incredibly bad idea. Converting from a lazy range to an array is expensive. You have to process the entire range and allocate memory for the array that you're stuffing it in. Sometimes, you need to do that, but you certainly don't want it to happen by accident. If such conversions were implicit, you'd get hidden performance hits all over the place if you weren't really careful. And in general, D isn't big on implicit conversions anyway. They're useful in some cases, but they often cause bugs. So, D allows a lot fewer implicit conversions than C++ does, and ranges follow that pattern.
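
A small sketch of what the explicit conversion looks like in practice, assuming std.array.array, std.algorithm.map and std.range.iota from Phobos; firstSquares is a hypothetical name used only for this example:

```d
import std.algorithm : map;
import std.array : array;
import std.range : iota;

// Hypothetical helper: returns the first n squares as an eager array.
int[] firstSquares(int n)
{
    // Lazy: nothing has been computed or allocated at this point.
    auto lazySquares = iota(n).map!(i => i * i);

    // int[] eager = lazySquares; // does not compile: no implicit conversion

    // Eager: array() walks the whole range and allocates the result,
    // so the cost is visible at the call site.
    return lazySquares.array;
}

void main()
{
    assert(firstSquares(5) == [0, 1, 4, 9, 16]);
}
```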

- Jonathan M Davis
April 04, 2012
"And what would it mean in the case of parallel(0 ..5)?"

Wouldn't it be a more elegant way of doing pretty much the same thing as parallel(iota(0,5))? Iterating over a range and carrying out your parallel task with that value.
April 04, 2012
On Wednesday, April 04, 2012 04:45:43 ixid wrote:
> "And what would it mean in the case of parallel(0 ..5)?"
> 
> Wouldn't it be a more elegant way of doing pretty much the same
> thing as parallel(iota(0,5))? Iterating over a range and carrying
> out your parallel task with that value.

1. ".." would then be doing something very different than it does in all other cases.

2. That's moving something into the language which works perfectly well in the library, and moving it into the language doesn't really buy us anything.

3. The trend is to move stuff _out_ of the language and into libraries rather than into the language. The overall take on it at this point (especially from Andrei) is that if it _can_ be done in a library, then it _should_ be done in the library. The language is already very powerful and is arguably overly complex already. So, the question at this point is very much why it should be in the language when it works in the library and _not_ why it's in the library when it could be in the language.

I can understand why you'd like to use ".." in more cases than is currently allowed, but given the current semantics of "..", it really wouldn't make sense to use it in the sort of cases that you'd like to. Even if they're conceptually similar, they're semantically _very_ different from the current use cases for "..". So, using ".." in place of iota really wouldn't be making the language more consistent, even if it might seem so at first glance.

- Jonathan M Davis
April 04, 2012
Thank you, very interesting to understand a little more about what goes on underneath with conceptual vs semantic differences.
April 04, 2012
On 2012-04-04 04:11, Jonathan M Davis wrote:

> foreach(i; 0 .. 5)
>
> is more efficient only because it has _nothing_ to do with arrays. Generalizing
> the syntax wouldn't help at all, and if it were generalized, it would arguably
> have to be consistent in all of its uses, in which case
>
> foreach(i; 0 .. 5)
>
> would become identical to
>
> foreach(i; [0, 1, 2, 3, 4])
>
> and therefore less efficient. Generalizing .. just doesn't make sense.

Why couldn't the .. syntax be syntax sugar for some kind of library-implemented range type, just as is done with associative arrays?

We could implement a new library type named "range", looking something like this:

struct range
{
    size_t start;
    size_t end;
    // implement the range interface or opApply
}

range r = 1 .. 5;

The above line would be syntax sugar for:

range r = range(1, 5);

void foo (range r)
{
    foreach (e ; r) {}
}

foo(r);

This could then be taken advantage of in other parts of the language:

class A
{
    int opSlice (range r); // new syntax
    int opSlice (size_t start, size_t end); // old syntax
}

I think this would be completely backwards compatible as well.
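
As a sketch of how much of this already works as plain library code without the proposed sugar: the names Interval, total, Numbers and slice below are hypothetical stand-ins (Interval for the "range" struct above, slice for the proposed opSlice overload), and only the input range primitives empty/front/popFront are assumed from the language:

```d
// Hypothetical library type standing in for the proposed "range" struct.
struct Interval
{
    size_t start;
    size_t end;

    // Input range interface, so foreach can iterate over an Interval.
    @property bool empty() const { return start >= end; }
    @property size_t front() const { return start; }
    void popFront() { ++start; }
}

// An Interval can be passed around like any other value.
size_t total(Interval r)
{
    size_t sum = 0;
    foreach (e; r)
        sum += e;
    return sum;
}

struct Numbers
{
    int[] data;

    // Existing syntax: numbers[a .. b]
    int[] opSlice(size_t start, size_t end) { return data[start .. end]; }

    // Stand-in for the proposed overload taking the interval as one value.
    int[] slice(Interval r) { return data[r.start .. r.end]; }
}

void main()
{
    assert(total(Interval(1, 5)) == 10); // 1 + 2 + 3 + 4
    auto n = Numbers([10, 20, 30, 40, 50]);
    assert(n[1 .. 3] == [20, 30]);
    assert(n.slice(Interval(1, 3)) == [20, 30]);
}
```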

-- 
/Jacob Carlborg
April 04, 2012
On Wed, 04 Apr 2012 12:06:33 +0200, Jacob Carlborg <doob@me.com> wrote:

> On 2012-04-04 04:11, Jonathan M Davis wrote:
>
>> foreach(i; 0 .. 5)
>>
>> is more efficient only because it has _nothing_ to do with arrays. Generalizing
>> the syntax wouldn't help at all, and if it were generalized, it would arguably
>> have to be consistent in all of its uses, in which case
>>
>> foreach(i; 0 .. 5)
>>
>> would become identical to
>>
>> foreach(i; [0, 1, 2, 3, 4])
>>
>> and therefore less efficient. Generalizing .. just doesn't make sense.
>
> Why couldn't the .. syntax be syntax sugar for some kind of library-implemented range type, just as is done with associative arrays?
>
> We could implement a new library type named "range", looking something like this:
>
> struct range
> {
>      size_t start;
>      size_t end;
>      // implement the range interface or opApply
> }
>
> range r = 1 .. 5;
>
> The above line would be syntax sugar for:
>
> range r = range(1, 5);
>
> void foo (range r)
> {
>      foreach (e ; r) {}
> }
>
> foo(r);
>
> This could then be taken advantage of in other parts of the language:
>
> class A
> {
>      int opSlice (range r); // new syntax
>      int opSlice (size_t start, size_t end); // old syntax
> }
>
> I think this would be completely backwards compatible as well.
>

And what do we do with 3..$?
April 04, 2012
On 2012-04-04 14:16, Simen Kjærås wrote:

> And what do we do with 3..$?

Hmm, that's a good point. The best I can think of for now is to translate that to:

range(3, size_t.max)

Or something like:

struct range
{
    size_t start;
    size_t end;
    bool dollar; // better name is needed
}

range(3, 0, true)
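
A rough sketch of how the flag variant might be consumed, purely as an illustration of the idea above. Interval, toEnd, resolveEnd and slice are hypothetical names, and the container is assumed to resolve the flag against its own length:

```d
// Hypothetical struct following the second suggestion above.
struct Interval
{
    size_t start;
    size_t end;
    bool toEnd; // stand-in name for the "dollar" flag

    // Returns the concrete end index once the container length is known.
    size_t resolveEnd(size_t length) const
    {
        return toEnd ? length : end;
    }
}

// The container decides what "$" means by supplying its own length.
int[] slice(int[] data, Interval r)
{
    return data[r.start .. r.resolveEnd(data.length)];
}

void main()
{
    int[] data = [1, 2, 3, 4, 5];
    assert(slice(data, Interval(3, 0, true)) == [4, 5]); // like data[3 .. $]
    assert(slice(data, Interval(1, 3)) == [2, 3]);       // like data[1 .. 3]
}
```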

-- 
/Jacob Carlborg