Thread overview: opSlice as lvalue
  Jul 07, 2004  Derek Parnell
  Jul 07, 2004  Norbert Nemec
  Jul 07, 2004  Sam McCall
  Jul 07, 2004  Norbert Nemec
  Jul 08, 2004  Sam McCall
  Jul 07, 2004  Stephen Waits
  Jul 07, 2004  Derek
  Jul 07, 2004  Stephen Waits
  Jul 07, 2004  Derek
  Jul 07, 2004  Norbert Nemec
  Jul 07, 2004  Derek
  Jul 07, 2004  Regan Heath
  Jul 08, 2004  Norbert Nemec
Assembler [OT - was opSlice as lvalue]
  Jul 08, 2004  Arcane Jill
  Jul 08, 2004  Norbert Nemec
  Jul 08, 2004  Arcane Jill
  Jul 08, 2004  Norbert Nemec
  Jul 08, 2004  Stephen Waits
  Jul 08, 2004  Stephen Waits
  Jul 08, 2004  Arcane Jill
  Jul 08, 2004  Derek Parnell
  Jul 08, 2004  Norbert Nemec
  Jul 08, 2004  Sam McCall
  Jul 08, 2004  Charles Hixson
  Jul 08, 2004  Charles Hixson
July 07, 2004
Hi all,
I'm trying to implement a structure in which array copying would be useful.
I expected to be able to overload this operation with a variety of
opSlice().

We can have opIndex() as both lvalue and rvalue, but this doesn't seem to
apply for opSlice.
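
For example (Foo here is just a made-up class wrapping char data), indexing works both ways but slicing only works as an rvalue:

    Foo f = new Foo;
    char c = f[3];        // rvalue indexing - works via opIndex
    f[3] = 'x';           // lvalue indexing - works
    char[] s = f[2..5];   // rvalue slice - works via opSlice
    f[2..5] = "Derek";    // lvalue slice - no way to overload this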

Is there a good reason for this?

I'm now doing this operation via a roll-your-own function, but overloading would be a bit more consistent.

-- 
Derek
Melbourne, Australia
7/Jul/04 12:36:43 PM
July 07, 2004
I don't know about Walter's reasoning in that respect, but I can state my position:

I think that even the opSlice that is there should be used with care. As I see it, there is no way to cleanly extend it to multiple dimensions, and furthermore it collides badly with the concept of vectorization that is hopefully coming into the language in the future. opIndex is not a problem, as the concept of indexing itself already breaks vectorization. Implementing slicing via opSlice, though, would always mean creating a temporary copy of the data, which kills performance.

Maybe some version of the existing read-access opSlice could still make sense in the future. Slice assignment though would - if overloadable at all - need a completely new concept.



Derek Parnell wrote:

> Hi all,
> I'm trying to implement a structure in which array copying would be
> useful. I expected to be able to overload this operation with a variety of
> opSlice().
> 
> We can have opIndex() as both lvalue and rvalue, but this doesn't seem to
> apply for opSlice.
> 
> Is there a good reason for this?
> 
> I'm now doing this operation via a roll-your-own function, but overloading would be a bit more consistent.
> 

July 07, 2004
Norbert Nemec wrote:
> I don't know about Walter's reasoning in that respect, but I can state my
> position:
> 
> I think that even the opSlice that is there should be used with care. As I
> see it, there is no way to cleanly extend it to multiple dimensions 
Foo opSlice(int start1,end1,start2,end2,...)
foo[start1..end1][start2..end2]

> furthermore it collides badly with the concept of vectorization that is
> hopefully coming into the language in the future. opIndex is not a problem,
> as the concept of indexing itself already breaks vectorization.
Surely being able to do something at a speed that's adequate for most applications (as evidenced by the fact that it works fine now, without vectorisation) is better than not being able to do it at all?

> Implementing slicing via opSlice, though, would always mean to create a
> temporary copy of the data which is killing performance.
Nope, if there's some underlying array, create a new object that wraps a slice of that. Or create a new object implementing the same interface, and tell it the bounds, like List.subList in java. (D still doesn't have inner classes, but they can be faked).
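
Something like this, say (untested sketch; IntVector is just a made-up example):

    class IntVector {
        private int[] data;
        this(int[] d) { data = d; }
        int opIndex(int i) { return data[i]; }
        // Return a view over the same storage rather than a copy:
        // slicing a dynamic int[] in D just creates a new reference
        // to the same memory, so nothing is duplicated here.
        IntVector opSlice(int start, int end) {
            return new IntVector(data[start..end]);
        }
    }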

> Maybe some version of the existing read-access opSlice could still make
> sense in the future. Slice assignment though would - if overloadable at all
> - need a completely new concept.
I disagree. opSlice works well in almost all situations now. Yes, multidimensional extensions would be nice. The fact that opIndex is assignable but opSlice isn't is a nasty wart; if slicing is to be allowed (and it should be, to keep parity with native types[1]), then slice assignment should also be allowed.
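
Mirroring the way indexed assignment is overloaded today, a hypothetical slice-assignment hook for the IntVector above might look like (name and argument order are made up):

    void opSliceAssign(int[] values, int start, int end);
    // so that  v[2..5] = w;  would be rewritten as  v.opSliceAssign(w, 2, 5);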

Sam

[1]Speaking of which, delete foo[bar] for associative arrays should be overloadable (and probably renamed, but that's another story...)
> 
> 
> 
> Derek Parnell wrote:
> 
> 
>>Hi all,
>>I'm trying to implement a structure in which array copying would be
>>useful. I expected to be able to overload this operation with a variety of
>>opSlice().
>>
>>We can have opIndex() as both lvalue and rvalue, but this doesn't seem to
>>apply for opSlice.
>>
>>Is there a good reason for this?
>>
>>I'm now doing this operation via a roll-your-own function, but overloading
>>would be a bit more consistent.
>>
> 
> 
July 07, 2004
Sam McCall wrote:

> Norbert Nemec wrote:
>> I don't know about Walter's reasoning in that respect, but I can state my position:
>> 
>> I think that even the opSlice that is there should be used with care. As I see it, there is no way to cleanly extend it to multiple dimensions
> Foo opSlice(int start1,end1,start2,end2,...)
> foo[start1..end1][start2..end2]

The expression
        foo[start1..end1][start2..end2]
does not do what you expect. It is parsed as
        (foo[start1..end1])[start2..end2]
where the slice resulting from the first expression is sliced again. If you
try to handle a nested array like this, you will slice the outermost
dimension twice.

There is no way to slice the inner dimension of nested arrays.

For true multidimensional arrays as they will hopefully be part of the
language sometime in the future, the slicing expression would look like:
        foo[start1..end1,start2..end2]
Defining opSlice the way you do it would certainly be possible, but it would
get rather ugly once you consider that you can also do partial slicing
        foo[idx1,start2..end2]
returning a one-dimensional array, which is different from
        foo[idx1..idx1+1,start2..end2]
which returns a two-dimensional array with .range[0]==1.
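
Just to illustrate, a two-dimensional type would need a set of overloads roughly like this (hypothetical signatures, Matrix/Vector made up):

        Matrix opSlice(int s1, int e1, int s2, int e2); // foo[s1..e1, s2..e2] -> 2-d
        Vector opSlice(int i1, int s2, int e2);         // foo[i1, s2..e2]     -> 1-d
        Vector opSlice(int s1, int e1, int i2);         // foo[s1..e1, i2]     -> 1-d

and the last two already have identical parameter lists, so plain overloading could not even distinguish them.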

If that is not ugly enough for you, then consider what happens when you
introduce striding as well:
        foo[start1..end1:stride1,start2..end2:stride2]


>> furthermore it collides badly with the concept of vectorization that is hopefully coming into the language in the future. opIndex is not a problem, as the concept of indexing itself already breaks vectorization.
> Surely being able to do something at a speed that's adequate for most applications (as evidenced by the fact that it works fine now, without vectorisation) is better than not being able to do it at all?

It is not just a matter of performance but a matter of concept: what does an assignment to a slice actually mean? If the overloaded slicing has different semantics than the vectorizable slicing of arrays, then it is questionable whether it is a good idea to have it. Currently I doubt that the semantics of the vectorizable array slicing could be captured by any kind of opSlice function.

>> Implementing slicing via opSlice, though, would always mean to create a temporary copy of the data which is killing performance.
> Nope, if there's some underlying array, create a new object that wraps a slice of that. Or create a new object implementing the same interface, and tell it the bounds, like List.subList in java. (D still doesn't have inner classes, but they can be faked).

True, that might be possible.

>> Maybe some version of the existing read-access opSlice could still make sense in the future. Slice assignment though would - if overloadable at all - need a completely new concept.
> I disagree. opSlice works well in almost all situations now. Yes, multidimensional extensions would be nice. The fact that opIndex is assignable but opSlice isn't is a nasty wart, if slicing is to be allowed (and it should be, to keep parity with native types[1]) then slice assignment should also be allowed.

I don't think it will be possible to keep that parity. At least, any solution I can think of would be too ugly to put into D.


July 07, 2004
On Wed, 07 Jul 2004 09:14:32 +0200, Norbert Nemec wrote:

> I don't know about Walter's reasoning in that respect, but I can state my position:
> 
> I think that even the opSlice that is there should be used with care. As I see it, there is no way to cleanly extend it to multiple dimensions

So what? Single dimension is just fine for my needs. As far as I can see, a slice operation is one that is only done on a vector (single-dimension array). To me it is just a way of specifying a subset of vector elements.

Once one moves into multi-dimensional array operations, e.g. a matrix, a vector of vectors, we need a new syntax anyway. opSlice can remain the same (a vector slice) and D can have some new opXXX function for 'matrix slicing'.

> and furthermore it collides badly with the concept of vectorization that is hopefully coming into the language in the future. opIndex is not a problem, as the concept of indexing itself already breaks vectorization.

Hmmmm..."collides"? Are there not two different paradigms at work here? Both just as valid as each other? How is this a collision?

> Implementing slicing via opSlice, though, would always mean to create a temporary copy of the data which is killing performance.

Excuse me for being a bit blunt, but "killing performance" is just overly dramatic. Is not "performance" a relative thing...something in the eye of the beholder? If the end user isn't impacted by an application's speed, who cares?

My needs are obviously different to your needs. And that's just fine. I don't tend to write CPU-intensive applications. There is no way that D's slicing operation is "killing" my apps.

> Maybe some version of the existing read-access opSlice could still make sense in the future.

Yep! - it makes sense now so why wouldn't it in the future?

> Slice assignment though would - if overloadable at all
> - need a completely new concept.

Yep! But not such a totally "new" concept, as it's old hat in Basic and lots of other languages already!

  MID(myData, 4, 5) = "Derek"

or in a new D ...

   myData[4..9] = "Derek";

Not too big a stretch of the imagination, eh?

It's just bloody shorthand for ...

  myData[4] = 'D';
  myData[5] = 'e';
  myData[6] = 'r';
  myData[7] = 'e';
  myData[8] = 'k';

or ...

  source = "Derek";
  for(i=4, j=0; j < source.length; i++,j++)
     myData[i] = source[j];


and that's it! Nothing fancy. Just a syntax shorthand for updating a bunch
of adjacent vector elements.

> Derek Parnell wrote:
> 
>> Hi all,
>> I'm trying to implement a structure in which array copying would be
>> useful. I expected to be able to overload this operation with a variety of
>> opSlice().
>> 
>> We can have opIndex() as both lvalue and rvalue, but this doesn't seem to
>> apply for opSlice.
>> 
>> Is there a good reason for this?
>> 
>> I'm now doing this operation via a roll-your-own function, but overloading would be a bit more consistent.
>>

My method currently looks a lot like the longhand above ...

    void opSlice(int i, int j, char[] x)
    {
        for (int k = 0; i < j; i++,k++)
        {
            if ((i >= 0) && (i < Elem.length) && (k < x.length))
                Elem[i]( cast(int)x[k] );
        }
    }

in anticipation of being able to write ...

   Foo[i..j] = x;

whereas I now write ...

   Foo.opSlice(i, j, x);

-- 
Derek
Melbourne, Australia
July 07, 2004
Derek wrote:
> Excuse me for being a bit blunt, but "killing performance" is just overly
> dramatic. Is not "performance" a relative thing...something in the eye of
> the beholder? If the end user isn't impacted by an application's speed, who
> cares? 

Who's really being dramatic here?  "killing performance" is the truth.

> My needs are obviously different to your needs. And that's just fine. I
> don't tend to write CPU-intensive applications. There is no way that D's
> slicing operation is "killing" my apps. 

Exactly - your needs.  However, there are lots of people who care, or even require, that generated code be as efficient as possible.

Remember, this is a general purpose language - not a language just for those who don't care about multi-dimensional array performance.

--Steve
July 07, 2004
Sam McCall wrote:
> Surely being able to do something at a speed that's adequate for most applications (as evidenced by the fact that it works fine now, without vectorisation) is better than not being able to do it at all?

But, if we could get good vectorization (in the sense that the language makes it much easier on compilers) with little compromise, wouldn't everyone agree that's best?

--Steve
July 07, 2004
On Wed, 07 Jul 2004 10:37:19 -0700, Stephen Waits wrote:

> Derek wrote:
>> Excuse me for being a bit blunt, but "killing performance" is just overly dramatic. Is not "performance" a relative thing...something in the eye of the beholder? If the end user isn't impacted by an application's speed, who cares?
> 
> Who's really being dramatic here?  "killing performance" is the truth.

In which case I have a magic machine on my desk, because D is *not* killing its performance. To be more precise, I am not being prevented from using this machine to the extent that I need to, by D's executables.

>> My needs are obviously different to your needs. And that's just fine. I don't tend to write CPU-intensive applications. There is no way that D's slicing operation is "killing" my apps.
> 
> Exactly - your needs.  However, there are lots of people who care, or even require, that generated code be as efficient as possible.

I'm sorry. I didn't mean to imply that I didn't care about "efficient as possible". Of course a D compiler should produce efficient code. And in addition to that, D's current vector slicing is already efficient enough for me. True, it may not be efficient enough for other people, and thus a different implementation may be required. I have no problems with that either.

The implication, as I read it, in "killing performance" is that *every* instance of the use of D's current slicing is a 'bad thing'. It is that implication that I have an issue with. For I simply have no evidence of that. In fact I have evidence of the contrary view, namely that my machine is not 'dying' under the load.

> Remember, this is a general purpose language - not a language just for those who don't care about multi-dimensional array performance.

And yet, it sounds like some are asking for special-purpose multi-dimensional array operations. I've been writing business/financial applications for 30 years and have yet to need special-purpose multi-dimensional array operations. It seems that I travel in a different world.

I do not care whether D implements multi-dimensional array operations or not. I neither argue for them nor against them. It has no impact on the types of applications that I need to write. If Walter has time to do it, then fine! It changes my world not one iota.

I was afraid that I wouldn't be very clear, and thus be misunderstood.

I was really just asking for a coding shorthand that would 'appear' like a sliced lvalue. That's all.

-- 
Derek
Melbourne, Australia
July 07, 2004
The result of my thinking about this mail can be seen in the newly started thread. Here are just a few comments:

Derek wrote:

> On Wed, 07 Jul 2004 09:14:32 +0200, Norbert Nemec wrote:
> Once one moves into multi-dimensional array operations, e.g. a matrix, a
> vector of vectors, we need a new syntax anyway. opSlice can remain the
> same (a vector slice) and D can have some new opXXX function for 'matrix
> slicing'.

No need for that. As you can see in the other message everything fits nicely into one concept.

>> and furthermore it collides badly with the concept of vectorization that is hopefully coming into the language in the future. opIndex is not a problem, as the concept of indexing itself already breaks vectorization.
> 
> Hmmmm..."collides"? Are there not two different paradigms at work here? Both just as valid as each other? How is this a collision?

Forget my words. By now, my concept of vector expressions has evolved to a point where the collision is gone.

>> Implementing slicing via opSlice, though, would always mean to create a temporary copy of the data which is killing performance.
> 
> Excuse me for being a bit blunt, but "killing performance" is just overly dramatic. Is not "performance" a relative thing...something in the eye of the beholder? If the end user isn't impacted by an application's speed, who cares?
> 
> My needs are obviously different to your needs. And that's just fine. I don't tend to write CPU-intensive applications. There is no way that D's slicing operation is "killing" my apps.

When I talk about performance, I think about high-performance numerics. This is the area where the design of a language is most crucial. C++ for example is inherently slower than Fortran (unless you pull out some nasty tricks) because the language gets in the way of optimizing. What I am concerned about is avoiding that kind of problem in D. Numerics are important in science (which is my field) but also in computer graphics and gaming. Anyone working in one of these fields will agree that array handling is too important to go for some quick solution.

Of course, I might be a bit overly concerned. It already turns out that the solutions get simpler and simpler the more I think about it all.

In the case at hand, I just realized how things could work together so that you have all the flexibility you need for user-defined types, and all the performance you want for native arrays.

>> Slice assignment though would - if overloadable at all
>> - need a completely new concept.
> 
> Yep! But not such a totally "new" concept, as its old hat in Basic and lots of other languages already!
> 
>   MID(myData, 4, 5) = "Derek"
> 
> or in a new D ...
> 
>    myData[4..9] = "Derek";
> 
> Not too big a stretch of the imagination, eh?

True.

> It's just bloody shorthand for ...
> 
>   myData[4] = 'D';
>   myData[5] = 'e';
>   myData[6] = 'r';
>   myData[7] = 'e';
>   myData[8] = 'k';
> 
> or ...
> 
>   source = "Derek";
>   for(i=4, j=0; j < source.length; i++,j++)
>      myData[i] = source[j];

Not true. OK, I confess that this is what the specs say, but here I strongly disagree:

The important detail about a vector assignment is that it does not specify the order of the assignments. This is crucial for allowing the compiler to optimize aggressively. If you write a loop, the compiler might be intelligent enough in some cases to realize that the iterations are independent and can be swapped. In general, this cannot always be determined, so the compiler often has to be overly careful, even though it would actually be possible to swap iterations.

Using a vector expression, the programmer indicates clearly that the order does not matter, allowing the compiler to swap around.

For example, the two statements

        myDataA[2..5] = myDataA[3..6];
        myDataB[3..6] = myDataB[2..5];

both give undetermined results. They are legal (the compiler cannot, in general, detect the overlap) but it is the programmer's problem that the outcome is undefined.

As I said, the specs say something different: following them, the first would be correct, the second garbage. But on this point, the specs should definitely be corrected.
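
To make the problem concrete, assume both arrays start as [0,1,2,3,4,5] and the copy is done one element at a time (worked out by hand):

        myDataA[2..5] = myDataA[3..6];
        // left-to-right copy:  myDataA == [0,1,3,4,5,5]  (the "expected" shift)
        // right-to-left copy:  myDataA == [0,1,5,5,5,5]

        myDataB[3..6] = myDataB[2..5];
        // left-to-right copy:  myDataB == [0,1,2,2,2,2]  (the 2 propagates: garbage)
        // right-to-left copy:  myDataB == [0,1,2,2,3,4]  (the "expected" shift)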

B.t.w.: This would be a case where compiler warnings would be reasonable: if the compiler finds it, it clearly is a mistake by the programmer, but since the compiler cannot find it in general, it cannot be made an error.


July 07, 2004
On Wed, 07 Jul 2004 23:09:27 +0200, Norbert Nemec wrote:


[snip]

> When I talk about performance, I think about high-performance numerics. This is the area where the design of a language is most crucial. C++ for example is inherently slower than Fortran (unless you pull out some nasty tricks) because the language gets in the way of optimizing. What I am concerned about, is to avoid that kind of problems in D. Numerics are important in science (which is my field) but also in computer graphics and gaming. Anyone working in one of these fields will agree that array handling is too important to go for some quick solution.
> 

Ok, so you are saying that D generates sub-optimal code for these sorts of applications. So don't use D! Use assembler or some specialized programming language. If D is to be a general purpose language, one should expect that it will not specialize in every programming domain.

[snip]

>>    myData[4..9] = "Derek";
>> 
>> Not too big a stretch of the imagination, eh?
> 
> True.
> 
>> It's just bloody shorthand for ...
>> 
>>   myData[4] = 'D';
>>   myData[5] = 'e';
>>   myData[6] = 'r';
>>   myData[7] = 'e';
>>   myData[8] = 'k';
>> 
>> or ...
>> 
>>   source = "Derek";
>>   for(i=4, j=0; j < source.length; i++,j++)
>>      myData[i] = source[j];
> 
> Not true. OK, I confess that this is what the specs say, but here I strongly disagree:

Here I see a fundamental difference in the way we are approaching this issue. I'm looking at the results, and you are looking at the process.

If I were to reword my example above (with a pedantic streak)...

It's just bloody shorthand that ensures that after the operation is complete ...

   myData[4] ends up with the value 'D'
   myData[5] ends up with the value 'e'
   myData[6] ends up with the value 'r'
   myData[7] ends up with the value 'e'
   myData[8] ends up with the value 'k'

And I don't really care how the magic happens.

> The important detail about a vector assignment is,

I'm not talking about "vector assignment". I just want a specific subset of elements to end up with the values I need them to be. Vector assignment is one of many methods to achieve that end.

> that it does not specify
> the order of the assignments. This is crucial for allowing the compiler to
> optimize aggressively. If you write a loop, the compiler might be
> intelligent enough in some cases to realize that the iterations are
> independent and can be swapped. In general, this can not always be
> determined, so the compiler often has to be overly careful, even though it
> would actually be possible to swap iterations.
> 
> Using a vector expression, the programmer indicates clearly, that the order does not matter, allowing the compiler to swap around.

Fine! Whatever! So long as I end up with the values in the right order, I don't care.

If I really needed Ferrari-fast code, I'd drop into assembler. The Porsche speed of D-generated code is okay for me.


-- 
Derek
Melbourne, Australia