September 19, 2011
On Mon, 19 Sep 2011 10:08:32 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> On 9/19/11 6:25 AM, Steven Schveighoffer wrote:
>> On Sun, 18 Sep 2011 15:34:16 -0400, Timon Gehr <timon.gehr@gmx.ch> wrote:
>>
>>> On 09/18/2011 08:28 PM, Andrei Alexandrescu wrote:
>>>> That would allow us to e.g. switch from the
>>>> pointer+length representation to the arguably better pointer+pointer
>>>> representation with ease.
>>>
>>> In what way is that representation better?
>>
>> I agree, I don't see why the representation is inherently better. Some
>> operations become higher performance (i.e. popFront), and some become
>> worse (i.e. length). Most of the others are a wash.
>
> That's where frequency of use comes into play. I'm thinking popFront would be used most often, and it touches two words.

I'm not so sure.  It's very difficult to prove this is the case (either way).

Already, foreach does not use the range primitives for arrays, and foreach is probably the most frequent user of popFront (for ranges other than arrays, of course).

However, length is used frequently in slicing, or when passing to underlying C or OS functions which require ptr + length.  Maybe the compiler can optimize out the calculation of length when it is just getting translated right back to pointers.  For example a = a[1..$] shouldn't have to calculate a.length; it should just be a.ptr += 5.  I also think it would be nice to have direct access to the end pointer.  Likewise, a.length -= 5 shouldn't have to calculate the original length, then subtract 5, then pass that to the setlength function.

I'm not sure whether the runtime code would be better or worse if we used two pointers instead of ptr/length.  For sure, that is where the bulk of the changes would need to be made.

The other thing that I don't like is that it's much easier to construct an invalid slice with two pointers than it is with a ptr/length.

I suspect my opinion doesn't matter much for what we all "think" is used more frequently, but there are bigger issues to solve before we could have a library-based array type.  It might be best to leave the array as a magic type, but just change the internal representation in the compiler/runtime.

I know that printf was abused at some point for writing slices, relying on the fact that the order of length/ptr happened to match the order printf expects when you specify the length of a string as a parameter.  This change would definitely break any code that relies on that.  Not that we need to care, but it's something to think about.

And actually, false pointers would be cut down (both pointers always point to the referenced block), which might be worth all the trouble anyways :)

-Steve
September 19, 2011
On 09/19/2011 04:08 PM, Andrei Alexandrescu wrote:
> On 9/19/11 6:25 AM, Steven Schveighoffer wrote:
>> On Sun, 18 Sep 2011 15:34:16 -0400, Timon Gehr <timon.gehr@gmx.ch> wrote:
>>
>>> On 09/18/2011 08:28 PM, Andrei Alexandrescu wrote:
>>>> That would allow us to e.g. switch from the
>>>> pointer+length representation to the arguably better pointer+pointer
>>>> representation with ease.
>>>
>>> In what way is that representation better?
>>
>> I agree, I don't see why the representation is inherently better. Some
>> operations become higher performance (i.e. popFront), and some become
>> worse (i.e. length). Most of the others are a wash.
>
> That's where frequency of use comes into play. I'm thinking popFront
> would be used most often, and it touches two words.
>
> Andrei

Normally, each popFront comes with an accompanying empty, and the comparison against 0 is faster after a decrement than the comparison of two arbitrary pointer values.

Now a benchmark:

import std.datetime, std.stdio;

int[100000] a;

void f1(){
    for(auto ptr=a.ptr, len=a.length; len; ++ptr, --len){}
}
void f2(){
    for(auto b=a.ptr, e=a.ptr+a.length; b!=e; b++){}
}

void main(){
    auto b=benchmark!(f1,f2)(100000);
    writeln(b[0].to!("seconds",double)," ", b[1].to!("seconds",double));
}

On my machine: (-O -release -inline)
4.00256 4.00099

The difference is barely perceptible on my oldish Core 2 Duo processor, yet it is reproducible, which indicates that you were right about the pointer+pointer representation being slightly faster for this use case.

September 19, 2011
On Mon, 19 Sep 2011 10:24:33 -0400, Timon Gehr <timon.gehr@gmx.ch> wrote:

> On 09/19/2011 04:02 PM, Steven Schveighoffer wrote:
>>
>> So I think it's not only limiting to require x.length to be $, it's very
>> wrong in some cases.
>>
>> Also, think of a string. It has no length (well technically, it does,
>> but it's not the number of elements), but it has a distinct end point. A
>> properly written string type would fail to compile if $ was s.length.
>>
>
> But you'd have to compute the length anyways in the general case:
>
> str[0..$/2];
>
> Or am I misunderstanding something?
>

That's half the string in code units, not code points.

If string were properly implemented, this would fail to compile.  $ is not the length of the string range (meaning the number of code points).  The given slice operation might actually create an invalid string.

It's tricky, because you want fast slicing, but only certain slices are valid.  I once created a string type that used a char[] as its backing, but actually implemented the limitations that std.range tries to enforce (but cannot).  It's somewhat of a compromise.  If $ was mapped to s.length, it would fail to compile, but I'm not sure what I *would* use for $.  It actually might be the code units, which would not make the above line invalid.

-Steve
September 19, 2011
On Mon, 19 Sep 2011 10:38:21 -0400, Steven Schveighoffer <schveiguy@yahoo.com> wrote:

> For example a = a[1..$] shouldn't have to calculate a.length, it should just be a.ptr += 5.

oops, victim of half-editing there :)

Meant to say a = a[5..$]

-Steve
September 19, 2011
On 09/19/2011 04:43 PM, Steven Schveighoffer wrote:
> On Mon, 19 Sep 2011 10:24:33 -0400, Timon Gehr <timon.gehr@gmx.ch> wrote:
>
>> On 09/19/2011 04:02 PM, Steven Schveighoffer wrote:
>>>
>>> So I think it's not only limiting to require x.length to be $, it's very
>>> wrong in some cases.
>>>
>>> Also, think of a string. It has no length (well technically, it does,
>>> but it's not the number of elements), but it has a distinct end point. A
>>> properly written string type would fail to compile if $ was s.length.
>>>
>>
>> But you'd have to compute the length anyways in the general case:
>>
>> str[0..$/2];
>>
>> Or am I misunderstanding something?
>>
>
> That's half the string in code units, not code points.
>
> If string was properly implemented, this would fail to compile. $ is not
> the length of the string range (meaning the number of code points). The
> given slice operation might actually create an invalid string.

Programmers have to be aware of that if they want efficient code that deals with Unicode. I think having random access to the code units and being able to iterate per code point is fine, because it gives you the best of both worlds. Manually decoding a string and slicing it at positions that were remembered to be safe has been good enough for me; at least it is efficient.

>
> It's tricky, because you want fast slicing, but only certain slices are
> valid. I once created a string type that used a char[] as its backing,
> but actually implemented the limitations that std.range tries to enforce
> (but cannot). It's somewhat of a compromise. If $ was mapped to
> s.length, it would fail to compile, but I'm not sure what I *would* use
> for $. It actually might be the code units, which would not make the
> above line invalid.
>
> -Steve

Well, it would have to be consistent for a string type that "does it right". Either the string is indexed with code units or it is indexed with code points, and the other option should be provided. Dollar should just be the length of whatever is used for indexing/slicing, and having that differ from length makes for a somewhat awkward interface imho.

Btw, D double-quoted string literals let you define invalid byte sequences with, e.g., octal literals:
string s="\377";

What would be the use cases for that? Shouldn't \377 map to the extended ASCII charset instead and yield the same code point that would be given in C double-quoted strings?

September 19, 2011
Steven Schveighoffer Wrote:

> auto set = new TreeSet!uint(1, 3, 5, 7, 9);
> assert(set.length == 5);
> auto range = set[1..set.length];
> 
> assert(walkLength(range) == 2); // probably not what you expected

Where did you get that "1"?
How do you slice it from the beginning to 7?

Looks like operator overload abuse. Slicing is an array feature. If a collection doesn't support the array interface, it's questionable whether it should support slicing as it's defined in D. By stretching slicing to non-array collections you break its semantics. Good for voodoo, bad for engineering.
September 19, 2011
On Mon, 19 Sep 2011 11:17:01 -0400, Kagamin <spam@here.lot> wrote:

> Steven Schveighoffer Wrote:
>
>> auto set = new TreeSet!uint(1, 3, 5, 7, 9);
>> assert(set.length == 5);
>> auto range = set[1..set.length];
>>
>> assert(walkLength(range) == 2); // probably not what you expected
>
> Where did you get that "1"?

1 is the first element in the set.

> How do you slice it from the beginning to 7?

set[set.begin .. 7]

This does not include the element at position 7.  To get that element too, you have to do something funky:

auto cur = set.elemAt(7);
cur.popFront();

set[set.begin .. cur];

I haven't thought of an easy way to express "find me the element after X".

> Looks like an operator overload abuse. Slicing is an array's feature. If a collection doesn't support array interface, it's questionable whether it should support slicing as it's defined in D. By stretching slicing to non-array collections you break its semantics. Good for voodoo, bad for engineering.

I emphatically disagree.  Slicing is the method of extracting a range from a collection given two reference points.  For arrays, that happens to be indexes.  For node-based containers, it would be references to those nodes (in dcollections, those are called cursors).  I don't see why slicing should be restricted to ints, otherwise, why have opSlice compile with anything other than ints?

-Steve
September 19, 2011
On Mon, 19 Sep 2011 10:08:32 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> On 9/19/11 6:25 AM, Steven Schveighoffer wrote:
>> On Sun, 18 Sep 2011 15:34:16 -0400, Timon Gehr <timon.gehr@gmx.ch> wrote:
>>
>>> On 09/18/2011 08:28 PM, Andrei Alexandrescu wrote:
>>>> That would allow us to e.g. switch from the
>>>> pointer+length representation to the arguably better pointer+pointer
>>>> representation with ease.
>>>
>>> In what way is that representation better?
>>
>> I agree, I don't see why the representation is inherently better. Some
>> operations become higher performance (i.e. popFront), and some become
>> worse (i.e. length). Most of the others are a wash.
>
> That's where frequency of use comes into play. I'm thinking popFront
> would be used most often, and it touches two words.
>
> Andrei
>

Well, looking at the two implementations:

popFront()

if(length) {
    ptr    += T.sizeof;
    length -= 1;
}

vs

if(ptrBack - ptrFront > 0) {
    ptrFront += T.sizeof;
}

And don't forget that every popFront call will be matched by a call to empty:

empty()

return length == 0;

vs

return ptrBack - ptrFront <= 0;

I see that the 'new' popFront still needs to read 2 words and results in an extra subtraction. Without the guard statements, the instruction counts are identical, but note that the current popFront implementation uses slices, which are 'guarded', and I see every reason not to change this behavior. And given how memory works, I doubt there'd be any advantage to 1 vs 2 writes one after another to the same cache line.

By the way, for those wondering why I didn't use 'ptrBack > ptrFront', that's because comparisons are eventually reduced to a 'Jump if (Not) Zero', and I wanted to make the amount and types of hardware instructions clear.

The elephant in the room, of course, is that length now requires a division and that popFront is actually implemented using slicing:

a = a[i .. $];

which translates into:

auto begin = i;
auto end   = length;
if(end - begin >= 0  && length - end >= 0) {
    ptr     = ptr + T.sizeof * begin;
    length  = end - begin;
}

vs

auto length = (ptrBack - ptrFront) / T.sizeof;
auto begin  = ptrFront + T.sizeof * i;
auto end    = ptrFront + T.sizeof * length;
if(end - begin >= 0  && ptrBack - end >= 0) {
    ptrFront = begin;
    ptrBack  = end;
}

And both integer multiplication and division by non-power-of-2 sizes are really, really, really slow compared to + and -. Now, the division is only needed when 'length' is called, but the extra multiplication is required on every slice.

So, on balance, I'd say the two pointers representation is categorically worse than the fat pointer representation.
September 19, 2011
On Mon, 19 Sep 2011 11:03:15 -0400, Timon Gehr <timon.gehr@gmx.ch> wrote:

> On 09/19/2011 04:43 PM, Steven Schveighoffer wrote:
>> On Mon, 19 Sep 2011 10:24:33 -0400, Timon Gehr <timon.gehr@gmx.ch> wrote:
>>
>>> On 09/19/2011 04:02 PM, Steven Schveighoffer wrote:
>>>>
>>>> So I think it's not only limiting to require x.length to be $, it's very
>>>> wrong in some cases.
>>>>
>>>> Also, think of a string. It has no length (well technically, it does,
>>>> but it's not the number of elements), but it has a distinct end point. A
>>>> properly written string type would fail to compile if $ was s.length.
>>>>
>>>
>>> But you'd have to compute the length anyways in the general case:
>>>
>>> str[0..$/2];
>>>
>>> Or am I misunderstanding something?
>>>
>>
>> That's half the string in code units, not code points.
>>
>> If string was properly implemented, this would fail to compile. $ is not
>> the length of the string range (meaning the number of code points). The
>> given slice operation might actually create an invalid string.
>
> Programmers have to be aware of that if they want efficient code that deals with unicode. I think having random access to the code units and being able to iterate per code point is fine, because it gives you the best of both worlds. Manually decoding a string and slicing it at positions that were remembered to be safe has been good enough for me, at least it is efficient.

I find the same.  I don't think I've ever dealt with arbitrary math operations to do slices of strings like the above.  I only slice a string when I know the bounds are sane.

Like I said, it's a compromise.  The "right" thing to do is probably not even allow code-unit access via index (some have even argued that code-point slicing is too dangerous, because you can split a grapheme, leaving a valid, but incorrect slice of the original).

>> It's tricky, because you want fast slicing, but only certain slices are
>> valid. I once created a string type that used a char[] as its backing,
>> but actually implemented the limitations that std.range tries to enforce
>> (but cannot). It's somewhat of a compromise. If $ was mapped to
>> s.length, it would fail to compile, but I'm not sure what I *would* use
>> for $. It actually might be the code units, which would not make the
>> above line invalid.
>>
>> -Steve
>
> Well it would have to be consistent for a string type that "does it right" . Either the string is indexed with units or it is indexed with code points, and the other option should be provided. Dollar should just be the length of what is used for indexing/slicing here, and having that be different from length makes for a somewhat awkward interface imho.

Except we are defining a string as a *range* and a range's length is defined as the number of elements.

Note that hasLength!string evaluates to false in std.range.

$ should denote the end point of the aggregate, but it does not have to be equivalent to length, or even an integer/uint.  It should just mean "end".

I also proposed a while back to have ^ denote the beginning (similar to regex) of an aggregate for aggregates that don't use 0 as the beginning, but people didn't like it :)

-Steve
September 19, 2011
On Mon, 19 Sep 2011 11:46:44 -0400, Robert Jacques <sandford@jhu.edu> wrote:

> On Mon, 19 Sep 2011 10:08:32 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>> On 9/19/11 6:25 AM, Steven Schveighoffer wrote:
>>> On Sun, 18 Sep 2011 15:34:16 -0400, Timon Gehr <timon.gehr@gmx.ch> wrote:
>>>
>>>> On 09/18/2011 08:28 PM, Andrei Alexandrescu wrote:
>>>>> That would allow us to e.g. switch from the
>>>>> pointer+length representation to the arguably better pointer+pointer
>>>>> representation with ease.
>>>>
>>>> In what way is that representation better?
>>>
>>> I agree, I don't see why the representation is inherently better. Some
>>> operations become higher performance (i.e. popFront), and some become
>>> worse (i.e. length). Most of the others are a wash.
>>
>> That's where frequency of use comes into play. I'm thinking popFront
>> would be used most often, and it touches two words.
>>
>> Andrei
>>
>
> The elephant in the room, of course, is that length now requires a division and that popFront is actually implemented using slicing:
>
> a = a[i .. $];
>
> which translates into:
>
> auto begin = i;
> auto end   = length;
> if(end - begin >= 0  && length - end >= 0) {
>      ptr     = ptr + T.sizeof * begin;
>      length  = end - begin;
> }
>
> vs
>
> auto length = (ptrBack - ptrFront) / T.sizeof;
> auto begin  = ptrFront + T.sizeof * i;
> auto end    = ptrFront + T.sizeof * length;
> if(end - begin >= 0  && ptrBack - end >= 0) {
>      ptrFront = begin;
>      ptrBack  = end;
> }

I would hope something like this would be optimized by the compiler:

auto begin = ptrFront + T.sizeof * i;
if(ptrBack - begin >= 0)
   ptrFront = begin;

If not, popFront could optimize it.

Certainly, to say popFront is going to perform *worse* using a dual-pointer representation is false.  Only one calculation is needed.

-Steve