September 21, 2011
On 19.09.2011 18:12, Andrei Alexandrescu wrote:
> On 9/19/11 10:46 AM, Robert Jacques wrote:
>> So, on balance, I'd say the two pointers representation is categorically
>> worse than the fat pointer representation.
>
> Benchmark. A few of your assumptions don't hold.
>
> Andrei

Note that high-performance libraries that use slices, like GMP and the many BLAS libraries, use the pointer+length representation, not pointer+pointer. They've done a lot of benchmarking on a huge range of architectures, with a large range of compilers.

The underlying reason is that almost all CISC instruction sets have built-in support for pointer+length. AFAIK nothing has built-in support for ptr+ptr.

On x86, you have this wonderful [EAX+8*EBX] addressing mode, which can be used on almost every instruction, so that the calculation [addr + sz*index] takes ZERO clock cycles when sz is 1, 2, 4, or 8.
Generally, when you supply two pointers, the optimizer will try to convert them into ptr + offset (where the offset is counted in elements, not bytes, so it corresponds to D's length).
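
To make the comparison concrete, here is a minimal sketch in C of the two slice layouts under discussion. The type and function names (slice_pl, slice_pp, sum_pl, sum_pp, length_pp, to_pp) are made up for illustration and are not taken from D's runtime, GMP, or any BLAS library. With 8-byte elements, the indexed loop is the shape a compiler can map onto the scaled-index addressing mode, although what it actually emits depends on the compiler and flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pointer+length layout (hypothetical names, illustration only). */
typedef struct {
    const int64_t *ptr;
    size_t len;
} slice_pl;

/* Pointer+pointer layout: [first, last) half-open range. */
typedef struct {
    const int64_t *first;
    const int64_t *last;
} slice_pp;

/* Indexed loop: each access is ptr + 8*i, which is the form that
   fits x86's [base + 8*index] addressing mode. */
int64_t sum_pl(slice_pl s) {
    int64_t total = 0;
    for (size_t i = 0; i < s.len; ++i)
        total += s.ptr[i];
    return total;
}

/* Pointer-walking loop over the same data. */
int64_t sum_pp(slice_pp s) {
    int64_t total = 0;
    for (const int64_t *p = s.first; p != s.last; ++p)
        total += *p;
    return total;
}

/* With two pointers the length is not stored; it has to be
   recomputed by a subtraction (in elements, not bytes). */
size_t length_pp(slice_pp s) {
    assert(s.first <= s.last);
    return (size_t)(s.last - s.first);
}

/* Convert pointer+length to pointer+pointer. */
slice_pp to_pp(slice_pl s) {
    slice_pp r = { s.ptr, s.ptr + s.len };
    return r;
}
```

Both loops compute the same result; the difference is only in what is stored and what has to be derived. To see what your own compiler does with each, compare the generated assembly rather than trusting the sketch.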
September 21, 2011
On 9/21/11 1:49 PM, Don wrote:
> On 19.09.2011 18:12, Andrei Alexandrescu wrote:
>> On 9/19/11 10:46 AM, Robert Jacques wrote:
>>> So, on balance, I'd say the two pointers representation is categorically
>>> worse than the fat pointer representation.
>>
>> Benchmark. A few of your assumptions don't hold.
>>
>> Andrei
>
> Note that high-performance libraries that use slices, like GMP and the
> many BLAS libraries, use the pointer+length representation, not
> pointer+pointer. They've done a lot of benchmarking on a huge range of
> architectures, with a large range of compilers.
>
> The underlying reason is that almost all CISC instruction sets have
> built-in support for pointer+length. AFAIK nothing has built-in support
> for ptr+ptr.
>
> On x86, you have this wonderful [EAX+8*EBX] addressing mode, which can be
> used on almost every instruction, so that the calculation [addr +
> sz*index] takes ZERO clock cycles when sz is 1, 2, 4, or 8.
> Generally, when you supply two pointers, the optimizer will try to
> convert them into ptr + offset (where the offset is counted in elements,
> not bytes, so it corresponds to D's length).

To all who replied and tested - color me convinced we should keep the current state of affairs. Thanks!

Andrei
September 22, 2011
On Wed, 21 Sep 2011 16:02:53 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> On 9/21/11 1:49 PM, Don wrote:
>> On 19.09.2011 18:12, Andrei Alexandrescu wrote:
>>> On 9/19/11 10:46 AM, Robert Jacques wrote:
>>>> So, on balance, I'd say the two pointers representation is categorically
>>>> worse than the fat pointer representation.
>>>
>>> Benchmark. A few of your assumptions don't hold.
>>>
>>> Andrei
>>
>> Note that high-performance libraries that use slices, like GMP and the
>> many BLAS libraries, use the pointer+length representation, not
>> pointer+pointer. They've done a lot of benchmarking on a huge range of
>> architectures, with a large range of compilers.
>>
>> The underlying reason is that almost all CISC instruction sets have
>> built-in support for pointer+length. AFAIK nothing has built-in support
>> for ptr+ptr.
>>
>> On x86, you have this wonderful [EAX+8*EBX] addressing mode, which can be
>> used on almost every instruction, so that the calculation [addr +
>> sz*index] takes ZERO clock cycles when sz is 1, 2, 4, or 8.
>> Generally, when you supply two pointers, the optimizer will try to
>> convert them into ptr + offset (where the offset is counted in elements,
>> not bytes, so it corresponds to D's length).
>
> To all who replied and tested - color me convinced we should keep the
> current state of affairs. Thanks!
>
> Andrei
>

No problem. Also, TDPL uses ptr+ptr for its representation. Having gone back and looked at the chapter on arrays, I think that it makes for great figures and aids the comprehension of ideas. On the other hand, a lot of programming books, e.g. Numerical Recipes in C, have done a lot of harm over the years through people copying their sub-optimal code samples/snippets. So you may want to add a sentence regarding D's actual implementation to the clause under figure 4.1 on page 98.