February 06, 2017
On 02/06/2017 03:00 PM, Bastiaan Veelo wrote:

> In "Numerical Recipes in C", section 1.2, Press et al. propose an easy
> solution using an offset pointer:
>
> float b[4], *bb;
> bb = b - 1;
>
> Thus bb[1] through bb[4] all exist, no space is wasted nor is there a
> run-time overhead.
>
> I have tried to adapt the internals of my struct to Press' approach, but
> it seems to me it is not that easy in D -- or I'm just not proficient
> enough. Can someone here show me how that could be done?

struct StaticArray(T, ptrdiff_t first, ptrdiff_t last) {
    T[last - first + 1] _payload;

If the static array has to be a member like that (as opposed to a dynamic one), you need an extra member:

    T* _ptr;

Because structs cannot have default constructors, you need an init() function to establish the invariant:

    void init() {
        this._ptr = _payload.ptr - first;
    }

Of course, that relies on the fact that _payload.ptr - first is a valid address to hold in a pointer.

[OT]:
    // Commented-out as it seems dangerous because _payload does not
    // have the same semantics as StaticArray.
    //
    //  alias _payload this;

Which means, you would implement length() yourself:

    size_t length() {
        return _payload.length;
    }

Then you use _ptr when indexing:

    // Support e = arr[5];
    ref T opIndex(ptrdiff_t index) {
        assert(index >= first);
        assert(index <= last);
        return *(_ptr + index);
    }

Ali

February 07, 2017
On Monday, 6 February 2017 at 18:55:19 UTC, pineapple wrote:
> One reason for zero-based indexes that isn't "it's what we're all used to" is that if you used one-based indexes, you would be able to represent one fewer index than zero-based, since one of the representable values - zero - could no longer be used to represent any index.
>
> Also, it's what we're all used to, and it makes perfect sense to a lot of us, and the only times in recent memory I've ever made off-by-one errors were when I was trying to use Lua and its one-based indexing.

In the scientific programming community, languages like Matlab, Fortran, and R use one-based indexing for a very good reason. Zero-based indexing is a computer science thing that makes no sense to someone working with equations.
February 07, 2017
On Saturday, 1 August 2015 at 09:35:53 UTC, DLearner wrote:
> Does the D language set in stone that the first element of an array _has_ to be index zero?
> Wouldn't starting array elements at one avoid the common 'off-by-one' logic error, it does
> seem more natural to begin a count at 1.
>
> Actually, maybe even better to allow array definitions of form
> int foo[x:y];
> (y >= x) creating integer variables foo[x], foo[x+1],...,foo[y].
>
> I think the (very old) IBM PL/I language was like this.

There is a good metaphor in multi-story buildings. Americans number the first or ground floor of a high rise as floor #1.  In the UK, their first floor is what Americans would call the 2nd floor.

But then how can you trust a group of people who drive on the wrong side of the road :)




February 07, 2017
On Monday, 6 February 2017 at 23:42:55 UTC, Ali Çehreli wrote:
> Then you use _ptr when indexing:
>
>     // Support e = arr[5];
>     ref T opIndex(ptrdiff_t index) {
>         assert(index >= first);
>         assert(index <= last);
>         return *(_ptr + index);
>     }
>
> Ali

Thank you very much for your time. Sadly this gives me an access violation. The traceback doesn't seem right though, as it seems to jump to opIndexAssign from where opIndex is used. If I'm not mistaken, I have confirmed that _ptr is a valid address, see below. We do not need to take measures against the GC?

- Bastiaan.


On Windows:

rdmd -main -unittest -debug -g source\epcompat\array.d

object.Error@(0): Access Violation
----------------
0x00402057 in void epcompat.array.__unittestL81_1() at C:\SARC\Pascal2017\D\epcompat\source\epcompat\array.d(88)
0x0040465C in void epcompat.array.__modtest() at C:\SARC\Pascal2017\D\epcompat\source\epcompat\array.d(39)
0x0040A455 in int core.runtime.runModuleUnitTests().__foreachbody1(object.ModuleInfo*)
0x0040C007 in int object.ModuleInfo.opApply(scope int delegate(object.ModuleInfo*)).__lambda2(immutable(object.ModuleInfo*))
0x00406808 in _d_run_main
0x004046D4 in main at C:\SARC\Pascal2017\D\epcompat\source\epcompat\array.d(7)
0x004225DD in mainCRTStartup
0x779C62C4 in BaseThreadInitThunk
0x77B50FD9 in RtlSubscribeWnfStateChangeNotification
0x77B50FA4 in RtlSubscribeWnfStateChangeNotification

The complete source:

module epcompat.array;

// Test with rdmd -main -unittest -debug -g source\epcompat\array.d

alias StaticArray_offset StaticArray;

/*****************
 * A fixed-length array with an index that runs from $(D_PARAM first)
 * to $(D_PARAM last) inclusive.
 *
 * Implemented by means of an offset pointer.
 */
struct StaticArray_offset(T, ptrdiff_t first, ptrdiff_t last) {
    T[last - first + 1] _payload;
    T* _ptr;

    void init() {
        assert( first < cast(size_t)_payload.ptr);              // Address space underrun.
        assert(-first < size_t.max - cast(size_t)_payload.ptr); // Address space overrun.
        this._ptr = _payload.ptr - first;
    }

    size_t length() {
        return _payload.length;
    }

    // Support e = arr[5];
    ref T opIndex(ptrdiff_t index) {
        assert(index >= first);
        assert(index <= last);
        return *(_ptr + index);
    }

    // Support arr[5] = e;
    void opIndexAssign(U : T)(auto ref U value, ptrdiff_t index) {
        assert(index >= first);
        assert(index <= last);
        *(_ptr + index) = value;
    }   // Line 39

    // Support foreach(e; arr).
    int opApply(scope int delegate(ref T) dg)
    {
        int result = 0;

        for (int i = 0; i < _payload.length; i++)
        {
            result = dg(_payload[i]);
            if (result)
                break;
        }
        return result;
    }

    // Support foreach(i, e; arr).
    int opApply(scope int delegate(ptrdiff_t index, ref T) dg)
    {
        int result = 0;

        for (int i = first; i <= last; i++)
        {
            result = dg(i, *(_ptr + i));
            if (result)
                break;
        }
        return result;
    }

    // Write to binary file.
    void toFile(string fileName)
    {
        import std.stdio;
        auto f = File(fileName, "wb");
        if (f.tryLock)
        {
            f.rawWrite(_payload);
        }
    }
}

unittest {
    StaticArray!(int, -10, 10) arr;
    assert(arr.length == 21);

    foreach (ref e; arr)
        e = 42;

    assert(arr[-10] == 42); // Line 88
    assert(arr[0]   == 42);
    assert(arr[10]  == 42);

    foreach (i, ref e; arr)
        e = i;

    assert(arr[-10] == -10);
    assert(arr[0]   ==   0);
    assert(arr[5]   ==   5);
    assert(arr[10]  ==  10);

    arr[5] = 15;
    assert(arr[5]   == 15);
}

February 07, 2017
On 02/07/2017 02:11 AM, Bastiaan Veelo wrote:

>     void init() {
>         assert( first < cast(size_t)_payload.ptr);              //
> Address space underrun.
>         assert(-first < size_t.max - cast(size_t)_payload.ptr); //
> Address space overrun.
>         this._ptr = _payload.ptr - first;
>     }

You forgot to call that most important function. ;)

1) I don't understand the first assert there, which does not pass for me, so I commented it out.

2) May bad: init() is not a good name for a struct member, so it should be renamed:

    void initialize() {
        // assert( first < cast(size_t)_payload.ptr);              // Address space underrun.
        assert(-first < size_t.max - cast(size_t)_payload.ptr); // Address space overrun.
        this._ptr = _payload.ptr - first;
    }

3) Instead of having to remember to call it, let's introduce a function that does it for us:

auto makeStaticArray(T, ptrdiff_t first, ptrdiff_t last)() {
    auto s = StaticArray!(T, first, last)();
    s.initialize();
    return s;
}

unittest {
    // StaticArray!(int, -10, 10) arr;
    auto arr = makeStaticArray!(int, -10, 10);

>     foreach (i, ref e; arr)
>         e = i;

Unrelated: That line passes because you're building 32-bits. Here is the error I got:

  Error: cannot implicitly convert expression (i) of type long to int

You can cast it:

        e = cast(int)i;

or by

        import std.conv : to;
        e = i.to!int;

Ali

February 07, 2017
Pressed send too soon, before considering your GC question.

On 02/07/2017 02:11 AM, Bastiaan Veelo wrote:

> We do not need to take measures against the GC?

Of course we have to and the struct that I wrote is illegal because it is self-referencing through the _ptr member. (D has the right to move structs around, making my _ptr potentially pointing to an illegal address.)

This optimization cannot work if the array is a static array inside the same struct. It would work with a dynamic array but then it would probably be slower than applying the *(ptr+index) technique.

Ali

February 07, 2017
On Tuesday, 7 February 2017 at 20:28:30 UTC, Ali Çehreli wrote:
> You forgot to call that most important function. ;)

Hah of course. I assumed the name would give it some special meaning, like postblit.

> 1) I don't understand the first assert there, which does not pass for me, so I commented it out.

The intention is to check that (_payload.ptr - first) is larger than 0.

> 2) May bad: init() is not a good name for a struct member, so it should be renamed:
>
>     void initialize() {
>         // assert( first < cast(size_t)_payload.ptr);
>    // Address space underrun.
>         assert(-first < size_t.max - cast(size_t)_payload.ptr); // Address space overrun.
>         this._ptr = _payload.ptr - first;
>     }
>
> 3) Instead of having to remember to call it, let's introduce a function that does it for us:
>
> auto makeStaticArray(T, ptrdiff_t first, ptrdiff_t last)() {
>     auto s = StaticArray!(T, first, last)();
>     s.initialize();
>     return s;
> }

OK good.

> unittest {
>     // StaticArray!(int, -10, 10) arr;
>     auto arr = makeStaticArray!(int, -10, 10);
>
> >     foreach (i, ref e; arr)
> >         e = i;
>
> Unrelated: That line passes because you're building 32-bits. Here is the error I got:
>
>   Error: cannot implicitly convert expression (i) of type long to int
>
> You can cast it:
>
>         e = cast(int)i;
>
> or by
>
>         import std.conv : to;
>         e = i.to!int;

Thanks a lot for your illustrative answers, including the next one!

Bastiaan.
February 07, 2017
On Tuesday, 7 February 2017 at 20:33:35 UTC, Ali Çehreli wrote:
> On 02/07/2017 02:11 AM, Bastiaan Veelo wrote:
>
> > We do not need to take measures against the GC?
>
> Of course we have to and the struct that I wrote is illegal because it is self-referencing through the _ptr member. (D has the right to move structs around, making my _ptr potentially pointing to an illegal address.)

Gosh am I glad I brought that up, and for people like you hanging out in the learn group. Thanks for pointing this out!

> This optimization cannot work if the array is a static array inside the same struct. It would work with a dynamic array but then it would probably be slower than applying the *(ptr+index) technique.

You mean slower than the _payload[index - first] technique? Is that because of heap memory versus stack memory?

Am I correct that the reason that a dynamic array would work is because it is allocated independently from the struct, meaning that its address stays the same even if the struct is moved? We could .reserve a dynamic array in initialize() to prevent it being resized.

Thanks again.
Bastiaan.
February 07, 2017
On 02/07/2017 02:04 PM, Bastiaan Veelo wrote:

>> This optimization cannot work if the array is a static array inside
>> the same struct. It would work with a dynamic array but then it would
>> probably be slower than applying the *(ptr+index) technique.
>
> You mean slower than the _payload[index - first] technique? Is that
> because of heap memory versus stack memory?

Mandatory disclaimer: We can't be sure without testing.

Not exactly because stack versus heap because the whole object might be sitting in heap memory anyway:

alias S = StaticArray!(/* ... */);

    S[] objects;
    objects ~= S(/* ... */);

So, all of those are on the heap.

It's more about having everything at hand, near each other, close in memory. If the member is a static array, then when we have an S[], all members are there to be operated on. If the array in dynamic, then an S[] has indirections, reaching out to the heap, which may involve long-latency memory reads.

    foreach (s; objects) {
        s[42];    // May have a cache miss and read from memory
    }

In the static array case, the entire body of s is already on the cache.

On the other hand, when the objects are large, then few of those can fit in the cache. etc. One needs to profile to see what works better.

> Am I correct that the reason that a dynamic array would work is because
> it is allocated independently from the struct, meaning that its address
> stays the same even if the struct is moved?

Yes, that would be a requirement to prevent the self-reference.

> We could .reserve a dynamic
> array in initialize() to prevent it being resized.

Makes sense.

Another solution that came to my mind: You can keep the self-referencing solution but make sure that the objects are not moved after initialize():

    S[] objects;
    objects.reserve(100);
    objects.each!(o => o.initialize());
    // The objects will not move if you're careful

> Thanks again.
> Bastiaan.

Ali

February 08, 2017
Wrapup: I am going to go for the original approach of index conversion, and leaving the offset-pointer approach for what it is.

Reasons:
1) uncertain efficiency gain/loss,
2) theoretically it may fail,
3) .sizeof does not include the payload,
4) analysis of the assembler generated by our reference compiler (Prospero) shows that it also uses index conversion. (Interestingly, it does so at compile time, when possible).

Thanks for many interesting insights.

Bastiaan.
1 2 3
Next ›   Last »