Array start index (page 2) - D Programming Language Discussion Forum

August 03, 2015

Re: Array start index

Posted by Observer
in reply to DLearner

On Saturday, 1 August 2015 at 09:35:53 UTC, DLearner wrote:
> Does the D language set in stone that the first element of an array _has_ to be index zero?
> Wouldn't starting array elements at one avoid the common 'off-by-one' logic error, it does
> seem more natural to begin a count at 1.
>
> Actually, maybe even better to allow array definitions of form
> int foo[x:y];
> (y >= x) creating integer variables foo[x], foo[x+1],...,foo[y].

This experiment has already been run.  Perl used to support a $[ variable to set the array base.  After experience with the confusion and problems that it caused, it was finally deprecated and effectively removed from the language.  See the end paragraphs of http://perldoc.perl.org/perlvar.html and also http://search.cpan.org/~wolfsage/perl/ext/arybase/arybase.pm for more info.

August 03, 2015

Re: Array start index

Posted by DLearner
in reply to bachmeier

On Monday, 3 August 2015 at 13:45:01 UTC, bachmeier wrote:
> On Sunday, 2 August 2015 at 21:58:48 UTC, QAston wrote:
>
>> Adding 1-indexed arrays to the language fixes nothing. Just write your 1-indexed array type and if you enjoy using it, publish it as a library. Who knows, if demand is high it may even end up in phobos.
>
> Oh, I don't think that's a good idea. It's too confusing to have more than one method of indexing within the same language. You just have to do a thorough job of testing, as the possibility of errors is something you'll have to live with, given the different design choices of different languages.

Looks like 0-base is fixed, to avoid problems with existing code.

But nothing stops _adding_ to the language by allowing
int[x:y] foo to mean valid symbols are foo[x], foo[x+1],..., foo[y].
Plus rule that int[:y] means valid symbols are foo[1], foo[2],..., foo[y].

That way, 1-start achieved, with no conflict with existing code?

August 04, 2015

Re: Array start index

Posted by Jonathan M Davis
in reply to DLearner

On Monday, August 03, 2015 21:32:03 DLearner via Digitalmars-d-learn wrote:
> On Monday, 3 August 2015 at 13:45:01 UTC, bachmeier wrote:
> > On Sunday, 2 August 2015 at 21:58:48 UTC, QAston wrote:
> >
> >> Adding 1-indexed arrays to the language fixes nothing. Just write your 1-indexed array type and if you enjoy using it, publish it as a library. Who knows, if demand is high it may even end up in phobos.
> >
> > Oh, I don't think that's a good idea. It's too confusing to have more than one method of indexing within the same language. You just have to do a thorough job of testing, as the possibility of errors is something you'll have to live with, given the different design choices of different languages.
>
> Looks like 0-base is fixed, to avoid problems with existing code.
>
> But nothing stops _adding_ to the language by allowing
> int[x:y] foo to mean valid symbols are foo[x], foo[x+1],...,
> foo[y].
> Plus rule that int[:y] means valid symbols are foo[1],
> foo[2],..., foo[y].
>
> That way, 1-start achieved, with no conflict with existing code?

Almost all programming languages in heavy use at this point in time start indexing at 0. It would be highly confusing to almost all programmers out there to have 1-based indexing. In addition, having 0-based indexing actually makes checking against the end of arrays and other random-access ranges easier. You can just check against length without having to do any math. In general, I would expect 1-based indexing to _increase_ the number of off-by-one errors in code - both because 0-based indexing helps avoid such problems when dealing with the end of the array and, more importantly, because almost everyone expects 0-based indexing.
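
A minimal sketch of the point about checking against length (the array contents and variable names are made up for illustration): with 0-based, half-open ranges the length is itself the exclusive upper bound, and slice sizes are plain differences.

```d
unittest
{
    int[] a = [10, 20, 30];

    // With 0-based indexing the length is the exclusive upper bound,
    // so the loop condition needs no +1 or -1 adjustment.
    size_t visited = 0;
    for (size_t i = 0; i < a.length; i++)
        visited++;
    assert(visited == a.length);

    // Half-open slices: the element count is the difference of the bounds,
    // and splitting at any k covers every element exactly once.
    size_t k = 1;
    assert(a[k .. 3].length == 3 - k);
    assert(a[0 .. k] ~ a[k .. $] == a);
}
```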

You're really barking up the wrong tree if you're trying to get any support for 1-based indexing in D. I doubt that you will see much of anyone who thinks that it's even vaguely a good idea, and there's no way that Walter or Andrei (or probably anyone in the main dev team) is going to agree that it's even worth considering.

I think that the reality of the matter is that if you're going to do much programming - especially if you're going to be a professional programmer - you just need to get used to the idea that array indices start at 0. There are a few languages out there where they don't, but they are far from the norm.

- Jonathan M Davis

August 04, 2015

Re: Array start index

Posted by Marc Schütz
in reply to DLearner

On Saturday, 1 August 2015 at 09:35:53 UTC, DLearner wrote:
> Does the D language set in stone that the first element of an array _has_ to be index zero?
> Wouldn't starting array elements at one avoid the common 'off-by-one' logic error, it does
> seem more natural to begin a count at 1.

I, too, don't think this is a good idea in general, but I can see a few use-cases where 1-based indices may be more natural. It's easy to define a wrapper:

    struct OneBasedArray(T) {
        T[] _payload;

        alias _payload this;

        T opIndex(size_t index) {
            assert(index > 0);
            return _payload[index-1];
        }

        void opIndexAssign(U : T)(size_t index, auto ref U value) {
            assert(index > 0);
            _payload[index-1] = value;
        }
    }

    unittest {
        OneBasedArray!int arr;
        arr = [1,2,3];
        arr ~= 4;
        assert(arr.length == 4);
        assert(arr[1] == 1);
        assert(arr[2] == 2);
        assert(arr[3] == 3);
        assert(arr[4] == 4);
    }

Test with:

    rdmd -main -unittest xx.d

This can of course be easily extended to support bases other than one.
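
As a rough sketch of such an extension, the base could be made a template parameter. `BasedArray` and the name `base` are hypothetical and not part of the code above; the sketch also assumes the usual D operator signature in which `opIndexAssign` receives the assigned value first and the index second.

```d
// Hypothetical generalisation of the wrapper; names are illustrative only.
struct BasedArray(T, ptrdiff_t base)
{
    T[] _payload;

    alias _payload this;

    // Read access: arr[i] with i >= base.
    ref T opIndex(ptrdiff_t index)
    {
        assert(index >= base);
        return _payload[index - base];
    }

    // Write access: D passes the assigned value first, then the index.
    void opIndexAssign(T value, ptrdiff_t index)
    {
        assert(index >= base);
        _payload[index - base] = value;
    }
}

unittest
{
    BasedArray!(int, -2) arr;
    arr = [10, 20, 30];
    arr ~= 40;              // valid indices are now -2, -1, 0, 1
    arr[1] = 41;
    assert(arr.length == 4);
    assert(arr[-2] == 10);
    assert(arr[0] == 30);
    assert(arr[1] == 41);
}
```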

August 04, 2015

Re: Array start index

Posted by QAston
in reply to DLearner

On Monday, 3 August 2015 at 21:32:05 UTC, DLearner wrote:
> Looks like 0-base is fixed, to avoid problems with existing code.
>
> But nothing stops _adding_ to the language by allowing
> int[x:y] foo to mean valid symbols are foo[x], foo[x+1],..., foo[y].
> Plus rule that int[:y] means valid symbols are foo[1], foo[2],..., foo[y].
>
> That way, 1-start achieved, with no conflict with existing code?

There're quite a few things stopping this from being added to the language.

1. People will have to learn this new feature and its interaction with a gazillion other D features.

2. There would be a redundancy - core language will have 2 array types while one of them can be easily implemented using the other.

3. Devs will have to maintain it - as if they don't have enough things to fix atm.

Really, this is so simple to do as a library - just use opIndex, opSlice with a template struct.
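
A minimal sketch of what that might look like, assuming a 1-based index and slices that keep D's half-open convention shifted by one. `OneBased` is a hypothetical name, and the slicing semantics are just one possible design choice.

```d
// Hypothetical library type; not an existing Phobos facility.
struct OneBased(T)
{
    private T[] _payload;

    this(T[] data) { _payload = data; }

    size_t length() const { return _payload.length; }

    // arr[i] for i in 1 .. length (inclusive).
    ref T opIndex(size_t index)
    {
        assert(index >= 1 && index <= _payload.length);
        return _payload[index - 1];
    }

    // arr[lo .. hi] covers elements lo, lo + 1, ..., hi - 1.
    OneBased opSlice(size_t lo, size_t hi)
    {
        assert(lo >= 1 && lo <= hi && hi <= _payload.length + 1);
        return OneBased(_payload[lo - 1 .. hi - 1]);
    }
}

unittest
{
    auto arr = OneBased!int([10, 20, 30, 40]);
    assert(arr[1] == 10);
    assert(arr[4] == 40);

    auto mid = arr[2 .. 4];   // elements 2 and 3
    assert(mid.length == 2);
    assert(mid[1] == 20);
    assert(mid[2] == 30);
}
```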

As a general rule - start asking for language features only when things can't be done without them.

February 06, 2017

Re: Array start index

Posted by Bastiaan Veelo
in reply to Marc Schütz

On Tuesday, 4 August 2015 at 08:18:50 UTC, Marc Schütz wrote:
>         void opIndexAssign(U : T)(size_t index, auto ref U value) {

Careful here, you've got the arguments reversed. The unit test didn't detect this because it was ambiguous. This one isn't:

    unittest {
        OneBasedArray!int arr;
        arr = [1,2,3];
        arr ~= 14;
        assert(arr.length == 4);
        assert(arr[1] == 1);
        assert(arr[2] == 2);
        assert(arr[3] == 3);
        assert(arr[4] == 14);
    }

February 06, 2017

Re: Array start index

Posted by Bastiaan Veelo
in reply to Bastiaan Veelo

On Monday, 6 February 2017 at 14:26:35 UTC, Bastiaan Veelo wrote:
> The unit test didn't detect this because it was ambiguous.

Sorry for that misinformation. I should have said that opIndexAssign wasn't tested. Here is a better test.

    unittest {
        OneBasedArray!int arr;
        arr = [1,2,3];
        arr ~= 4;
        arr[4] = 14;
        assert(arr.length == 4);
        assert(arr[1] == 1);
        assert(arr[2] == 2);
        assert(arr[3] == 3);
        assert(arr[4] == 14);
    }

February 06, 2017

Re: Array start index

Posted by pineapple
in reply to Bastiaan Veelo

One reason for zero-based indexes that isn't "it's what we're all used to" is that if you used one-based indexes, you would be able to represent one fewer index than zero-based, since one of the representable values - zero - could no longer be used to represent any index.

Also, it's what we're all used to, and it makes perfect sense to a lot of us, and the only times in recent memory I've ever made off-by-one errors were when I was trying to use Lua and its one-based indexing.

February 06, 2017

Re: Array start index

Posted by Nemanja Boric
in reply to pineapple

On Monday, 6 February 2017 at 18:55:19 UTC, pineapple wrote:
> One reason for zero-based indexes that isn't "it's what we're all used to" is that if you used one-based indexes, you would be able to represent one fewer index than zero-based, since one of the representable values - zero - could no longer be used to represent any index.
>
> Also, it's what we're all used to, and it makes perfect sense to a lot of us, and the only times in recent memory I've ever made off-by-one errors were when I was trying to use Lua and its one-based indexing.

Related:

https://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html

February 06, 2017

Re: Array start index

Posted by Bastiaan Veelo
in reply to Marc Schütz

(There is an honest question in the end, please read on.)

All good reasons set aside, both in favour and against 0-based arrays, the only reason that is relevant to me right now is that we are seriously looking into translating close to a million lines of foreign code to D, from a language that supports arrays over arbitrary ranges of indices (positive and negative). Adapting this much program logic to a different array base is out of the question; this is an engineering application and lives are at stake. So let's not bring the base-arguments back into this sub-thread but focus on a performant solution.

Expanding on Marc's outset, I now have:

/*****************
 * A fixed-length array with an index that runs from $(D_PARAM first)
 * to $(D_PARAM last) inclusive.
 *
 * Indices are converted, which involves a small overhead.
 */
struct StaticArray(T, ptrdiff_t first, ptrdiff_t last) {
    T[last - first + 1] _payload;

    alias _payload this;

    // Support e = arr[5];
    ref T opIndex(ptrdiff_t index) {
        assert(index >= first);
        assert(index <= last);
        return _payload[index - first];
    }

    // Support arr[5] = e;
    void opIndexAssign(U : T)(auto ref U value, ptrdiff_t index) {
        assert(index >= first);
        assert(index <= last);
        _payload[index - first] = value;
    }

    // Support foreach(e; arr).
    int opApply(scope int delegate(ref T) dg)
    {
        int result = 0;

        for (int i = 0; i < _payload.length; i++)
        {
            result = dg(_payload[i]);
            if (result)
                break;
        }
        return result;
    }

    // Support foreach(i, e; arr).
    int opApply(scope int delegate(ptrdiff_t index, ref T) dg)
    {
        int result = 0;

        for (int i = 0; i < _payload.length; i++)
        {
            result = dg(i + first, _payload[i]);
            if (result)
                break;
        }
        return result;
    }

    // Write to binary file.
    void toFile(string fileName)
    {
        import std.stdio;
        auto f = File(fileName, "wb");
        if (f.tryLock)
        {
            f.rawWrite(_payload);
        }
    }
}

unittest {
    StaticArray!(int, -10, 10) arr;
    assert(arr.length == 21);

    foreach (ref e; arr)
        e = 42;
    assert(arr[-10] == 42);
    assert(arr[0]   == 42);
    assert(arr[10]  == 42);

    foreach (i, ref e; arr)
        e = cast(int) i;  // i is a ptrdiff_t; narrowing to int must be explicit on 64-bit targets
    assert(arr[-10] == -10);
    assert(arr[0]   ==   0);
    assert(arr[5]   ==   5);
    assert(arr[10]  ==  10);

    arr[5] = 15;
    assert(arr[5]   == 15);
}

//////////////

(The first and last indices probably don't need to be template arguments; they could possibly be immutable members of the struct. But that is not what worries me now.)

The thing is that a small overhead is paid in opIndex() and opIndexAssign() for converting the index (the other member functions are fine). I'd like to get rid of that overhead.

In "Numerical Recipes in C", section 1.2, Press et al. propose an easy solution using an offset pointer:

float b[4], *bb;
bb = b - 1;

Thus bb[1] through bb[4] all exist, no space is wasted nor is there a run-time overhead.

I have tried to adapt the internals of my struct to Press' approach, but it seems to me it is not that easy in D -- or I'm just not proficient enough. Can someone here show me how that could be done?

Thanks!
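
One way the offset-pointer idea might be carried over to D is sketched below. It is not a definitive answer: `OffsetArray` and its members are hypothetical names, the storage is heap-allocated rather than embedded in the struct (so that copying the struct does not leave the pointer dangling), and, as in the C idiom, the shifted pointer may point outside the allocation, which the language does not guarantee to be valid.

```d
// Hypothetical heap-backed variant; names are illustrative only.
struct OffsetArray(T)
{
    private T[] _payload;   // owns the storage and keeps it reachable for the GC
    private T* _origin;     // _payload.ptr - first, so _origin[first] is the first element
    private ptrdiff_t _first, _last;

    this(ptrdiff_t first, ptrdiff_t last)
    {
        assert(last >= first);
        _first = first;
        _last = last;
        _payload = new T[](cast(size_t)(last - first + 1));
        // Assumption: forming this out-of-block pointer is tolerated by the
        // target platform, exactly as in the Numerical Recipes idiom.
        _origin = _payload.ptr - first;
    }

    size_t length() const { return _payload.length; }

    // No subtraction at access time; the offset was applied once above.
    ref T opIndex(ptrdiff_t index)
    {
        assert(index >= _first && index <= _last);
        return _origin[index];
    }
}

unittest
{
    auto bb = OffsetArray!float(1, 4);
    bb[1] = 1.5f;
    bb[4] = 4.5f;
    assert(bb.length == 4);
    assert(bb[1] == 1.5f);
    assert(bb[4] == 4.5f);
}
```

The bounds asserts are compiled out in release builds, leaving a single pointer-indexed load or store per access.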

Copyright © 1999-2021 by the D Language Foundation