std.string.translate using initializing twice?
August 10, 2010
translate does this:
    bool[256] deltab; // this would make all values of deltab false as bool.init == false, right?

    deltab[] = false;

Isn't this just initializing all values of deltab to false twice..?

And a second question about my rewrite: "Is this more readable?"

Original:

string translate(string s, in string transtab, in string delchars)
in
{
    assert(transtab.length == 256);
}
body
{
    char[] r;
    int count;
    bool[256] deltab;

    deltab[] = false;
    foreach (char c; delchars)
    {
        deltab[c] = true;
    }

    count = 0;
    foreach (char c; s)
    {
        if (!deltab[c])
            count++;
        //printf("s[%d] = '%c', count = %d\n", i, s[i], count);
    }

    r = new char[count];
    count = 0;
    foreach (char c; s)
    {
        if (!deltab[c])
        {
            r[count] = transtab[c];
            count++;
        }
    }

    return assumeUnique(r);
}



"More readable?":


string translate(string s, in string transtab, in string delchars)
in
{
    assert(transtab.length == 256);
}
body
{
    // Mark characters to delete
    bool[256] deltab;
    foreach (char c; delchars)
        deltab[c] = true;

    // Count characters to translate
    int numToTranslate;
    foreach (char c; s)
    {
        if (!deltab[c])
            numToTranslate++;
    }

    char[] result = new char[numToTranslate];

    // Translate
    int translateIndex = 0;
    foreach (char c; s)
    {
        bool mustTranslate = !deltab[c];
        if (mustTranslate)
        {
            result[translateIndex] = transtab[c];
            translateIndex++;
        }
    }

    return assumeUnique(result);
}

August 10, 2010
On Monday, August 09, 2010 17:45:07 simendsjo wrote:
> translate does this:
>      bool[256] deltab; // this would make all values of deltab false as bool.init == false, right?
> 
>      deltab[] = false;
> 
> Isn't this just initializing all values of deltab to false twice..?

I believe that you are correct and that the array is getting set twice.

>  [snip]...

I confess that it's entirely irrational on my part given that D is smart enough that a post-increment where the temporary is not used should be just as efficient as a pre-increment (even in the face of operator overloading - unlike C++), but it always makes me cringe to see post-increments where a pre-increment would do...

- Jonathan M Davis
August 10, 2010
On 10.08.2010 02:59, Jonathan M Davis wrote:
> On Monday, August 09, 2010 17:45:07 simendsjo wrote:
>> translate does this:
>>       bool[256] deltab; // this would make all values of deltab false as bool.init == false, right?
>>
>>       deltab[] = false;
>>
>> Isn't this just initializing all values of deltab to false twice..?
>
> I believe that you are correct and that the array is getting set twice.
>
>>   [snip]...
>
> I confess that it's entirely irrational on my part given that D is smart enough
> that a post-increment where the temporary is not used should be just as efficient
> as a pre-increment (even in the face of operator overloading - unlike C++), but
> it always makes me cringe to see post-increments where a pre-increment would
> do...
>
> - Jonathan M Davis

Yeah. Don't remember when, don't remember where, but I too have read that preincrement is faster (PS: I don't know any assembler!).

As long as I don't use it in an expression, I always use post-increment as it shouldn't make a difference.
August 10, 2010
On Monday, August 09, 2010 18:03:48 simendsjo wrote:
> Yeah. Don't remember when, don't remember where, but I too have read that preincrement is faster (PS: I don't know any assembler!).
> 
> As long as I don't use it in an expression, I always use post-increment as it shouldn't make a difference.

Post-increment creates a temporary. It's really doing something like

T temp = i;
++i;
//use temp in the expression with i++

The temporary is useless and pointless if i++ is by itself rather than in an expression. In the case of primitives, the compiler knows enough to optimize out the temporary, but in the case of operator overloading in C++, because pre- and post-increment are overloaded separately, the compiler can't know for sure that it's safe to do the optimization, so it doesn't do it, and your code is less efficient. That's why code like

for (vector<int>::iterator iter = vec.begin(), end = vec.end();
     iter != end;
     iter++)
{
    ...
}

is so bad. You really need to use ++iter in that case. However, in D, this isn't a problem because pre-increment and post-increment are overloaded by one function and the compiler takes care of whether it should be pre or post when it's used. So, it should be able to do the optimization just fine. But I've programmed in C++ for so long (where it does matter), that I instinctively react negatively to post-increment where a pre-increment will do even in languages like Java (which doesn't have operator overloading) or D where it doesn't matter.
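
To make the C++ side of this concrete, here is a minimal sketch of the two separate overloads. The `Counter` type and the two helper functions are hypothetical names made up for illustration, not anything from a real library. The copy constructor bumps a static counter, which also makes the copy observable, so this shows what the post-increment overload asks for rather than what an optimizer might manage to remove for a simple inlined iterator.

```cpp
// Hypothetical wrapper type that counts how often it is copied.
struct Counter {
    int value = 0;
    static int copies;

    Counter() = default;
    Counter(const Counter& other) : value(other.value) { ++copies; }
    Counter& operator=(const Counter&) = default;

    // Pre-increment: modify in place and return a reference. No copy.
    Counter& operator++() {
        ++value;
        return *this;
    }

    // Post-increment: a separate overload that has to return the old
    // value, so it saves a copy before incrementing.
    Counter operator++(int) {
        Counter old = *this;  // the "T temp = i;" step
        ++*this;              // the "++i;" step
        return old;           // may or may not be copied again on return
    }
};

int Counter::copies = 0;

// Copies made by a stand-alone pre-increment statement.
int copiesFromPreIncrement() {
    Counter::copies = 0;
    Counter c;
    ++c;
    return Counter::copies;
}

// Copies made by a stand-alone post-increment statement whose
// result is thrown away.
int copiesFromPostIncrement() {
    Counter::copies = 0;
    Counter c;
    c++;
    return Counter::copies;
}
```

Calling `copiesFromPreIncrement()` gives 0, while `copiesFromPostIncrement()` gives at least 1 even though the returned value is never used (the exact number depends on whether the compiler elides the copy on return).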

I'd argue that everyone (at least everyone programming in C++) should just be in the habit of using pre-increment except in the cases where a post-increment is necessary (then you never have to worry about whether it's less efficient to use post-increment in a particular case), but for whatever reason, new programmers are pretty much always taught post-increment first, and so that's what most programmers are used to using.

In reality, it's becoming less relevant as newer languages are designed in such a manner that there is no difference in efficiency, but my natural reaction is still very much that post-increment by itself is evil.

- Jonathan M Davis