Thread overview
Length comparison
Dec 05, 2006
Bill Baxter
Dec 05, 2006
xs0
Dec 05, 2006
Lionello Lunesu
Dec 05, 2006
Don Clugston
Dec 05, 2006
Frits van Bommel
Dec 05, 2006
Don Clugston
Dec 05, 2006
Frits van Bommel
December 05, 2006
This bit me just now:

    if (-1 < somearray.length) {
    }

Ouch.  Took me a while to figure out that that bit of code was the problem.

Is there some way to make these kind of bugs less likely or easier to find?

--bb
December 05, 2006
Bill Baxter wrote:
> This bit me just now:
> 
>     if (-1 < somearray.length) {
>     }
> 
> Ouch.  Took me a while to figure out that that bit of code was the problem.
> 
> Is there some way to make these kind of bugs less likely or easier to find?

It should be made illegal :) In particular:
- comparing a negative constant with an unsigned variable
- comparing a too large constant with a signed variable (like 3_000_000_000 and something of type int)
- setting an unsigned var to a negative constant (uint a = -1)

For other cases, comparing signed with unsigned should produce a warning and, more importantly, work properly:

int a;
uint b;

a < b  should be  a < 0 || cast(uint)a < b
a > b  should be  a > 0 && cast(uint)a > b
etc.etc.

I'm quite sure someone will say that's much slower so shouldn't be done, but:
- you can do it properly and only compare variables of the same type in the first place
- you can trivially make it fast again by casting, with the positive side effect of explicitly stating whether you want to use signed or unsigned comparison
- I doubt signed/unsigned comparisons are common enough to even make any noticeable speed difference

Thoughts? :)


xs0
December 05, 2006
xs0 wrote:
> Bill Baxter wrote:
>> This bit me just now:
>>
>>     if (-1 < somearray.length) {
>>     }
>>
>> Ouch.  Took me a while to figure out that that bit of code was the problem.
>>
>> Is there some way to make these kind of bugs less likely or easier to find?
> 
> It should be made illegal :) In particular:
> - comparing a negative constant with an unsigned variable
> - comparing a too large constant with a signed variable (like 3_000_000_000 and something of type int)
> - setting an unsigned var to a negative constant (uint a = -1)
> 
> For other cases, comparing signed with unsigned should produce a warning and, more importantly, work properly:
> 
> int a;
> uint b;
> 
> a < b  should be  a < 0 || cast(uint)a < b
> a > b  should be  a > 0 && cast(uint)a > b
> etc.etc.

http://d.puremagic.com/issues/show_bug.cgi?id=259

Could you please add your remarks to that bugzilla issue as well? Especially the proper implementation for mixed comparison. I agree that the result should at least be correct, no matter* how slow.

L.

* of course, within limits :)
December 05, 2006
xs0 wrote:
> Bill Baxter wrote:
>> This bit me just now:
>>
>>     if (-1 < somearray.length) {
>>     }
>>
>> Ouch.  Took me a while to figure out that that bit of code was the problem.
>>
>> Is there some way to make these kind of bugs less likely or easier to find?
> 
> It should be made illegal :) In particular:
> - comparing a negative constant with an unsigned variable
> - comparing a too large constant with a signed variable (like 3_000_000_000 and something of type int)
> - setting an unsigned var to a negative constant (uint a = -1)
> 
> For other cases, comparing signed with unsigned should produce a warning and, more importantly, work properly:
> 
> int a;
> uint b;
> 
> a < b  should be  a < 0 || cast(uint)a < b
> a > b  should be  a > 0 && cast(uint)a > b
> etc.etc.
> 
> I'm quite sure someone will say that's much slower so shouldn't be done, but:
> - you can do it properly and only compare variables of the same type in the first place
> - you can trivially make it fast again by casting, with the positive side effect of explicitly stating whether you want to use signed or unsigned comparison

But then you have to explicitly specify the type. In your code above, suppose that later on, someone changes 'a' to 'long', and sets a = (1L+int.max)*10 , b=7. Now, the code compiles, but it tells you that a < b  !!!
You've introduced a bug.

> - I doubt signed/unsigned comparisons are common enough to even make any noticeable speed difference

I think the underlying problem is, there's no way to specify a 'positiveint' type, ie, a number in the range 0..int.max, which can therefore be cast to both int and uint without error.

So you have to declare such variables as either 'int' (implying that they can in theory be <0) or as 'uint' (implying that that they can in theory be >int.max). Neither choice is good. There's no way to "do it properly". It's a personal tradeoff which one you use.

And for this reason, comparing signed variables with unsigned variables is almost never an error (when people use 'uint', they almost always mean 'positive int'). Detecting mismatch comparisons with literals would be a good thing, though.
December 05, 2006
Don Clugston wrote:
> xs0 wrote:
<snip>
>> For other cases, comparing signed with unsigned should produce a warning and, more importantly, work properly:
>>
>> int a;
>> uint b;
>>
>> a < b  should be  a < 0 || cast(uint)a < b
>> a > b  should be  a > 0 && cast(uint)a > b
>> etc.etc.
>>
>> I'm quite sure someone will say that's much slower so shouldn't be done, but:
>> - you can do it properly and only compare variables of the same type in the first place
>> - you can trivially make it fast again by casting, with the positive side effect of explicitly stating whether you want to use signed or unsigned comparison
> 
> But then you have to explicitly specify the type. In your code above, suppose that later on, someone changes 'a' to 'long', and sets a = (1L+int.max)*10 , b=7. Now, the code compiles, but it tells you that a < b  !!!
> You've introduced a bug.

I think his point was that the compiler should perform this transformation automatically, i.e. compile "a < b" as if it said "(a < 0 || cast(uint)a < b)".
Extrapolating, if 'a' was changed to a long, the compiler should then compile it as if transformed as above, except with the cast changed to "cast(ulong)".
So yes, if done manually this introduces a bug, but if that just becomes the new meaning of comparing a signed and unsigned integer then it should work without a problem. I think this is a good idea.

>> - I doubt signed/unsigned comparisons are common enough to even make any noticeable speed difference
> 
> I think the underlying problem is, there's no way to specify a 'positiveint' type, ie, a number in the range 0..int.max, which can therefore be cast to both int and uint without error.

If opCast could be overloaded this might be possible with a struct, but it wouldn't be pretty (at least, not without also needing implicit casts and overloaded opAssign).
But AFAIK as it is, in the current language you're right that this can't be done.

> So you have to declare such variables as either 'int' (implying that they can in theory be <0) or as 'uint' (implying that that they can in theory be >int.max). Neither choice is good. There's no way to "do it properly". It's a personal tradeoff which one you use.
> 
> And for this reason, comparing signed variables with unsigned variables is almost never an error (when people use 'uint', they almost always mean 'positive int'). Detecting mismatch comparisons with literals would be a good thing, though.

Yes it would be a good thing if this was detected.
IMHO it would be an even better thing if the automatic transformation above would then be performed when such a mismatch is detected :).
December 05, 2006
Frits van Bommel wrote:
> Don Clugston wrote:
>> xs0 wrote:
> <snip>
>>> For other cases, comparing signed with unsigned should produce a warning and, more importantly, work properly:
>>>
>>> int a;
>>> uint b;
>>>
>>> a < b  should be  a < 0 || cast(uint)a < b
>>> a > b  should be  a > 0 && cast(uint)a > b
>>> etc.etc.
>>>
>>> I'm quite sure someone will say that's much slower so shouldn't be done, but:
>>> - you can do it properly and only compare variables of the same type in the first place
>>> - you can trivially make it fast again by casting, with the positive side effect of explicitly stating whether you want to use signed or unsigned comparison
>>
>> But then you have to explicitly specify the type. In your code above, suppose that later on, someone changes 'a' to 'long', and sets a = (1L+int.max)*10 , b=7. Now, the code compiles, but it tells you that a < b  !!!
>> You've introduced a bug.
> 
> I think his point was that the compiler should perform this transformation automatically, i.e. compile "a < b" as if it said "(a < 0 || cast(uint)a < b)".
> Extrapolating, if 'a' was changed to a long, the compiler should then compile it as if transformed as above, except with the cast changed to "cast(ulong)".
> So yes, if done manually this introduces a bug, but if that just becomes the new meaning of comparing a signed and unsigned integer then it should work without a problem. I think this is a good idea.
> 
>>> - I doubt signed/unsigned comparisons are common enough to even make any noticeable speed difference
>>
>> I think the underlying problem is, there's no way to specify a 'positiveint' type, ie, a number in the range 0..int.max, which can therefore be cast to both int and uint without error.
> 
> If opCast could be overloaded this might be possible with a struct, but it wouldn't be pretty (at least, not without also needing implicit casts and overloaded opAssign).
> But AFAIK as it is, in the current language you're right that this can't be done.
> 
>> So you have to declare such variables as either 'int' (implying that they can in theory be <0) or as 'uint' (implying that that they can in theory be >int.max). Neither choice is good. There's no way to "do it properly". It's a personal tradeoff which one you use.
>>
>> And for this reason, comparing signed variables with unsigned variables is almost never an error (when people use 'uint', they almost always mean 'positive int'). Detecting mismatch comparisons with literals would be a good thing, though.
> 
> Yes it would be a good thing if this was detected.
> IMHO it would be an even better thing if the automatic transformation above would then be performed when such a mismatch is detected :).

But if a mismatch was detected, would there be any way to request a fast comparison without introducing bugs?
December 05, 2006
Don Clugston wrote:
> Frits van Bommel wrote:
>> Don Clugston wrote:
<snip>
>>> And for this reason, comparing signed variables with unsigned variables is almost never an error (when people use 'uint', they almost always mean 'positive int'). Detecting mismatch comparisons with literals would be a good thing, though.
>>
>> Yes it would be a good thing if this was detected.
>> IMHO it would be an even better thing if the automatic transformation above would then be performed when such a mismatch is detected :).
> 
> But if a mismatch was detected, would there be any way to request a fast comparison without introducing bugs?

Depends on what you mean by fast and what you mean by bugs ;).

Hmm...
If "a < 0 || cast(uint)a < b" isn't fast enough for your liking, then probably the only way to make it faster would be to eliminate the "a < 0" part, which would require writing out "cast(uint)a < b" in full. That  might introduce bugs if the type of a was changed to something that can have values bigger than an uint can hold.
So yeah, that might be a problem.

The reality of it is of course that this change probably won't happen anyway, because of Walter's whole "if it looks like C, it behaves like C" philosophy. If he holds to that the best we can probably hope for is some way to express this that doesn't look like C. Maybe some new operators for these "safe comparison" operations...
Unless... What does the C standard say about comparing negative signed numbers to unsigned numbers? If the result is Undefined(TM) (or implementation-defined) then defining it in D would not be limited by this :).

Forgive me if this is all a bit too much stream-of-consciousness[1], but writing about things is a good way to think about them :).


[1]: Is that the right term? Not sure. Not a term I use a lot ;) (especially in English)