Length comparison - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » Length comparison

Thread overview

Length comparison
Dec 05, 2006 Bill Baxter
Dec 05, 2006 xs0
Dec 05, 2006 Lionello Lunesu
Dec 05, 2006 Don Clugston
Dec 05, 2006 Frits van Bommel
Dec 05, 2006 Don Clugston
Dec 05, 2006 Frits van Bommel

December 05, 2006

Length comparison

Posted by Bill Baxter

Bill Baxter

This bit me just now:

    if (-1 < somearray.length) {
    }

Ouch.  Took me a while to figure out that that bit of code was the problem.

Is there some way to make these kind of bugs less likely or easier to find?

--bb

December 05, 2006

Re: Length comparison

Posted by xs0
in reply to Bill Baxter

xs0

Posted in reply to Bill Baxter

Bill Baxter wrote:
> This bit me just now:
> 
>     if (-1 < somearray.length) {
>     }
> 
> Ouch.  Took me a while to figure out that that bit of code was the problem.
> 
> Is there some way to make these kind of bugs less likely or easier to find?

It should be made illegal :) In particular:
- comparing a negative constant with an unsigned variable
- comparing a too large constant with a signed variable (like 3_000_000_000 and something of type int)
- setting an unsigned var to a negative constant (uint a = -1)

For other cases, comparing signed with unsigned should produce a warning and, more importantly, work properly:

int a;
uint b;

a < b  should be  a < 0 || cast(uint)a < b
a > b  should be  a > 0 && cast(uint)a > b
etc.etc.

I'm quite sure someone will say that's much slower so shouldn't be done, but:
- you can do it properly and only compare variables of the same type in the first place
- you can trivially make it fast again by casting, with the positive side effect of explicitly stating whether you want to use signed or unsigned comparison
- I doubt signed/unsigned comparisons are common enough to even make any noticeable speed difference

Thoughts? :)


xs0

December 05, 2006

Re: Length comparison

Posted by Lionello Lunesu
in reply to xs0

Lionello Lunesu

Posted in reply to xs0

xs0 wrote:
> Bill Baxter wrote:
>> This bit me just now:
>>
>>     if (-1 < somearray.length) {
>>     }
>>
>> Ouch.  Took me a while to figure out that that bit of code was the problem.
>>
>> Is there some way to make these kind of bugs less likely or easier to find?
> 
> It should be made illegal :) In particular:
> - comparing a negative constant with an unsigned variable
> - comparing a too large constant with a signed variable (like 3_000_000_000 and something of type int)
> - setting an unsigned var to a negative constant (uint a = -1)
> 
> For other cases, comparing signed with unsigned should produce a warning and, more importantly, work properly:
> 
> int a;
> uint b;
> 
> a < b  should be  a < 0 || cast(uint)a < b
> a > b  should be  a > 0 && cast(uint)a > b
> etc.etc.

http://d.puremagic.com/issues/show_bug.cgi?id=259

Could you please add your remarks to that bugzilla issue as well? Especially the proper implementation for mixed comparison. I agree that the result should at least be correct, no matter* how slow.

L.

* of course, within limits :)

December 05, 2006

Re: Length comparison

Posted by Don Clugston
in reply to xs0

Don Clugston

Posted in reply to xs0

xs0 wrote:
> Bill Baxter wrote:
>> This bit me just now:
>>
>>     if (-1 < somearray.length) {
>>     }
>>
>> Ouch.  Took me a while to figure out that that bit of code was the problem.
>>
>> Is there some way to make these kind of bugs less likely or easier to find?
> 
> It should be made illegal :) In particular:
> - comparing a negative constant with an unsigned variable
> - comparing a too large constant with a signed variable (like 3_000_000_000 and something of type int)
> - setting an unsigned var to a negative constant (uint a = -1)
> 
> For other cases, comparing signed with unsigned should produce a warning and, more importantly, work properly:
> 
> int a;
> uint b;
> 
> a < b  should be  a < 0 || cast(uint)a < b
> a > b  should be  a > 0 && cast(uint)a > b
> etc.etc.
> 
> I'm quite sure someone will say that's much slower so shouldn't be done, but:
> - you can do it properly and only compare variables of the same type in the first place
> - you can trivially make it fast again by casting, with the positive side effect of explicitly stating whether you want to use signed or unsigned comparison

But then you have to explicitly specify the type. In your code above, suppose that later on, someone changes 'a' to 'long', and sets a = (1L+int.max)*10 , b=7. Now, the code compiles, but it tells you that a < b  !!!
You've introduced a bug.

> - I doubt signed/unsigned comparisons are common enough to even make any noticeable speed difference

I think the underlying problem is, there's no way to specify a 'positiveint' type, ie, a number in the range 0..int.max, which can therefore be cast to both int and uint without error.

So you have to declare such variables as either 'int' (implying that they can in theory be <0) or as 'uint' (implying that that they can in theory be >int.max). Neither choice is good. There's no way to "do it properly". It's a personal tradeoff which one you use.

And for this reason, comparing signed variables with unsigned variables is almost never an error (when people use 'uint', they almost always mean 'positive int'). Detecting mismatch comparisons with literals would be a good thing, though.

December 05, 2006

Re: Length comparison

Posted by Frits van Bommel
in reply to Don Clugston

Frits van Bommel

Posted in reply to Don Clugston

Don Clugston wrote:
> xs0 wrote:
<snip>
>> For other cases, comparing signed with unsigned should produce a warning and, more importantly, work properly:
>>
>> int a;
>> uint b;
>>
>> a < b  should be  a < 0 || cast(uint)a < b
>> a > b  should be  a > 0 && cast(uint)a > b
>> etc.etc.
>>
>> I'm quite sure someone will say that's much slower so shouldn't be done, but:
>> - you can do it properly and only compare variables of the same type in the first place
>> - you can trivially make it fast again by casting, with the positive side effect of explicitly stating whether you want to use signed or unsigned comparison
> 
> But then you have to explicitly specify the type. In your code above, suppose that later on, someone changes 'a' to 'long', and sets a = (1L+int.max)*10 , b=7. Now, the code compiles, but it tells you that a < b  !!!
> You've introduced a bug.

I think his point was that the compiler should perform this transformation automatically, i.e. compile "a < b" as if it said "(a < 0 || cast(uint)a < b)".
Extrapolating, if 'a' was changed to a long, the compiler should then compile it as if transformed as above, except with the cast changed to "cast(ulong)".
So yes, if done manually this introduces a bug, but if that just becomes the new meaning of comparing a signed and unsigned integer then it should work without a problem. I think this is a good idea.

>> - I doubt signed/unsigned comparisons are common enough to even make any noticeable speed difference
> 
> I think the underlying problem is, there's no way to specify a 'positiveint' type, ie, a number in the range 0..int.max, which can therefore be cast to both int and uint without error.

If opCast could be overloaded this might be possible with a struct, but it wouldn't be pretty (at least, not without also needing implicit casts and overloaded opAssign).
But AFAIK as it is, in the current language you're right that this can't be done.

> So you have to declare such variables as either 'int' (implying that they can in theory be <0) or as 'uint' (implying that that they can in theory be >int.max). Neither choice is good. There's no way to "do it properly". It's a personal tradeoff which one you use.
> 
> And for this reason, comparing signed variables with unsigned variables is almost never an error (when people use 'uint', they almost always mean 'positive int'). Detecting mismatch comparisons with literals would be a good thing, though.

Yes it would be a good thing if this was detected.
IMHO it would be an even better thing if the automatic transformation above would then be performed when such a mismatch is detected :).

December 05, 2006

Re: Length comparison

Posted by Don Clugston
in reply to Frits van Bommel

Don Clugston

Posted in reply to Frits van Bommel

Frits van Bommel wrote:
> Don Clugston wrote:
>> xs0 wrote:
> <snip>
>>> For other cases, comparing signed with unsigned should produce a warning and, more importantly, work properly:
>>>
>>> int a;
>>> uint b;
>>>
>>> a < b  should be  a < 0 || cast(uint)a < b
>>> a > b  should be  a > 0 && cast(uint)a > b
>>> etc.etc.
>>>
>>> I'm quite sure someone will say that's much slower so shouldn't be done, but:
>>> - you can do it properly and only compare variables of the same type in the first place
>>> - you can trivially make it fast again by casting, with the positive side effect of explicitly stating whether you want to use signed or unsigned comparison
>>
>> But then you have to explicitly specify the type. In your code above, suppose that later on, someone changes 'a' to 'long', and sets a = (1L+int.max)*10 , b=7. Now, the code compiles, but it tells you that a < b  !!!
>> You've introduced a bug.
> 
> I think his point was that the compiler should perform this transformation automatically, i.e. compile "a < b" as if it said "(a < 0 || cast(uint)a < b)".
> Extrapolating, if 'a' was changed to a long, the compiler should then compile it as if transformed as above, except with the cast changed to "cast(ulong)".
> So yes, if done manually this introduces a bug, but if that just becomes the new meaning of comparing a signed and unsigned integer then it should work without a problem. I think this is a good idea.
> 
>>> - I doubt signed/unsigned comparisons are common enough to even make any noticeable speed difference
>>
>> I think the underlying problem is, there's no way to specify a 'positiveint' type, ie, a number in the range 0..int.max, which can therefore be cast to both int and uint without error.
> 
> If opCast could be overloaded this might be possible with a struct, but it wouldn't be pretty (at least, not without also needing implicit casts and overloaded opAssign).
> But AFAIK as it is, in the current language you're right that this can't be done.
> 
>> So you have to declare such variables as either 'int' (implying that they can in theory be <0) or as 'uint' (implying that that they can in theory be >int.max). Neither choice is good. There's no way to "do it properly". It's a personal tradeoff which one you use.
>>
>> And for this reason, comparing signed variables with unsigned variables is almost never an error (when people use 'uint', they almost always mean 'positive int'). Detecting mismatch comparisons with literals would be a good thing, though.
> 
> Yes it would be a good thing if this was detected.
> IMHO it would be an even better thing if the automatic transformation above would then be performed when such a mismatch is detected :).

But if a mismatch was detected, would there be any way to request a fast comparison without introducing bugs?

December 05, 2006

Re: Length comparison

Posted by Frits van Bommel
in reply to Don Clugston

Frits van Bommel

Posted in reply to Don Clugston

Don Clugston wrote:
> Frits van Bommel wrote:
>> Don Clugston wrote:
<snip>
>>> And for this reason, comparing signed variables with unsigned variables is almost never an error (when people use 'uint', they almost always mean 'positive int'). Detecting mismatch comparisons with literals would be a good thing, though.
>>
>> Yes it would be a good thing if this was detected.
>> IMHO it would be an even better thing if the automatic transformation above would then be performed when such a mismatch is detected :).
> 
> But if a mismatch was detected, would there be any way to request a fast comparison without introducing bugs?

Depends on what you mean by fast and what you mean by bugs ;).

Hmm...
If "a < 0 || cast(uint)a < b" isn't fast enough for your liking, then probably the only way to make it faster would be to eliminate the "a < 0" part, which would require writing out "cast(uint)a < b" in full. That  might introduce bugs if the type of a was changed to something that can have values bigger than an uint can hold.
So yeah, that might be a problem.

The reality of it is of course that this change probably won't happen anyway, because of Walter's whole "if it looks like C, it behaves like C" philosophy. If he holds to that the best we can probably hope for is some way to express this that doesn't look like C. Maybe some new operators for these "safe comparison" operations...
Unless... What does the C standard say about comparing negative signed numbers to unsigned numbers? If the result is Undefined(TM) (or implementation-defined) then defining it in D would not be limited by this :).

Forgive me if this is all a bit too much stream-of-consciousness[1], but writing about things is a good way to think about them :).

[1]: Is that the right term? Not sure. Not a term I use a lot ;) (especially in English)

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation