Treating the abusive unsigned syndrome (page 10)

Derek Parnell wrote:
> On Fri, 28 Nov 2008 17:09:25 +0100, Don wrote:
>
>> It's close, but how can code such as:
>>
>> if (x.length - y.length < 100) ...
>>
>> be correct in the presence of length > 2GB?
>
> It could be transformed by the compiler into more something like ...
>
> if ((x.length <= y.length) || ((x.length - y.length) < 100)) ...

Then it'd have different behavior from
----
auto diff = x.length - y.length;
if (diff < 100) ...
----

This seems like a *bad* thing...
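To make the objection concrete, here is a minimal sketch of the three forms discussed above. The function names are hypothetical and exist only for illustration; the lengths are passed as plain size_t values rather than taken from real arrays.

```d
import std.stdio;

// Hypothetical helper: the condition exactly as written in the thread.
bool plainCheck(size_t xLen, size_t yLen)
{
    // Unsigned subtraction wraps around when xLen < yLen.
    return xLen - yLen < 100;
}

// Hypothetical helper: the rewritten condition proposed above.
bool rewrittenCheck(size_t xLen, size_t yLen)
{
    return (xLen <= yLen) || ((xLen - yLen) < 100);
}

// Hypothetical helper: the two-step form with an intermediate variable.
bool twoStepCheck(size_t xLen, size_t yLen)
{
    auto diff = xLen - yLen; // diff is inferred as size_t
    return diff < 100;
}

void main()
{
    size_t xLen = 3, yLen = 10;
    writeln(plainCheck(xLen, yLen));     // false: 3 - 10 wraps to a huge value
    writeln(rewrittenCheck(xLen, yLen)); // true: short-circuits on xLen <= yLen
    writeln(twoStepCheck(xLen, yLen));   // false: same wrap-around as plainCheck
}
```

If the compiler silently applied the rewrite only to the one-line form, the first and third functions would stop agreeing, which is the inconsistency being pointed out.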

On Sat, 29 Nov 2008 01:17:27 +0100, Frits van Bommel wrote:

> Then it'd have different behavior from
> ----
> auto diff = x.length - y.length;
> if (diff < 100) ...
> ----
>
> This seems like a *bad* thing...

I see the problem a little differently. To me, "x.length - y.length" is ambiguous and thus meaningless. The ambiguity is this: are you after the difference between two values, or are you after the value required to add to x.length to get to y.length? These are not necessarily the same thing.

The difference is always positive, as in "the difference between the length of X and the length of Y is 4". The answer tells us the difference between two lengths but not, of course, which is the smaller.

So it all depends on what you are trying to find out. And note that the difference is not a length, because it is not associated with any specific array.

So having looked at it like this, I'm now inclined to consider that the 'diff' being declared here should be a signed type and, if possible, have more bits than '.length'.

-- 
Derek Parnell
Melbourne, Australia
skype: derek.j.parnell
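The two readings of "x.length - y.length" can be sketched as two separate functions. Both names are hypothetical. The signed version assumes 32-bit lengths (uint) so that long has enough extra bits to hold every possible result; with 64-bit lengths there is no wider built-in signed type to fall back on.

```d
import std.stdio;

// Hypothetical helper: the always-positive difference between two lengths.
size_t lengthDifference(size_t a, size_t b)
{
    return a > b ? a - b : b - a;
}

// Hypothetical helper: the value to add to `start` to reach `target`.
// Assumes 32-bit lengths, so widening to long cannot overflow.
long signedOffset(uint start, uint target)
{
    return cast(long) target - cast(long) start;
}

void main()
{
    writeln(lengthDifference(3, 7)); // 4
    writeln(lengthDifference(7, 3)); // 4
    writeln(signedOffset(7, 3));     // -4
    writeln(signedOffset(3, 7));     // 4
}
```

The first result says nothing about which length is smaller; the second keeps the sign but needs a type wider than the lengths themselves.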

December 01, 2008

Re: Treating the abusive unsigned syndrome

Posted by Fawzi Mohamed
in reply to Don

On 2008-11-28 17:44:39 +0100, Don <nospam@nospam.com> said:

> Andrei Alexandrescu wrote:
>> Don wrote:
>>> Andrei Alexandrescu wrote:
>>>> (I lost track of quotes, so I yanked them all beyond Don's message.)
>>>> 
>>>> Don wrote:
>>>>> The problem with that, is that you're then forcing the 'unsigned is a natural' interpretation when it may be erroneous.
>>>>> 
>>>>> uint.max - 10 is a uint.
>>>>> 
>>>>> It's an interesting case, because int = u1 - u2 is definitely incorrect when u1 > int.max.
>>>>> 
>>>>> uint = u1 - u2 may be incorrect when u1 < u2, _if you think of unsigned as a positive number_.
>>>>> But, if you think of it as a natural modulo 2^32, uint = u1-u2 is always correct, since that's what's happening mathematically.
[...]
>>>> 
>>> Any subtraction of two lengths has a possible range of
>>>  -int.max .. uint.max
>>> which is quite problematic (and the root cause of the problems, I guess).
>>> And unfortunately I think code is riddled with subtraction of lengths.
>> 
>> Code may be riddled with subtraction of lengths, but seems to be working with today's rule that the result of that subtraction is unsigned. So definitely we're not introducing new problems.
> 
> Yes. I think much existing code would fail with sizes over 2GB, though. But it's not any worse.

I found a couple of instances where addresses were compared by simply computing a-b, instead of something like ((a<b)?-1:((a==b)?0:1)), so yes, this is a pitfall that does occur.

Note that normally the subtraction of lengths is OK (because normally one is interested in the result and a>b); it is when it is used as a quick way to introduce an ordering (i.e. as a comparison) that it becomes problematic.
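A minimal sketch of that pitfall, using hypothetical function names and plain size_t values in place of real addresses:

```d
import std.stdio;

// Hypothetical helper: the "quick" ordering by subtraction.
// The unsigned result is truncated to int, so the sign can come out wrong.
int cmpBySubtraction(size_t a, size_t b)
{
    return cast(int)(a - b);
}

// Hypothetical helper: the explicit three-way comparison.
int cmpThreeWay(size_t a, size_t b)
{
    return (a < b) ? -1 : ((a == b) ? 0 : 1);
}

void main()
{
    size_t a = 0;
    size_t b = cast(size_t) int.max + 2; // just past the 2GB mark

    writeln(cmpBySubtraction(a, b) > 0); // true: claims a > b, which is wrong
    writeln(cmpThreeWay(a, b));          // -1: a < b, as expected
}
```

For small values that are close together both functions agree on the sign, which is presumably why the shortcut goes unnoticed until sizes cross 2GB.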

By the way, the solution for going beyond 2GB is clearly to use size_t, as I think is already done (at least in Tango).

Fawzi
