Jump to page: 1 2 3
Thread overview
primitive value overflow
May 16, 2013
luka8088
May 16, 2013
Andrej Mitrovic
May 16, 2013
Mr. Anonymous
May 16, 2013
luka8088
May 16, 2013
Jonathan M Davis
May 16, 2013
Mr. Anonymous
May 16, 2013
1100110
May 16, 2013
Andrej Mitrovic
May 16, 2013
luka8088
May 16, 2013
Marco Leise
May 17, 2013
Regan Heath
May 23, 2013
luka8088
May 23, 2013
Marco Leise
May 17, 2013
Ary Borenszweig
May 18, 2013
Marco Leise
May 18, 2013
Minas Mina
May 23, 2013
Peter Alexander
May 23, 2013
bearophile
May 29, 2013
luka8088
May 24, 2013
Marco Leise
May 25, 2013
Timothee Cour
May 27, 2013
Marco Leise
May 16, 2013
Hello everyone.

Today I ran into a interesting issue. I wrote

  auto offset = text1.length - text2.length;

and in case text2 was longer then text1 I got something around 4294967291.

So I opened an issue:
http://d.puremagic.com/issues/show_bug.cgi?id=10093

I know that there is a perfectly valid reason for this behavior, and that this behavior is not undefined, but it is unexpected, especially because unsigned is never mentioned in the code. One solution that comes to mind is changing length to signed, but that makes no sense because length is never negative.

After some thinking a though came that maybe such value overflow should be treated the same way as array overflow and checked by druntime with optional disabling in production code (like array bound checks)?

I think it would be very helpful to get an error for such mistake (that could very easily happen by accident), and on the other hand it can be disabled (like all other checks).
May 16, 2013
On Thursday, 16 May 2013 at 20:24:31 UTC, luka8088 wrote:
> Hello everyone.
>
> Today I ran into a interesting issue. I wrote
>
>   auto offset = text1.length - text2.length;

Yeah, I don't like these bugs either. In the meantime you can swap auto with 'sizediff_t' or 'ptrdiff_t', and then you can check if it's non-negative.
May 16, 2013
On Thursday, 16 May 2013 at 20:29:13 UTC, Andrej Mitrovic wrote:
> On Thursday, 16 May 2013 at 20:24:31 UTC, luka8088 wrote:
>> Hello everyone.
>>
>> Today I ran into a interesting issue. I wrote
>>
>>  auto offset = text1.length - text2.length;
>
> Yeah, I don't like these bugs either. In the meantime you can swap auto with 'sizediff_t' or 'ptrdiff_t', and then you can check if it's non-negative.

It's exactly the same as checking if(text1.length > text2.length).
But the idea of checking for integer overflows in debug builds is really nice.

P.S. I remember Microsoft had some serious bug because of an integer overflow, that allowed a remote machine to create a denial of service.
May 16, 2013
On 16.5.2013. 22:29, Andrej Mitrovic wrote:
> On Thursday, 16 May 2013 at 20:24:31 UTC, luka8088 wrote:
>> Hello everyone.
>>
>> Today I ran into a interesting issue. I wrote
>>
>> auto offset = text1.length - text2.length;
>
> Yeah, I don't like these bugs either. In the meantime you can swap auto
> with 'sizediff_t' or 'ptrdiff_t', and then you can check if it's
> non-negative.

Yes, thanks for the advice, I did something similar. =)

May 16, 2013
On 16.5.2013. 22:35, Mr. Anonymous wrote:
> On Thursday, 16 May 2013 at 20:29:13 UTC, Andrej Mitrovic wrote:
>> On Thursday, 16 May 2013 at 20:24:31 UTC, luka8088 wrote:
>>> Hello everyone.
>>>
>>> Today I ran into a interesting issue. I wrote
>>>
>>> auto offset = text1.length - text2.length;
>>
>> Yeah, I don't like these bugs either. In the meantime you can swap
>> auto with 'sizediff_t' or 'ptrdiff_t', and then you can check if it's
>> non-negative.
>
> It's exactly the same as checking if(text1.length > text2.length).
> But the idea of checking for integer overflows in debug builds is really
> nice.
>
> P.S. I remember Microsoft had some serious bug because of an integer
> overflow, that allowed a remote machine to create a denial of service.


I agree that it is exactly the same as checking if (text1.length > text2.length). And I don't think that this is an issues if you are aware of the fact that you are working with unsigned values. But in the code that I wrote there was no mentioning of unsigned so the possibility of that kind of issue never came to mind until I actually printed the values. And that is what I wanted to emphasize.
May 16, 2013
On Thursday, May 16, 2013 22:42:23 luka8088 wrote:
> I agree that it is exactly the same as checking if (text1.length > text2.length). And I don't think that this is an issues if you are aware of the fact that you are working with unsigned values. But in the code that I wrote there was no mentioning of unsigned so the possibility of that kind of issue never came to mind until I actually printed the values. And that is what I wanted to emphasize.

Well, I'm not sure what can be done about that. length is size_t, which is the unsigned integral value which matches the architecture (uint for 32-bit and ulong for 64-bit). AFAIK, the documentation is clear on this (though I haven't read it recently). If it's not, then it should be made clearer, but using size_t for length is pervasive in D as it is in C++, and if you know the standard types, you know what size_t is.

As for overflow checking, it's come up quite a few times, and Walter is completely against it. The hardware doesn't support it, and it would definitely be slow if it were added. The standard way to handle that if you want it is to create user-defined integral type which does the checks (though that obviously won't help you when you can't control the types that you're dealing with). But if you want to add checks in your code, you can always create a wrapper function for doing the subtraction. And if you wanted the checks to only be there in non-release mode, then you could even put the checks in a version(assert) block. So, you can add checks if you want them, but there's pretty much no way that how unsigned or overlflow are handled in the language is going to change.

- Jonathan M Davis
May 16, 2013
On 5/16/13, Jonathan M Davis <jmdavisProg@gmx.com> wrote:
> The hardware doesn't support it, and it would definitely
> be slow if it were added.

We could add it in -debug mode perhaps. Ideally this sort of thing would be its own switch, but yes that goes towards switch proliferation, I know the story.
May 16, 2013
On Thursday, 16 May 2013 at 21:04:38 UTC, Jonathan M Davis wrote:
> On Thursday, May 16, 2013 22:42:23 luka8088 wrote:
>> I agree that it is exactly the same as checking if (text1.length >
>> text2.length). And I don't think that this is an issues if you are aware
>> of the fact that you are working with unsigned values. But in the code
>> that I wrote there was no mentioning of unsigned so the possibility of
>> that kind of issue never came to mind until I actually printed the
>> values. And that is what I wanted to emphasize.
>
> Well, I'm not sure what can be done about that. length is size_t, which is the
> unsigned integral value which matches the architecture (uint for 32-bit and
> ulong for 64-bit). AFAIK, the documentation is clear on this (though I haven't
> read it recently). If it's not, then it should be made clearer, but using
> size_t for length is pervasive in D as it is in C++, and if you know the
> standard types, you know what size_t is.
>
> As for overflow checking, it's come up quite a few times, and Walter is
> completely against it. The hardware doesn't support it, and it would definitely
> be slow if it were added. The standard way to handle that if you want it is to
> create user-defined integral type which does the checks (though that obviously
> won't help you when you can't control the types that you're dealing with). But
> if you want to add checks in your code, you can always create a wrapper
> function for doing the subtraction. And if you wanted the checks to only be
> there in non-release mode, then you could even put the checks in a
> version(assert) block. So, you can add checks if you want them, but there's
> pretty much no way that how unsigned or overlflow are handled in the language
> is going to change.
>
> - Jonathan M Davis

I agree with Walter if we're talking about production code, but I think it could be very helpful for debug builds.

P.S.
> The hardware doesn't support it
That's not completely true.
e.g. x86, while it doesn't throw an exception on an overflow, it does set a flag, which could be relatively cheaply checked.
May 16, 2013
On 05/16/2013 04:17 PM, Mr. Anonymous wrote:
> On Thursday, 16 May 2013 at 21:04:38 UTC, Jonathan M Davis wrote:
>> On Thursday, May 16, 2013 22:42:23 luka8088 wrote:
>>> I agree that it is exactly the same as checking if (text1.length > text2.length). And I don't think that this is an issues if you are aware of the fact that you are working with unsigned values. But in the code that I wrote there was no mentioning of unsigned so the possibility of that kind of issue never came to mind until I actually printed the values. And that is what I wanted to emphasize.
>>
>> Well, I'm not sure what can be done about that. length is size_t,
>> which is the
>> unsigned integral value which matches the architecture (uint for
>> 32-bit and
>> ulong for 64-bit). AFAIK, the documentation is clear on this (though I
>> haven't
>> read it recently). If it's not, then it should be made clearer, but using
>> size_t for length is pervasive in D as it is in C++, and if you know the
>> standard types, you know what size_t is.
>>
>> As for overflow checking, it's come up quite a few times, and Walter is
>> completely against it. The hardware doesn't support it, and it would
>> definitely
>> be slow if it were added. The standard way to handle that if you want
>> it is to
>> create user-defined integral type which does the checks (though that
>> obviously
>> won't help you when you can't control the types that you're dealing
>> with). But
>> if you want to add checks in your code, you can always create a wrapper
>> function for doing the subtraction. And if you wanted the checks to
>> only be
>> there in non-release mode, then you could even put the checks in a
>> version(assert) block. So, you can add checks if you want them, but
>> there's
>> pretty much no way that how unsigned or overflow are handled in the
>> language
>> is going to change.
>>
>> - Jonathan M Davis
> 
> I agree with Walter if we're talking about production code, but I think it could be very helpful for debug builds.
> 
> P.S.
>> The hardware doesn't support it
> That's not completely true.
> e.g. x86, while it doesn't throw an exception on an overflow, it does
> set a flag, which could be relatively cheaply checked.


Perhaps a compiler flag would be nice, can't really argue performance if it's explicitly chosen.  Like -profile or -cov.



May 16, 2013
Am Thu, 16 May 2013 22:39:16 +0200
schrieb luka8088 <luka8088@owave.net>:

> On 16.5.2013. 22:29, Andrej Mitrovic wrote:
> > On Thursday, 16 May 2013 at 20:24:31 UTC, luka8088 wrote:
> >> Hello everyone.
> >>
> >> Today I ran into a interesting issue. I wrote
> >>
> >> auto offset = text1.length - text2.length;
> >
> > Yeah, I don't like these bugs either. In the meantime you can swap auto with 'sizediff_t' or 'ptrdiff_t', and then you can check if it's non-negative.
> 
> Yes, thanks for the advice, I did something similar. =)

Now that doesn't work if you deal with some text2 that is over
2 GiB longer than text1.
My approach is to see the close relation between any offset
from beginning or length to the machine memory model. So any
byte or char array in memory naturally has an unsigned length
typed by the architecture's word size (e.g. 32 or 64 bit).
With that in mind I _only_ ever subtract two values if I know
the difference will be positive. That is the case for
file_size - valid_offset for example.
I don't know the context for your line of code, but if text1
and text2 are passed in as parameters to a function, a
contract should verify that text1 is longer (or equal) than
text2.
Now feel free to tell me I'm wrong, but with the two lengths
being natural numbers or "countable", I claim that a negative
value for your offset variable would not have been usable
anyway. It is a result that makes no sense. So on the next line
you probably check "if (offset >= 0)" which is the same as
putting "if (text1.length >= text2.length)" one line earlier
to avoid running into the situation where you can end up with
an over- or underflow because the result range of size_t -
size_t fits neither size_t nor sizediff_t.

Say text1 is 0 bytes long and text2 is 3_000_000_000 bytes
long. Then -3_000_000_000 would be the result that cannot be
stored in any 32-bit type. And thus it is important to think
about possible input to your integer calculations and place
if-else-branches there (or in-contracts), especially when the
language accepts overflows silently.
But I'd really like to see the context of your code if it is
not a secret. :)

-- 
Marco

« First   ‹ Prev
1 2 3