Thread overview | ||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
May 16, 2013 primitive value overflow | ||||
---|---|---|---|---|
| ||||
Hello everyone. Today I ran into a interesting issue. I wrote auto offset = text1.length - text2.length; and in case text2 was longer then text1 I got something around 4294967291. So I opened an issue: http://d.puremagic.com/issues/show_bug.cgi?id=10093 I know that there is a perfectly valid reason for this behavior, and that this behavior is not undefined, but it is unexpected, especially because unsigned is never mentioned in the code. One solution that comes to mind is changing length to signed, but that makes no sense because length is never negative. After some thinking a though came that maybe such value overflow should be treated the same way as array overflow and checked by druntime with optional disabling in production code (like array bound checks)? I think it would be very helpful to get an error for such mistake (that could very easily happen by accident), and on the other hand it can be disabled (like all other checks). |
May 16, 2013 Re: primitive value overflow | ||||
---|---|---|---|---|
| ||||
Posted in reply to luka8088 | On Thursday, 16 May 2013 at 20:24:31 UTC, luka8088 wrote:
> Hello everyone.
>
> Today I ran into a interesting issue. I wrote
>
> auto offset = text1.length - text2.length;
Yeah, I don't like these bugs either. In the meantime you can swap auto with 'sizediff_t' or 'ptrdiff_t', and then you can check if it's non-negative.
|
May 16, 2013 Re: primitive value overflow | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrej Mitrovic | On Thursday, 16 May 2013 at 20:29:13 UTC, Andrej Mitrovic wrote:
> On Thursday, 16 May 2013 at 20:24:31 UTC, luka8088 wrote:
>> Hello everyone.
>>
>> Today I ran into a interesting issue. I wrote
>>
>> auto offset = text1.length - text2.length;
>
> Yeah, I don't like these bugs either. In the meantime you can swap auto with 'sizediff_t' or 'ptrdiff_t', and then you can check if it's non-negative.
It's exactly the same as checking if(text1.length > text2.length).
But the idea of checking for integer overflows in debug builds is really nice.
P.S. I remember Microsoft had some serious bug because of an integer overflow, that allowed a remote machine to create a denial of service.
|
May 16, 2013 Re: primitive value overflow | ||||
---|---|---|---|---|
| ||||
Posted in reply to Andrej Mitrovic | On 16.5.2013. 22:29, Andrej Mitrovic wrote:
> On Thursday, 16 May 2013 at 20:24:31 UTC, luka8088 wrote:
>> Hello everyone.
>>
>> Today I ran into a interesting issue. I wrote
>>
>> auto offset = text1.length - text2.length;
>
> Yeah, I don't like these bugs either. In the meantime you can swap auto
> with 'sizediff_t' or 'ptrdiff_t', and then you can check if it's
> non-negative.
Yes, thanks for the advice, I did something similar. =)
|
May 16, 2013 Re: primitive value overflow | ||||
---|---|---|---|---|
| ||||
Posted in reply to Mr. Anonymous | On 16.5.2013. 22:35, Mr. Anonymous wrote:
> On Thursday, 16 May 2013 at 20:29:13 UTC, Andrej Mitrovic wrote:
>> On Thursday, 16 May 2013 at 20:24:31 UTC, luka8088 wrote:
>>> Hello everyone.
>>>
>>> Today I ran into a interesting issue. I wrote
>>>
>>> auto offset = text1.length - text2.length;
>>
>> Yeah, I don't like these bugs either. In the meantime you can swap
>> auto with 'sizediff_t' or 'ptrdiff_t', and then you can check if it's
>> non-negative.
>
> It's exactly the same as checking if(text1.length > text2.length).
> But the idea of checking for integer overflows in debug builds is really
> nice.
>
> P.S. I remember Microsoft had some serious bug because of an integer
> overflow, that allowed a remote machine to create a denial of service.
I agree that it is exactly the same as checking if (text1.length > text2.length). And I don't think that this is an issues if you are aware of the fact that you are working with unsigned values. But in the code that I wrote there was no mentioning of unsigned so the possibility of that kind of issue never came to mind until I actually printed the values. And that is what I wanted to emphasize.
|
May 16, 2013 Re: primitive value overflow | ||||
---|---|---|---|---|
| ||||
Posted in reply to luka8088 | On Thursday, May 16, 2013 22:42:23 luka8088 wrote:
> I agree that it is exactly the same as checking if (text1.length > text2.length). And I don't think that this is an issues if you are aware of the fact that you are working with unsigned values. But in the code that I wrote there was no mentioning of unsigned so the possibility of that kind of issue never came to mind until I actually printed the values. And that is what I wanted to emphasize.
Well, I'm not sure what can be done about that. length is size_t, which is the unsigned integral value which matches the architecture (uint for 32-bit and ulong for 64-bit). AFAIK, the documentation is clear on this (though I haven't read it recently). If it's not, then it should be made clearer, but using size_t for length is pervasive in D as it is in C++, and if you know the standard types, you know what size_t is.
As for overflow checking, it's come up quite a few times, and Walter is completely against it. The hardware doesn't support it, and it would definitely be slow if it were added. The standard way to handle that if you want it is to create user-defined integral type which does the checks (though that obviously won't help you when you can't control the types that you're dealing with). But if you want to add checks in your code, you can always create a wrapper function for doing the subtraction. And if you wanted the checks to only be there in non-release mode, then you could even put the checks in a version(assert) block. So, you can add checks if you want them, but there's pretty much no way that how unsigned or overlflow are handled in the language is going to change.
- Jonathan M Davis
|
May 16, 2013 Re: primitive value overflow | ||||
---|---|---|---|---|
| ||||
On 5/16/13, Jonathan M Davis <jmdavisProg@gmx.com> wrote:
> The hardware doesn't support it, and it would definitely
> be slow if it were added.
We could add it in -debug mode perhaps. Ideally this sort of thing would be its own switch, but yes that goes towards switch proliferation, I know the story.
|
May 16, 2013 Re: primitive value overflow | ||||
---|---|---|---|---|
| ||||
Posted in reply to Jonathan M Davis | On Thursday, 16 May 2013 at 21:04:38 UTC, Jonathan M Davis wrote: > On Thursday, May 16, 2013 22:42:23 luka8088 wrote: >> I agree that it is exactly the same as checking if (text1.length > >> text2.length). And I don't think that this is an issues if you are aware >> of the fact that you are working with unsigned values. But in the code >> that I wrote there was no mentioning of unsigned so the possibility of >> that kind of issue never came to mind until I actually printed the >> values. And that is what I wanted to emphasize. > > Well, I'm not sure what can be done about that. length is size_t, which is the > unsigned integral value which matches the architecture (uint for 32-bit and > ulong for 64-bit). AFAIK, the documentation is clear on this (though I haven't > read it recently). If it's not, then it should be made clearer, but using > size_t for length is pervasive in D as it is in C++, and if you know the > standard types, you know what size_t is. > > As for overflow checking, it's come up quite a few times, and Walter is > completely against it. The hardware doesn't support it, and it would definitely > be slow if it were added. The standard way to handle that if you want it is to > create user-defined integral type which does the checks (though that obviously > won't help you when you can't control the types that you're dealing with). But > if you want to add checks in your code, you can always create a wrapper > function for doing the subtraction. And if you wanted the checks to only be > there in non-release mode, then you could even put the checks in a > version(assert) block. So, you can add checks if you want them, but there's > pretty much no way that how unsigned or overlflow are handled in the language > is going to change. > > - Jonathan M Davis I agree with Walter if we're talking about production code, but I think it could be very helpful for debug builds. P.S. > The hardware doesn't support it That's not completely true. e.g. x86, while it doesn't throw an exception on an overflow, it does set a flag, which could be relatively cheaply checked. |
May 16, 2013 Re: primitive value overflow | ||||
---|---|---|---|---|
| ||||
Posted in reply to Mr. Anonymous Attachments:
| On 05/16/2013 04:17 PM, Mr. Anonymous wrote:
> On Thursday, 16 May 2013 at 21:04:38 UTC, Jonathan M Davis wrote:
>> On Thursday, May 16, 2013 22:42:23 luka8088 wrote:
>>> I agree that it is exactly the same as checking if (text1.length > text2.length). And I don't think that this is an issues if you are aware of the fact that you are working with unsigned values. But in the code that I wrote there was no mentioning of unsigned so the possibility of that kind of issue never came to mind until I actually printed the values. And that is what I wanted to emphasize.
>>
>> Well, I'm not sure what can be done about that. length is size_t,
>> which is the
>> unsigned integral value which matches the architecture (uint for
>> 32-bit and
>> ulong for 64-bit). AFAIK, the documentation is clear on this (though I
>> haven't
>> read it recently). If it's not, then it should be made clearer, but using
>> size_t for length is pervasive in D as it is in C++, and if you know the
>> standard types, you know what size_t is.
>>
>> As for overflow checking, it's come up quite a few times, and Walter is
>> completely against it. The hardware doesn't support it, and it would
>> definitely
>> be slow if it were added. The standard way to handle that if you want
>> it is to
>> create user-defined integral type which does the checks (though that
>> obviously
>> won't help you when you can't control the types that you're dealing
>> with). But
>> if you want to add checks in your code, you can always create a wrapper
>> function for doing the subtraction. And if you wanted the checks to
>> only be
>> there in non-release mode, then you could even put the checks in a
>> version(assert) block. So, you can add checks if you want them, but
>> there's
>> pretty much no way that how unsigned or overflow are handled in the
>> language
>> is going to change.
>>
>> - Jonathan M Davis
>
> I agree with Walter if we're talking about production code, but I think it could be very helpful for debug builds.
>
> P.S.
>> The hardware doesn't support it
> That's not completely true.
> e.g. x86, while it doesn't throw an exception on an overflow, it does
> set a flag, which could be relatively cheaply checked.
Perhaps a compiler flag would be nice, can't really argue performance if it's explicitly chosen. Like -profile or -cov.
|
May 16, 2013 Re: primitive value overflow | ||||
---|---|---|---|---|
| ||||
Posted in reply to luka8088 | Am Thu, 16 May 2013 22:39:16 +0200 schrieb luka8088 <luka8088@owave.net>: > On 16.5.2013. 22:29, Andrej Mitrovic wrote: > > On Thursday, 16 May 2013 at 20:24:31 UTC, luka8088 wrote: > >> Hello everyone. > >> > >> Today I ran into a interesting issue. I wrote > >> > >> auto offset = text1.length - text2.length; > > > > Yeah, I don't like these bugs either. In the meantime you can swap auto with 'sizediff_t' or 'ptrdiff_t', and then you can check if it's non-negative. > > Yes, thanks for the advice, I did something similar. =) Now that doesn't work if you deal with some text2 that is over 2 GiB longer than text1. My approach is to see the close relation between any offset from beginning or length to the machine memory model. So any byte or char array in memory naturally has an unsigned length typed by the architecture's word size (e.g. 32 or 64 bit). With that in mind I _only_ ever subtract two values if I know the difference will be positive. That is the case for file_size - valid_offset for example. I don't know the context for your line of code, but if text1 and text2 are passed in as parameters to a function, a contract should verify that text1 is longer (or equal) than text2. Now feel free to tell me I'm wrong, but with the two lengths being natural numbers or "countable", I claim that a negative value for your offset variable would not have been usable anyway. It is a result that makes no sense. So on the next line you probably check "if (offset >= 0)" which is the same as putting "if (text1.length >= text2.length)" one line earlier to avoid running into the situation where you can end up with an over- or underflow because the result range of size_t - size_t fits neither size_t nor sizediff_t. Say text1 is 0 bytes long and text2 is 3_000_000_000 bytes long. Then -3_000_000_000 would be the result that cannot be stored in any 32-bit type. And thus it is important to think about possible input to your integer calculations and place if-else-branches there (or in-contracts), especially when the language accepts overflows silently. But I'd really like to see the context of your code if it is not a secret. :) -- Marco |
Copyright © 1999-2021 by the D Language Foundation