primitive value overflow - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » primitive value overflow

Thread overview

primitive value overflow
May 16, 2013 luka8088
May 16, 2013 Andrej Mitrovic
May 16, 2013 Mr. Anonymous
May 16, 2013 luka8088
May 16, 2013 Jonathan M Davis
May 16, 2013 Mr. Anonymous
May 16, 2013 1100110
May 16, 2013 Andrej Mitrovic
May 16, 2013 luka8088
May 16, 2013 Marco Leise
May 17, 2013 Regan Heath
May 23, 2013 luka8088
May 23, 2013 Marco Leise
May 17, 2013 Ary Borenszweig
May 18, 2013 Marco Leise
May 18, 2013 Minas Mina
May 23, 2013 Peter Alexander
May 23, 2013 bearophile
May 29, 2013 luka8088
May 24, 2013 Marco Leise
May 25, 2013 Timothee Cour
May 27, 2013 Marco Leise

May 16, 2013

primitive value overflow

Posted by luka8088

luka8088

Hello everyone.

Today I ran into a interesting issue. I wrote

  auto offset = text1.length - text2.length;

and in case text2 was longer then text1 I got something around 4294967291.

So I opened an issue:
http://d.puremagic.com/issues/show_bug.cgi?id=10093

I know that there is a perfectly valid reason for this behavior, and that this behavior is not undefined, but it is unexpected, especially because unsigned is never mentioned in the code. One solution that comes to mind is changing length to signed, but that makes no sense because length is never negative.

After some thinking a though came that maybe such value overflow should be treated the same way as array overflow and checked by druntime with optional disabling in production code (like array bound checks)?

I think it would be very helpful to get an error for such mistake (that could very easily happen by accident), and on the other hand it can be disabled (like all other checks).

May 16, 2013

Re: primitive value overflow

Posted by Andrej Mitrovic
in reply to luka8088

Andrej Mitrovic

Posted in reply to luka8088

On Thursday, 16 May 2013 at 20:24:31 UTC, luka8088 wrote:
> Hello everyone.
>
> Today I ran into a interesting issue. I wrote
>
>   auto offset = text1.length - text2.length;

Yeah, I don't like these bugs either. In the meantime you can swap auto with 'sizediff_t' or 'ptrdiff_t', and then you can check if it's non-negative.

May 16, 2013

Re: primitive value overflow

Posted by Mr. Anonymous
in reply to Andrej Mitrovic

Mr. Anonymous

Posted in reply to Andrej Mitrovic

On Thursday, 16 May 2013 at 20:29:13 UTC, Andrej Mitrovic wrote:
> On Thursday, 16 May 2013 at 20:24:31 UTC, luka8088 wrote:
>> Hello everyone.
>>
>> Today I ran into a interesting issue. I wrote
>>
>>  auto offset = text1.length - text2.length;
>
> Yeah, I don't like these bugs either. In the meantime you can swap auto with 'sizediff_t' or 'ptrdiff_t', and then you can check if it's non-negative.

It's exactly the same as checking if(text1.length > text2.length).
But the idea of checking for integer overflows in debug builds is really nice.

P.S. I remember Microsoft had some serious bug because of an integer overflow, that allowed a remote machine to create a denial of service.

May 16, 2013

Re: primitive value overflow

Posted by luka8088
in reply to Andrej Mitrovic

luka8088

Posted in reply to Andrej Mitrovic

On 16.5.2013. 22:29, Andrej Mitrovic wrote:
> On Thursday, 16 May 2013 at 20:24:31 UTC, luka8088 wrote:
>> Hello everyone.
>>
>> Today I ran into a interesting issue. I wrote
>>
>> auto offset = text1.length - text2.length;
>
> Yeah, I don't like these bugs either. In the meantime you can swap auto
> with 'sizediff_t' or 'ptrdiff_t', and then you can check if it's
> non-negative.

Yes, thanks for the advice, I did something similar. =)

May 16, 2013

Re: primitive value overflow

Posted by luka8088
in reply to Mr. Anonymous

luka8088

Posted in reply to Mr. Anonymous

On 16.5.2013. 22:35, Mr. Anonymous wrote:
> On Thursday, 16 May 2013 at 20:29:13 UTC, Andrej Mitrovic wrote:
>> On Thursday, 16 May 2013 at 20:24:31 UTC, luka8088 wrote:
>>> Hello everyone.
>>>
>>> Today I ran into a interesting issue. I wrote
>>>
>>> auto offset = text1.length - text2.length;
>>
>> Yeah, I don't like these bugs either. In the meantime you can swap
>> auto with 'sizediff_t' or 'ptrdiff_t', and then you can check if it's
>> non-negative.
>
> It's exactly the same as checking if(text1.length > text2.length).
> But the idea of checking for integer overflows in debug builds is really
> nice.
>
> P.S. I remember Microsoft had some serious bug because of an integer
> overflow, that allowed a remote machine to create a denial of service.

I agree that it is exactly the same as checking if (text1.length > text2.length). And I don't think that this is an issues if you are aware of the fact that you are working with unsigned values. But in the code that I wrote there was no mentioning of unsigned so the possibility of that kind of issue never came to mind until I actually printed the values. And that is what I wanted to emphasize.

May 16, 2013

Re: primitive value overflow

Posted by Jonathan M Davis
in reply to luka8088

Jonathan M Davis

Posted in reply to luka8088

On Thursday, May 16, 2013 22:42:23 luka8088 wrote:
> I agree that it is exactly the same as checking if (text1.length > text2.length). And I don't think that this is an issues if you are aware of the fact that you are working with unsigned values. But in the code that I wrote there was no mentioning of unsigned so the possibility of that kind of issue never came to mind until I actually printed the values. And that is what I wanted to emphasize.

Well, I'm not sure what can be done about that. length is size_t, which is the unsigned integral value which matches the architecture (uint for 32-bit and ulong for 64-bit). AFAIK, the documentation is clear on this (though I haven't read it recently). If it's not, then it should be made clearer, but using size_t for length is pervasive in D as it is in C++, and if you know the standard types, you know what size_t is.

As for overflow checking, it's come up quite a few times, and Walter is completely against it. The hardware doesn't support it, and it would definitely be slow if it were added. The standard way to handle that if you want it is to create user-defined integral type which does the checks (though that obviously won't help you when you can't control the types that you're dealing with). But if you want to add checks in your code, you can always create a wrapper function for doing the subtraction. And if you wanted the checks to only be there in non-release mode, then you could even put the checks in a version(assert) block. So, you can add checks if you want them, but there's pretty much no way that how unsigned or overlflow are handled in the language is going to change.

- Jonathan M Davis

May 16, 2013

Re: primitive value overflow

Posted by Andrej Mitrovic

Andrej Mitrovic

On 5/16/13, Jonathan M Davis <jmdavisProg@gmx.com> wrote:
> The hardware doesn't support it, and it would definitely
> be slow if it were added.

We could add it in -debug mode perhaps. Ideally this sort of thing would be its own switch, but yes that goes towards switch proliferation, I know the story.

May 16, 2013

Re: primitive value overflow

Posted by Mr. Anonymous
in reply to Jonathan M Davis

Mr. Anonymous

Posted in reply to Jonathan M Davis

On Thursday, 16 May 2013 at 21:04:38 UTC, Jonathan M Davis wrote:
> On Thursday, May 16, 2013 22:42:23 luka8088 wrote:
>> I agree that it is exactly the same as checking if (text1.length >
>> text2.length). And I don't think that this is an issues if you are aware
>> of the fact that you are working with unsigned values. But in the code
>> that I wrote there was no mentioning of unsigned so the possibility of
>> that kind of issue never came to mind until I actually printed the
>> values. And that is what I wanted to emphasize.
>
> Well, I'm not sure what can be done about that. length is size_t, which is the
> unsigned integral value which matches the architecture (uint for 32-bit and
> ulong for 64-bit). AFAIK, the documentation is clear on this (though I haven't
> read it recently). If it's not, then it should be made clearer, but using
> size_t for length is pervasive in D as it is in C++, and if you know the
> standard types, you know what size_t is.
>
> As for overflow checking, it's come up quite a few times, and Walter is
> completely against it. The hardware doesn't support it, and it would definitely
> be slow if it were added. The standard way to handle that if you want it is to
> create user-defined integral type which does the checks (though that obviously
> won't help you when you can't control the types that you're dealing with). But
> if you want to add checks in your code, you can always create a wrapper
> function for doing the subtraction. And if you wanted the checks to only be
> there in non-release mode, then you could even put the checks in a
> version(assert) block. So, you can add checks if you want them, but there's
> pretty much no way that how unsigned or overlflow are handled in the language
> is going to change.
>
> - Jonathan M Davis

I agree with Walter if we're talking about production code, but I think it could be very helpful for debug builds.

P.S.
> The hardware doesn't support it
That's not completely true.
e.g. x86, while it doesn't throw an exception on an overflow, it does set a flag, which could be relatively cheaply checked.

May 16, 2013

Re: primitive value overflow

Posted by 1100110
in reply to Mr. Anonymous

1100110

Posted in reply to Mr. Anonymous

Attachments:

signature.asc (OpenPGP digital signature)

On 05/16/2013 04:17 PM, Mr. Anonymous wrote:
> On Thursday, 16 May 2013 at 21:04:38 UTC, Jonathan M Davis wrote:
>> On Thursday, May 16, 2013 22:42:23 luka8088 wrote:
>>> I agree that it is exactly the same as checking if (text1.length > text2.length). And I don't think that this is an issues if you are aware of the fact that you are working with unsigned values. But in the code that I wrote there was no mentioning of unsigned so the possibility of that kind of issue never came to mind until I actually printed the values. And that is what I wanted to emphasize.
>>
>> Well, I'm not sure what can be done about that. length is size_t,
>> which is the
>> unsigned integral value which matches the architecture (uint for
>> 32-bit and
>> ulong for 64-bit). AFAIK, the documentation is clear on this (though I
>> haven't
>> read it recently). If it's not, then it should be made clearer, but using
>> size_t for length is pervasive in D as it is in C++, and if you know the
>> standard types, you know what size_t is.
>>
>> As for overflow checking, it's come up quite a few times, and Walter is
>> completely against it. The hardware doesn't support it, and it would
>> definitely
>> be slow if it were added. The standard way to handle that if you want
>> it is to
>> create user-defined integral type which does the checks (though that
>> obviously
>> won't help you when you can't control the types that you're dealing
>> with). But
>> if you want to add checks in your code, you can always create a wrapper
>> function for doing the subtraction. And if you wanted the checks to
>> only be
>> there in non-release mode, then you could even put the checks in a
>> version(assert) block. So, you can add checks if you want them, but
>> there's
>> pretty much no way that how unsigned or overflow are handled in the
>> language
>> is going to change.
>>
>> - Jonathan M Davis
> 
> I agree with Walter if we're talking about production code, but I think it could be very helpful for debug builds.
> 
> P.S.
>> The hardware doesn't support it
> That's not completely true.
> e.g. x86, while it doesn't throw an exception on an overflow, it does
> set a flag, which could be relatively cheaply checked.


Perhaps a compiler flag would be nice, can't really argue performance if it's explicitly chosen.  Like -profile or -cov.

May 16, 2013

Re: primitive value overflow

Posted by Marco Leise
in reply to luka8088

Marco Leise

Posted in reply to luka8088

Am Thu, 16 May 2013 22:39:16 +0200
schrieb luka8088 <luka8088@owave.net>:

> On 16.5.2013. 22:29, Andrej Mitrovic wrote:
> > On Thursday, 16 May 2013 at 20:24:31 UTC, luka8088 wrote:
> >> Hello everyone.
> >>
> >> Today I ran into a interesting issue. I wrote
> >>
> >> auto offset = text1.length - text2.length;
> >
> > Yeah, I don't like these bugs either. In the meantime you can swap auto with 'sizediff_t' or 'ptrdiff_t', and then you can check if it's non-negative.
> 
> Yes, thanks for the advice, I did something similar. =)

Now that doesn't work if you deal with some text2 that is over
2 GiB longer than text1.
My approach is to see the close relation between any offset
from beginning or length to the machine memory model. So any
byte or char array in memory naturally has an unsigned length
typed by the architecture's word size (e.g. 32 or 64 bit).
With that in mind I _only_ ever subtract two values if I know
the difference will be positive. That is the case for
file_size - valid_offset for example.
I don't know the context for your line of code, but if text1
and text2 are passed in as parameters to a function, a
contract should verify that text1 is longer (or equal) than
text2.
Now feel free to tell me I'm wrong, but with the two lengths
being natural numbers or "countable", I claim that a negative
value for your offset variable would not have been usable
anyway. It is a result that makes no sense. So on the next line
you probably check "if (offset >= 0)" which is the same as
putting "if (text1.length >= text2.length)" one line earlier
to avoid running into the situation where you can end up with
an over- or underflow because the result range of size_t -
size_t fits neither size_t nor sizediff_t.

Say text1 is 0 bytes long and text2 is 3_000_000_000 bytes
long. Then -3_000_000_000 would be the result that cannot be
stored in any 32-bit type. And thus it is important to think
about possible input to your integer calculations and place
if-else-branches there (or in-contracts), especially when the
language accepts overflows silently.
But I'd really like to see the context of your code if it is
not a secret. :)

-- 
Marco

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation