May 17, 2013
On Thu, 16 May 2013 23:23:20 +0100, Marco Leise <Marco.Leise@gmx.de> wrote:

> Am Thu, 16 May 2013 22:39:16 +0200
> schrieb luka8088 <luka8088@owave.net>:
>
>> On 16.5.2013. 22:29, Andrej Mitrovic wrote:
>> > On Thursday, 16 May 2013 at 20:24:31 UTC, luka8088 wrote:
>> >> Hello everyone.
>> >>
>> >> Today I ran into an interesting issue. I wrote
>> >>
>> >> auto offset = text1.length - text2.length;
>> >
>> > Yeah, I don't like these bugs either. In the meantime you can swap  
>> auto
>> > with 'sizediff_t' or 'ptrdiff_t', and then you can check if it's
>> > non-negative.
>>
>> Yes, thanks for the advice, I did something similar. =)
>
> Now that doesn't work if you deal with some text2 that is over
> 2 GiB longer than text1.

This is not a problem I worry about in the general case.  I can honestly say I have never had to write any code which had to handle strings that long; in fact, the longest string length would probably have fit in a ushort, TBH.

This is why, for better or worse, I cast all C strlen() results from size_t to int as a matter of course.  Sure, a length cannot be negative, but for all practical purposes it will not be larger than a signed int either.  I have been doing this for ages now and have had exactly 0 bugs as a result, versus the few I have had using unsigned types and subtraction.

Of course, were I coding something that was intended to specifically handle super sized strings, it would be a different story.
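For reference, the pitfall quoted above and the sizediff_t workaround can be sketched like this (wrappedOffset/signedOffset are made-up helper names; the signed version is only valid when the true difference fits in a signed word):

```d
// .length is size_t (unsigned), so subtracting a longer length wraps around
size_t wrappedOffset(string text1, string text2)
{
    return text1.length - text2.length; // huge value when text2 is longer
}

// the workaround from upthread: store the result in the signed sizediff_t
// and check the sign afterwards, instead of trusting an unsigned result
sizediff_t signedOffset(string text1, string text2)
{
    sizediff_t offset = text1.length - text2.length;
    return offset;
}
```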

R

-- 
Using Opera's revolutionary email client: http://www.opera.com/mail/
May 17, 2013
On 5/16/13 5:24 PM, luka8088 wrote:
> Hello everyone.
>
> Today I ran into an interesting issue. I wrote
>
>    auto offset = text1.length - text2.length;
>
> and in case text2 was longer than text1 I got something around 4294967291.
>
> So I opened an issue:
> http://d.puremagic.com/issues/show_bug.cgi?id=10093
>
> I know that there is a perfectly valid reason for this behavior, and
> that this behavior is not undefined, but it is unexpected, especially
> because unsigned is never mentioned in the code. One solution that comes
> to mind is changing length to signed, but that makes no sense because
> length is never negative.

It makes sense if you don't want to have these subtle bugs in your programming language -.-

May 18, 2013
Am Fri, 17 May 2013 14:43:08 -0300
schrieb Ary Borenszweig <ary@esperanto.org.ar>:

> On 5/16/13 5:24 PM, luka8088 wrote:
> > Hello everyone.
> >
> > Today I ran into an interesting issue. I wrote
> >
> >    auto offset = text1.length - text2.length;
> >
> > and in case text2 was longer than text1 I got something around 4294967291.
> >
> > So I opened an issue: http://d.puremagic.com/issues/show_bug.cgi?id=10093
> >
> > I know that there is a perfectly valid reason for this behavior, and that this behavior is not undefined, but it is unexpected, especially because unsigned is never mentioned in the code. One solution that comes to mind is changing length to signed, but that makes no sense because length is never negative.
> 
> It makes sense if you don't want to have these subtle bugs in your programming language -.-

Just saying: the subtraction of two integers creates a result range that doesn't fit in ANY integer of the same size. If you want to be safe, disallow subtraction -.-
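Concretely: with 32-bit operands the true difference lies in [-(2^32 - 1), 2^32 - 1], which needs 33 bits, so only widening before subtracting recovers it. A minimal sketch (widenedDiff is a made-up name):

```d
// widen both operands to 64 bits *before* subtracting; the plain 32-bit
// subtraction a - b would wrap modulo 2^32 and lose the sign
long widenedDiff(uint a, uint b)
{
    return cast(long)a - cast(long)b;
}
```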

-- 
Marco

May 18, 2013
I agree that checks for overflow should exist in debug builds (and not exist in release builds).
May 23, 2013
On 17.5.2013. 0:23, Marco Leise wrote:
> Am Thu, 16 May 2013 22:39:16 +0200
> schrieb luka8088 <luka8088@owave.net>:
>
>> On 16.5.2013. 22:29, Andrej Mitrovic wrote:
>>> On Thursday, 16 May 2013 at 20:24:31 UTC, luka8088 wrote:
>>>> Hello everyone.
>>>>
>>>> Today I ran into an interesting issue. I wrote
>>>>
>>>> auto offset = text1.length - text2.length;
>>>
>>> Yeah, I don't like these bugs either. In the meantime you can swap auto
>>> with 'sizediff_t' or 'ptrdiff_t', and then you can check if it's
>>> non-negative.
>>
>> Yes, thanks for the advice, I did something similar. =)
>
> Now that doesn't work if you deal with some text2 that is over
> 2 GiB longer than text1.
> My approach is to see the close relation between any offset
> from beginning or length to the machine memory model. So any
> byte or char array in memory naturally has an unsigned length
> typed by the architecture's word size (e.g. 32 or 64 bit).
> With that in mind I _only_ ever subtract two values if I know
> the difference will be positive. That is the case for
> file_size - valid_offset for example.
> I don't know the context for your line of code, but if text1
> and text2 are passed in as parameters to a function, a
> contract should verify that text1 is longer than (or equal to)
> text2.
> Now feel free to tell me I'm wrong, but with the two lengths
> being natural numbers or "countable", I claim that a negative
> value for your offset variable would not have been usable
> anyway. It is a result that makes no sense. So on the next line
> you probably check "if (offset >= 0)" which is the same as
> putting "if (text1.length >= text2.length)" one line earlier
> to avoid running into the situation where you can end up with
> an over- or underflow because the result range of size_t -
> size_t fits neither size_t nor sizediff_t.
>
> Say text1 is 0 bytes long and text2 is 3_000_000_000 bytes
> long. Then -3_000_000_000 would be the result that cannot be
> stored in any 32-bit type. And thus it is important to think
> about possible input to your integer calculations and place
> if-else-branches there (or in-contracts), especially when the
> language accepts overflows silently.
> But I'd really like to see the context of your code if it is
> not a secret. :)
>

I understand perfectly the issue that you are pointing out, but that is not the real issue here. I know how computer arithmetic works; understanding is not the problem. The real issue is that at the time of writing, unsigned was never mentioned in the code, and it never came to my mind that it could be the cause (until I printed the actual value). So in reality I made a mistake, just as I sometimes make a typo, and I fix such mistakes (as I did this one) by debugging.

What I want to point out is that this kind of mistake can be flagged to the programmer in debug mode (saving programmer time) by adding a runtime check. The only real benefit would be shorter debug time, and the only real tradeoff would be slower execution in debug mode. Nothing else.
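The kind of debug-only check proposed here can already be emulated by hand today; a sketch (checkedSub is a made-up helper name, not anything from Phobos):

```d
// asserts on unsigned underflow; with -release the assert compiles
// away and this becomes an ordinary, unchecked subtraction
size_t checkedSub(size_t a, size_t b)
{
    assert(a >= b, "unsigned underflow in subtraction");
    return a - b;
}
```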

May 23, 2013
Am Thu, 23 May 2013 19:55:36 +0200
schrieb luka8088 <luka8088@owave.net>:

> I understand perfectly the issue that you are pointing out, but that is not the real issue here. I know how computer arithmetic works; understanding is not the problem. The real issue is that at the time of writing, unsigned was never mentioned in the code, and it never came to my mind that it could be the cause (until I printed the actual value). So in reality I made a mistake, just as I sometimes make a typo, and I fix such mistakes (as I did this one) by debugging.
> 
> What I want to point out is that this kind of mistake can be flagged to the programmer in debug mode (saving programmer time) by adding a runtime check. The only real benefit would be shorter debug time, and the only real tradeoff would be slower execution in debug mode. Nothing else.

OK, I remember that from Delphi:
http://docs.embarcadero.com/products/rad_studio/delphiAndcpp2009/HelpUpdate2/EN/html/devcommon/compdirsoverflowchecking_xml.html
It is a good idea, and probably better than some Checked!int
construct. Overflows are only wanted in a few places, so it
makes sense to have this check enabled globally and to disable
it locally for some code.

-- 
Marco

May 23, 2013
On Saturday, 18 May 2013 at 20:29:57 UTC, Minas Mina wrote:
> I agree that checks for overflow should exist in debug builds (and not exist in release builds).

What about code that relies on overflow? It's well-defined behaviour, so it should be expected that people rely on it (I certainly do sometimes).
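Hashing is one common case that relies on that well-defined wrap-around; a minimal FNV-1a sketch (the constants are the published 32-bit FNV parameters):

```d
// 32-bit FNV-1a: correctness depends on the multiply wrapping modulo 2^32
uint fnv1a(const(ubyte)[] data)
{
    uint h = 2_166_136_261u;    // FNV-1a offset basis
    foreach (b; data)
    {
        h ^= b;
        h *= 16_777_619u;       // FNV prime; this multiply is meant to wrap
    }
    return h;
}
```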
May 23, 2013
Peter Alexander:

> What about code that relies on overflow? It's well-defined behaviour, so it should be expected that people rely on it (I certainly do sometimes)

Do you rely on signed or unsigned overflow?

My opinions on this topic have changed a few times.

A modern systems language should offer the programmer both integral types for the rare situations where overflow or wrap-around is expected or acceptable, and other "default" integral types to be used in all other situations, where overflow or wrap-around is unexpected and undesired. The implementation should then offer ways to optionally perform run-time tests on the second group of integrals.

A very good system language should also offer various means to statically verify the bounds of a certain percentage of values and expression results, to reduce the amount of run-time tests needed (here things like "Liquid Types" help).

D currently doesn't have such safe built-in types, and it doesn't offer means to create such efficient types in library code. I think such means should be provided:
http://d.puremagic.com/issues/show_bug.cgi?id=9850
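For illustration only, here is a naive (and certainly not efficient) library-level sketch of such a checked type, handling just subtraction of unsigned values; `Checked` is a hypothetical name here, not an existing Phobos type:

```d
// wraps an unsigned integer and refuses to let subtraction wrap silently
struct Checked(T) if (__traits(isUnsigned, T))
{
    T value;

    Checked opBinary(string op : "-")(Checked rhs)
    {
        assert(value >= rhs.value, "unsigned underflow in checked subtraction");
        return Checked(cast(T)(value - rhs.value));
    }
}
```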

Bye,
bearophile
May 24, 2013
Am Fri, 24 May 2013 01:35:42 +0200
schrieb "Peter Alexander" <peter.alexander.au@gmail.com>:

> What about code that relies on overflow? It's well-defined behaviour, so it should be expected that people rely on it (I certainly do sometimes)

See my post about Delphi's approach. It could be disabled in D
using pragma(), UDAs or even scopes.

-- 
Marco

May 25, 2013
A)
Requiring a construct such as Checked!int is generally too complicated, as
it requires a lot of code changes from the user. It may be useful in certain
cases, but overflow bugs will crop up in unexpected places.

B)
To help finding such bugs, introduce a special version identifier for
checking arithmetic overflow (eg -version=check_arithmetic_overflow) (no
new compiler flag syntax, just a built in version identifier).

C)
It shouldn't be implied by -debug flag as it may slow down debug code
significantly, making debug code too slow to be useful. Conflating the
"debug" statements and arithmetic overflow is a bad idea; debugging
shouldn't be an all or nothing option.

D)
More generally, there's a trade-off between keeping the number of compiler
flags low for simplicity and having more flags for more expressiveness. I
feel dmd is trying too hard to keep the number low. There's a reason why
gcc, clang, etc. have so many flags: they give the user finer control.
There should be a good middle ground: the defaults should be simple
(optimize vs. debug), but when searching for bugs, optimizing, or dealing
with deprecated constructs, we want flexibility.

E)
Finally, some code may indeed depend on arithmetic overflow, and shouldn't
be affected by -version=check_arithmetic_overflow. Maybe we need a way to
"unset" a version identifier; here's a possible syntax:

----
module test;
version = check_arithmetic_overflow; //existing syntax
void fun(){}

version -= check_arithmetic_overflow; //new syntax
void fun2(){} //now check_arithmetic_overflow is unset
----

This would be very useful in many ways, eg make some code run fast while the rest is in debug mode (otherwise the debug code would be too slow).



On Thu, May 23, 2013 at 9:13 PM, Marco Leise <Marco.Leise@gmx.de> wrote:

> Am Fri, 24 May 2013 01:35:42 +0200
> schrieb "Peter Alexander" <peter.alexander.au@gmail.com>:
>
> > What about code that relies on overflow? It's well-defined behaviour, so it should be expected that people rely on it (I certainly do sometimes)
>
> See my post about Delphi's approach. It could be disabled in D
> using pragma(), UDAs or even scopes.
>
> --
> Marco
>
>