November 21, 2014
On 11/21/14, 11:47 AM, ketmar via Digitalmars-d wrote:
> On Fri, 21 Nov 2014 11:17:06 -0300
> Ary Borenszweig via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
>> "This bug can manifest itself for arrays whose length (in elements) is
>> 2^30 or greater (roughly a billion elements)"
>>
>> How often does that happen in practice?
> once in almost ten years is too often, as for me. i think that the
> answer must be "never". either there is no bug, or the code is broken.
> and some of the worst code is code that "works most of the time" but
> is still broken.
>

You see, if you don't use a BigNum for everything, then you will always have hidden bugs, be it with int, uint or whatever. The thing is that with int, bugs are much less frequent than with uint. So I don't know why you'd rather have uint than int...
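
A canonical example of the uint pitfall, as a minimal sketch (`process`
is a hypothetical stand-in):

int[] a;                                   // empty array, a.length == 0
for (size_t i = 0; i < a.length - 1; ++i)  // 0u - 1 wraps to size_t.max,
    process(a[i]);                         // so the condition holds and
                                           // a[0] fails the bounds check

With a signed length the condition would have been 0 < -1, which is
false, and the loop body would simply never run.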
November 21, 2014
On 11/21/14, 1:32 PM, Andrei Alexandrescu wrote:
> On 11/21/14 6:17 AM, Ary Borenszweig wrote:
>> On 11/21/14, 5:45 AM, Walter Bright wrote:
>>> On 11/21/2014 12:10 AM, bearophile wrote:
>>>> Walter Bright:
>>>>
>>>>> All you're doing is trading 0 crossing for 0x7FFFFFFF crossing
>>>>> issues, and
>>>>> pretending the problems have gone away.
>>>>
>>>> I'm not pretending anything. I am asking which of the two solutions
>>>> leads to fewer problems/bugs in practical programming. So far I've
>>>> seen the unsigned solution, and I've seen that it's highly bug-prone.
>>>
>>> I'm suggesting that having a bug and detecting the bug are two
>>> different things. The 0-crossing bug is easier to detect, but that
>>> doesn't mean that shifting the problem to 0x7FFFFFFF-crossing bugs
>>> reduces the bug count.
>>>
>>>
>>>>> BTW, granted the 0x7FFFFFFF problems exhibit the bugs less often, but
>>>>> paradoxically this can make the bug worse, because then it only gets
>>>>> found much, much later in supposedly tested & robust code.
>>>>
>>>> Is this true? Do you have some examples of buggy code?
>>>
>>> http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html
>>>
>>
>> "This bug can manifest itself for arrays whose length (in elements) is
>> 2^30 or greater (roughly a billion elements)"
>>
>> How often does that happen in practice?
>
> Every time you read a DVD image :o). I should say that in my doctoral
> work it was often the case that I'd have very large arrays.

Oh, sorry, I totally forgot that when you open a DVD with VLC it reads the whole thing into memory.

</sarcasm>
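
For reference, the bug in the article linked above is the classic
binary-search midpoint overflow. A minimal D transcription (the
article's code is Java; the function names here are made up):

// broken: low + high wraps past int.max once an array
// reaches 2^30 or more elements
int midBroken(int low, int high) { return (low + high) / 2; }

// fixed: high - low cannot overflow while 0 <= low <= high
int midFixed(int low, int high) { return low + (high - low) / 2; }

// with low = 1 << 30 and high = (1 << 30) + 2:
//   midBroken gives -1073741823 (a wrapped, negative index)
//   midFixed  gives  1073741825 (correct)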
November 21, 2014
Am Fri, 21 Nov 2014 16:32:20 +0000
schrieb "Wyatt" <wyatt.epp@gmail.com>:

> Array lengths are always non-negative integers.  This is axiomatic.  But the subtraction thing keeps coming up in this thread; what to do?
> 
> There's probably something fundamentally wrong with this and I'll probably be called an idiot by both "sides", but my gut feeling is that if expressions with subtraction simply returned a signed type by default, much of the problem would disappear. [...]

As I said above, I always order my unsigned variables by magnitude, and uint.max - uint.min should result in uint.max, not -1. In code dealing with lengths or offsets there is typically some "base" that is less than the "position", or an "index" that is less than the "length".

The expression `base - position` is just wrong. If `position` can in fact be below `base`, you are guaranteed to end up with an if-else later anyway. So why not place it up front:

if (position >= base)
{
    // the difference cannot wrap around here
    auto offset = position - base;
}
else
{
    … // handle "position before base" explicitly
}

> [...]
> 
> -Wyatt
> 
> PS: I can't even believe how this thread has blown up, considering how it started.

Exactly my thought, but suddenly I couldn't stop myself from posting.

-- 
Marco

November 21, 2014
Am Thu, 20 Nov 2014 20:53:31 -0800
schrieb Walter Bright <newshound2@digitalmars.com>:

> On 11/20/2014 7:11 PM, Walter Bright wrote:
> > On 11/20/2014 3:25 PM, bearophile wrote:
> >> Walter Bright:
> >>
> >>> If that is changed to a signed type, then you'll have a same-only-different set of subtle bugs,
> >>
> >> This is possible. Can you show some of the bugs, so we can discuss them and see if they are actually worse than the current situation.
> >
> > All you're doing is trading 0 crossing for 0x7FFFFFFF crossing issues, and pretending the problems have gone away.
> 
> BTW, granted the 0x7FFFFFFF problems exhibit the bugs less often, but paradoxically this can make the bug worse, because then it only gets found much, much later in supposedly tested & robust code.
> 
> 0 crossing bugs tend to show up much sooner, and often immediately.

+1000. This is also the reason we have a special float .init in D.
There is no plethora of bugs to show, because they fly under the
radar. Signed types are only more convenient in the scripting-language
sense, like JavaScript using double for everything, including array
indexing.
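
To make the two crossings concrete (values chosen purely for
illustration):

uint u = 0;
--u;             // 0 crossing: u is now 4_294_967_295; this kind of bug
                 // usually explodes immediately (endless loop, huge alloc)

int i = int.max;
++i;             // 0x7FFFFFFF crossing: i is now int.min; this can sit
                 // quietly in "tested" code until a 2^31-sized input shows up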

-- 
Marco

November 21, 2014
On Fri, 21 Nov 2014 14:38:26 -0300
Ary Borenszweig via Digitalmars-d <digitalmars-d@puremagic.com> wrote:

> You see, if you don't use a BigNum for everything than you will always have hidden bugs, be it with int, uint or whatever.
why do you believe that i'm not aware of overflows and am not checking for them? i'm used to thinking about overflows and doing overflow checking in production code since my Z80 days. and i don't believe that an "infrequent bug" is better than a "frequent bug". both are equally bad.


November 21, 2014
On Fri, 21 Nov 2014 09:08:54 -0800
"H. S. Teoh via Digitalmars-d" <digitalmars-d@puremagic.com> wrote:

> > >>What about:
> > >>
> > >>      uint x;
> > >>      auto z = x - 1;
> > >>
> > >>?
> > >>
> > >here z must be `long`. and for `ulong` the compiler must emit an error.
> 
> What if x==uint.max?
nothing bad happens: long is perfectly able to represent that.

> > Would you agree that that would break a substantial amount of correct D code? -- Andrei
> 
> Yeah I don't think it's a good idea for subtraction to yield a different type from its operands. Non-closure of operators (i.e., results are of a different type than operands) leads to a lot of frustration because you keep ending up with the wrong type, and inevitably people will just throw in random casts everywhere just to make things work.
not every subtraction, only those in an `auto` vardecl.


November 21, 2014
On Fri, 21 Nov 2014 14:36:53 -0300
Ary Borenszweig via Digitalmars-d <digitalmars-d@puremagic.com> wrote:

> On 11/21/14, 11:29 AM, ketmar via Digitalmars-d wrote:
> > On Fri, 21 Nov 2014 19:31:23 +1100
> > Daniel Murphy via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> >
> >> "bearophile"  wrote in message news:lkcltlokangpzzdzzfjg@forum.dlang.org...
> >>
> >>> From my experience coding in D, they are far less likely than
> >>> sign-related bugs involving array lengths.
> >>
> >> Here's a simple program to calculate the relative size of two files that will not work correctly with unsigned lengths.
> >>
> >> module sizediff;
> >>
> >> import std.file;
> >> import std.stdio;
> >>
> >> void main(string[] args)
> >> {
> >>      assert(args.length == 3, "Usage: sizediff file1 file2");
> >>      auto l1 = args[1].read().length;
> >>      auto l2 = args[2].read().length;
> >>      writeln("Difference: ", l1 - l2);
> >> }
> >>
> >> The two ways this can fail (that I want to highlight) are:
> >> 1. If either file is too large to fit in a size_t, the result will
> >>    (probably) be wrong.
> >> 2. If file2 is bigger than file1, the result will be wrong.
> >>
> >> If length were signed, problem 2 would not exist, and problem 1 would be more likely to occur. I think it's clear that signed lengths would work for more realistic inputs.
> > no, problem 2 just becomes hidden: while the given code works most of the time, it is still broken.
> 
> So how would you solve problem 2?
with a proper check before doing the subtraction. or by switching to some Scheme compiler with a full numeric tower.
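for example, a minimal sketch of such a check, reusing `l1` and `l2`
from the quoted program:

if (l1 >= l2)
    writeln("Difference: ", l1 - l2);
else
    writeln("Difference: -", l2 - l1);  // print the magnitude with a sign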


November 21, 2014
On Friday, 21 November 2014 at 16:12:19 UTC, Don wrote:
> On Friday, 21 November 2014 at 15:50:05 UTC, H. S. Teoh via Digitalmars-d wrote:
>> On Fri, Nov 21, 2014 at 03:36:01PM +0000, Don via Digitalmars-d wrote:
>> [...]
>>> Suppose  D had a type 'natint', which could hold natural numbers in
>>> the range 0..uint.max.  Sounds like 'uint', right? People make the
>>> mistake of thinking that is what uint is. But it is not.
>>> 
>>> How would natint behave, in the type system?
>>> 
>>> typeof (natint - natint)  ==  int     NOT natint  !!!
>>
>> Wrong. (uint.max - 0) == uint.max, which is of type uint.
>
>
> It is not uint.max. It is natint.max. And yes, that's an overflow condition.
>
> Exactly the same as when you do int.max + int.max.
>
>> If you
>> interpret it as int, you get a negative number, which is wrong. So your
>> proposal breaks uint in even worse ways, in that now subtracting a
>> smaller number from a larger number may overflow, whereas it wouldn't
>> before. So that fixes nothing, you're just shifting the problem
>> somewhere else.
>>
>>
>> T
>
> This is not a proposal!!!! I am just illustrating the difference between what people *think* uint does, vs what it actually does.
>
> The type that I think would be useful would be a number in the range
> 0..int.max. It has no risk of underflow.
>
> To put it another way:
>
> natural numbers are a subset of mathematical integers.
>   (the range 0..infinity)
>
> signed types are a subset of mathematical integers
>   (the range -int.max .. int.max).
>
> unsigned types are not a subset of mathematical integers.
>
> They do not just have a restricted range. They have different semantics.

I was under the impression that in D:

uint = { x mod 2^32 | x ∈ Z_0 }
int = { x - 2^31 | x ∈ uint }

which matches the hardware.
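
That is indeed the semantics D inherits from the hardware: uint
arithmetic is modulo 2^32, and int reinterprets the same bits. A couple
of asserts make it concrete, followed by a sketch of what Don's natint
could look like as a library type (hypothetical; no such type exists):

uint u = uint.max;
assert(u + 1 == 0);           // wraps: arithmetic is mod 2^32
assert(cast(int)u == -1);     // same bits, viewed as two's complement

struct NatInt
{
    uint value;
    invariant { assert(value <= int.max); }  // stays a subset of int

    // subtracting two naturals yields an integer, not a natural
    int opBinary(string op : "-")(NatInt rhs) const
    {
        return cast(int)value - cast(int)rhs.value;
    }
}

// NatInt(3) - NatInt(7) == -4, with no possibility of wraparound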
November 21, 2014
On 11/21/2014 10:05 AM, ketmar via Digitalmars-d wrote:
> why do you believe that i'm not aware of overflows and am not checking
> for them? i'm used to thinking about overflows and doing overflow
> checking in production code since my Z80 days. and i don't believe that
> an "infrequent bug" is better than a "frequent bug". both are equally bad.


Having coded with 16 bit computers for decades, one gets used to thinking about and dealing with overflows :-)
November 21, 2014
On 11/21/2014 6:03 AM, ketmar via Digitalmars-d wrote:
> On Thu, 20 Nov 2014 13:28:37 -0800
> Walter Bright via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
>> On 11/20/2014 7:52 AM, H. S. Teoh via Digitalmars-d wrote:
>>> What *could* be improved, is the prevention of obvious mistakes in
>>> *mixing* signed and unsigned types. Right now, D allows code like the
>>> following with no warning:
>>>
>>> 	uint x;
>>> 	int y;
>>> 	auto z = x - y;
>>>
>>> BTW, this one is the same in essence as an actual bug that I fixed in
>>> druntime earlier this year, so downplaying it as a mistake people make
>>> 'cos they confound computer math with math math is fallacious.
>>
>> What about:
>>
>>       uint x;
>>       auto z = x - 1;
>>
>> ?
>>
> here z must be `long`. and for `ulong` the compiler must emit an error.


So, any time an integer literal appears in an unsigned expression, the type of the expression becomes signed?
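
To make the question concrete, here is what that rule would change,
next to what D actually does today (the "proposed" lines describe
ketmar's hypothetical semantics, not current D):

uint x;
auto z = x - 1;   // today: z is uint; for x == 0 it wraps to uint.max
                  // proposed: z would be long, so x == 0 yields -1L

ulong y;
auto w = y - 1;   // today: w is ulong, with the same wraparound at 0
                  // proposed: a compile-time error, since no built-in
                  // integer type can hold every value of (ulong - 1)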