February 16, 2010
== Quote from Robert Jacques (sandford@jhu.edu)'s article
> Actually, this bug is more common than that; overflow can happen on arrays of length uint.max/2 and that's to say nothing of using 64-bit code. Also, the std.algorithm binary search routines use a different algorithm that appears to be safe to use. (Though they won't compile in 64-bit mode due to a minor bug)

Well, really you should be using size_t instead of int (which I do) to deal with the 64-bit issue.  I guess there is a good point here in that sense.  However, assuming size_t.sizeof == (void*).sizeof and half your address space is reserved for kernel use, the only way this bug could bite you is on absurdly large arrays of bytes, where binary search is almost always the wrong algorithm anyhow.
February 16, 2010
On Tue, Feb 16, 2010 at 11:26:37AM -0600, Ellery Newcomer wrote:
> OT: has anyone written a wrapper for int/long/etc that throws exceptions on overflow/underflow? Maybe such a thing should exist in the standard library?

Something along these lines should work (pasted at bottom of message).

I'd like it a lot more if it could just be opBinary!(string)() -- the struct
would be tiny.

Suckily, assigning an it to it when declaring it doesn't work. I think there's a way around this, but I don't know.

The opPow's are commented out since my dmd is too old, so I couldn't test it.


========

import std.stdio;

struct NoOverflow(T) {
	T _payload;
	alias _payload this;

	T opAdd(T a) {
		T tmp = _payload + a;
		asm { jo overflow; }
		return tmp;
		overflow:
			throw new Exception("Overflow");
	}

	T opSub(T a) {
		T tmp = _payload - a;
		asm { jo overflow; }
		return tmp;
		overflow:
			throw new Exception("Overflow");
	}

	T opMul(T a) {
		T tmp = _payload * a;
		asm { jo overflow; }
		return tmp;
		overflow:
			throw new Exception("Overflow");
	}

	T opDiv(T a) {
		T tmp = _payload / a;
		asm { jo overflow; }
		return tmp;
		overflow:
			throw new Exception("Overflow");
	}

	/+
	T opPow(T a) {
		T tmp = _payload ^^ a;
		asm { jo overflow; }
		return tmp;
		overflow:
			throw new Exception("Overflow");
	}
	+/

	T opAddAssign(T a) {
		_payload += a;
		asm { jo overflow; }
		return this;
		overflow:
			throw new Exception("Overflow");
	}

	T opSubAssign(T a) {
		_payload -= a;
		asm { jo overflow; }
		return this;
		overflow:
			throw new Exception("Overflow");
	}

	T opMulAssign(T a) {
		_payload *= a;
		asm { jo overflow; }
		return this;
		overflow:
			throw new Exception("Overflow");
	}

	T opDivAssign(T a) {
		_payload /= a;
		asm { jo overflow; }
		return this;
		overflow:
			throw new Exception("Overflow");
	}

	T opPowAssign(T a) {
		//_payload ^^= a;
		asm { jo overflow; }
		return this;
		overflow:
			throw new Exception("Overflow");
	}

	T opPostInc() {
		_payload++;
		asm { jo overflow; }
		return this;
		overflow:
			throw new Exception("Overflow");
	}

	T opPostDec() {
		_payload--;
		asm { jo overflow; }
		return this;
		overflow:
			throw new Exception("Overflow");
	}
}

void main() {
	alias NoOverflow!(int) oint;

	oint a;
	a = int.max;
	a--;
	writefln("%d", a);
	a++;
	writefln("%d", a);
	a++; // should throw
	writefln("%d", a);
}

=======


-- 
Adam D. Ruppe
http://arsdnet.net
February 16, 2010
Ellery Newcomer:

> OT: has anyone written a wrapper for int/long/etc that throws exceptions on overflow/underflow? Maybe such a thing should exist in the standard library?

No, that idea is trash. Think about arrays, do you want to litter your code with a different type of array that test the index bounds? Surely not. You want the built-in arrays to test the bounds (with eventually a compilation option to disable such tests), as in D. The same is true for integral numbers, as done in Delphi/C#.

-----------------

Adam D. Ruppe:

> 	T opAdd(T a) {
> 		T tmp = _payload + a;
> 		asm { jo overflow; }
> 		return tmp;
> 		overflow:
> 			throw new Exception("Overflow");
> 	}

There are some problems with that. Currently a struct can't used to fully replace a number, for example you can't do if(a) yet.

LDC has problems with gotos outside/inside the asm block:
Gotos into inline assembly: For labels inside inline asm blocks, the D spec says "They can be the target of goto statements.", this is not supported at the moment. Basically, LLVM does not allow jumping in to or out of an asm block. We work around this for jumping out of asm by converting these branches to assignments to a temporary that is then used in a switch statement right after the inline asm block to jump to the final destination. This same workaround could be applied for jumping into inline assembly.


Another problem is that dmd will not inline those little methods (ldc can be forced to do it). I want some efficiency in such safer operations too, otherwise they become less useful.

Another problem with micro blocks of ASM in D is that the compiler is not always good at managing it, so it can use the stack and registers in a suboptimal way. In LDC they have invented asm Constraints to solve this problem:
http://www.dsource.org/projects/ldc/wiki/InlineAsmExpressions

Bye,
bearophile
February 16, 2010
Steven Schveighoffer wrote:
> What I meant by that statement is that the behavior goes against common sense -- when it doesn't have to.  It only makes sense to advanced programmers who understand the inner workings of the CPU and even in those cases, advance programmers easily make mistakes.

Where you and I disagree is I don't feel that 2s-complement arithmetic is in any way an advanced programming topic. Nor is it an inner working of a CPU - it's an exteriorly visible behavior, well documented in the CPU manuals. (Inner behavior would be things like the microcode.) As I mentioned before, how that works was often the very first topic in an introductory book on programming.

No comprehension of the fundamentals computer arithmetic will lead to failure after failure as a programmer; no language can paper that over. There is no escaping it or pretending it isn't there.


> When the result of an operation is 99.999% of the time an error (in fact the exact percentage is (T.max-1)/T.max  * 100), disallowing it is worth making the rare valid uses of it illegal.

It conforms to the simple rules of 2s-complement arithmetic, so I disagree with calling it an error.


> The case I'm talking about is the equivalent to doing:
> 
> x = x / 0;

Even mathematicians don't know what to do about divide by zero. But 2's complement arithmetic is well defined. So the situations are not comparable.
February 16, 2010
dsimcha wrote:
> I **HATE** this example because it's a classic example of extreme nitpicking.  On
> most modern computers, (void*).sizeof == size_t.sizeof.  Furthermore, usually half
> your address space is reserved for kernel use.  Therefore, this bug would only
> show up when you're searching an array of bytes **and** very close to exhausting
> available address space (i.e. when you probably have bigger problems anyhow).  I
> have intentionally written binary searches like this even though I'm aware of this
> bug because it's more readable and efficient than doing it "right" and would only
> fail in corner cases too extreme to be worth considering.

I agree with you that this "bug" is not worth considering, and that if you have an array that consumes more than half your address space you have other problems that will prevent your program from running.
February 16, 2010
On 02/16/2010 12:03 PM, Adam D. Ruppe wrote:
> On Tue, Feb 16, 2010 at 11:26:37AM -0600, Ellery Newcomer wrote:
>> OT: has anyone written a wrapper for int/long/etc that throws exceptions
>> on overflow/underflow? Maybe such a thing should exist in the standard
>> library?
>
> Something along these lines should work (pasted at bottom of message).
>
> I'd like it a lot more if it could just be opBinary!(string)() -- the struct
> would be tiny.
>
> Suckily, assigning an it to it when declaring it doesn't work. I think there's
> a way around this, but I don't know.
>
> The opPow's are commented out since my dmd is too old, so I couldn't test it.
>
>
> ========
>

Awesome. It doesn't work for opPow, though.

And aha! put a static opCall in conjunction with opAssign and you can assign when declaring it. This ought to be in the docs somewhere where I can find it..
February 16, 2010
On Tue, 16 Feb 2010, dsimcha wrote:

> I **HATE** this example because it's a classic example of extreme nitpicking.  On most modern computers, (void*).sizeof == size_t.sizeof.  Furthermore, usually half your address space is reserved for kernel use.  Therefore, this bug would only show up when you're searching an array of bytes **and** very close to exhausting available address space (i.e. when you probably have bigger problems anyhow).  I have intentionally written binary searches like this even though I'm aware of this bug because it's more readable and efficient than doing it "right" and would only fail in corner cases too extreme to be worth considering.

Actually, linux has used the 3:1 split for as long as I can recall.  That leads to easilly allowing this case to hit.  I've seen it hit in real world apps.  I agree that when you're playing in that neighborhood that you should consider moving to 64 bit apps, but there's downsides there too.  64bit addressing isn't a silver bullet,  unfortunatly.  Any app that's integer or pointer heavy in its data structures pays a big cost.

Later,
Brad

February 16, 2010
On 02/16/2010 12:36 PM, bearophile wrote:
> Ellery Newcomer:
>
>> OT: has anyone written a wrapper for int/long/etc that throws exceptions
>> on overflow/underflow? Maybe such a thing should exist in the standard
>> library?
>
> No, that idea is trash. Think about arrays, do you want to litter your code with a different type of array that test the index bounds? Surely not. You want the built-in arrays to test the bounds (with eventually a compilation option to disable such tests), as in D. The same is true for integral numbers, as done in Delphi/C#.

Sure, I'd sooner have it built in to the compiler. Is it? A quick peek in your dlibs suggests it wasn't when you wrote powMod
February 16, 2010
On Tue, 16 Feb 2010 13:38:06 -0500, Walter Bright <newshound1@digitalmars.com> wrote:

> Steven Schveighoffer wrote:
>> What I meant by that statement is that the behavior goes against common sense -- when it doesn't have to.  It only makes sense to advanced programmers who understand the inner workings of the CPU and even in those cases, advance programmers easily make mistakes.
>
> Where you and I disagree is I don't feel that 2s-complement arithmetic is in any way an advanced programming topic. Nor is it an inner working of a CPU - it's an exteriorly visible behavior, well documented in the CPU manuals. (Inner behavior would be things like the microcode.) As I mentioned before, how that works was often the very first topic in an introductory book on programming.

I agree that 2's complement as a whole is not an advanced topic.  What I disagree with is the interpretation the result of this one operation.  To interpret it as an unsigned value is lunacy.  The result should be interpreted as a negative value and not assignable to an unsigned value.  There are much more sane and unambiguous alternatives to doing this.  Even advanced programmers expect when they negate something it most likely becomes negative.  It's a surprise when it *always* flips back to positive.

>
> No comprehension of the fundamentals computer arithmetic will lead to failure after failure as a programmer; no language can paper that over. There is no escaping it or pretending it isn't there.

You assume that to understand 2s complement is to understand *and* mentally parse why negating a positive value in a computer for unsigned types *always* results in a positive value.  I think you can understand 2s complement arithmetic and the limitations, and *still* make the mistake of assigning the result of negating an unsigned value to an unsigned value.  There have been already very smart, computer literate, 2s complement knowledgeable people who have said on this very newsgroup they have been bitten by this error.  This is not a fix to help just newbies.

>> When the result of an operation is 99.999% of the time an error (in fact the exact percentage is (T.max-1)/T.max  * 100), disallowing it is worth making the rare valid uses of it illegal.
>
> It conforms to the simple rules of 2s-complement arithmetic, so I disagree with calling it an error.

We're not working in Assembly here.  This is a high level language, designed to hide the complexities of the underlying processor.  The processor has no idea whether the data in its registers is signed or unsigned.  The high level language does.  Please use that knowledge to prevent stupid mistakes, or is that not one of the goals of the compiler?  I can't believe this is such a hard point to get across.

It's not like I'm proposing to change everything about 2s complement math *except* in this one small situation: Negation on an unsigned value *and* (this part is the most important) assigning it to (or passing it as) an unsigned value.  Any time you see that, it is an error or a misguided attempt to be clever.

>
>> The case I'm talking about is the equivalent to doing:
>>  x = x / 0;
>
> Even mathematicians don't know what to do about divide by zero. But 2's complement arithmetic is well defined. So the situations are not comparable.

Sure they do, the result is infinity.  It's well defined.  In fact, I think some math-based programming languages allow divide by zero.

Again, not arguing against 2s complement here, just the one particular situation which is always an error.

-Steve
February 16, 2010
On Tue, Feb 16, 2010 at 01:11:49PM -0600, Ellery Newcomer wrote:
> Awesome. It doesn't work for opPow, though.

Probably because I used assembly there, and opPow
wouldn't be implemented as a simple instruction.

Reimplementing it in that function and watching for
the overflow ourself would work, but would probably be
really slow.

> 
> And aha! put a static opCall in conjunction with opAssign and you can assign when declaring it. This ought to be in the docs somewhere where I can find it..

Oh yeah, that's right. I think it is in the struct page
of the website, but this is one of those things that's
easy to forget. I'd really like to see opCall change
in structs so it is more intuitive. Problem is, I
don't have a better proposal.

-- 
Adam D. Ruppe
http://arsdnet.net