May 05, 2012
Integer overflow and underflow semantics
Hi,

I don't think the language really makes it clear whether overflows and 
underflows are well-defined. Do we guarantee that for any integral type 
T, T.max + 1 == T.min and T.min - 1 == T.max?

This is relevant in particular for GDC and LDC since they target a lot 
of weird architectures.

-- 
- Alex
May 05, 2012
Re: Integer overflow and underflow semantics
On Saturday, 5 May 2012 at 04:57:48 UTC, Alex Rønne Petersen 
wrote:
> Hi,
>
> I don't think the language really makes it clear whether 
> overflows and underflows are well-defined. Do we guarantee that 
> for any integral type T, T.max + 1 == T.min and T.min - 1 == 
> T.max?
>
> This is relevant in particular for GDC and LDC since they 
> target a lot of weird architectures.

 Any system that implements a carry flag likely works exactly 
like that. The carry flag is set or cleared after a math operation, 
allowing you to extend the size of your integers as far as you 
want, as on, say, the 6502.

 It might look something like this (it's been a while, but you 
should get the idea):

 clc ;clear carry before the first add
loop_here:
 mov ax, [si] ;load a word of the source operand
 adc [di], ax ;add with carry into the destination
 lea si, [si+2] ;advance pointers; lea leaves the flags alone
 lea di, [di+2]
 dec cx ;dec sets ZF but preserves CF
 jnz loop_here ;no cmp here: cmp would clobber the carry

 ;if carry after this point then...
 jc overflow_warning
May 05, 2012
Re: Integer overflow and underflow semantics
On 05-05-2012 07:32, Era Scarecrow wrote:
> On Saturday, 5 May 2012 at 04:57:48 UTC, Alex Rønne Petersen wrote:
>> Hi,
>>
>> I don't think the language really makes it clear whether overflows and
>> underflows are well-defined. Do we guarantee that for any integral
>> type T, T.max + 1 == T.min and T.min - 1 == T.max?
>>
>> This is relevant in particular for GDC and LDC since they target a lot
>> of weird architectures.
>
> Any system that implements a carry flag likely works exactly like that.
> The carry flag is set or cleared after a math operation, allowing you to
> extend the size of your integers as far as you want, as on, say, the 6502.
>
> It might look something like this (it's been a while, but you should get the idea):
>
> clc ;clear carry before the first add
> loop_here:
> mov ax, [si] ;load a word of the source operand
> adc [di], ax ;add with carry into the destination
> lea si, [si+2] ;advance pointers; lea leaves the flags alone
> lea di, [di+2]
> dec cx ;dec sets ZF but preserves CF
> jnz loop_here ;no cmp here: cmp would clobber the carry
>
> ;if carry after this point then...
> jc overflow_warning
>

Right, but the question was whether the language guarantees what I 
described. C and C++ don't, and IMO, we should strive to fix that.

-- 
- Alex
May 05, 2012
Re: Integer overflow and underflow semantics
On Saturday, 5 May 2012 at 07:10:28 UTC, Alex Rønne Petersen 
wrote:

> Right, but the question was whether the language guarantees 
> what I described. C and C++ don't, and IMO, we should strive to 
> fix that.

 I can't see why it wouldn't, unless the compiler adds in checks 
and changes its behavior or the assembler does its own quirky 
magic. The bit patterns of how they end up are pretty fixed; it's 
just how we interpret them.

 I know that before, when I needed to detect overflow and 
simulate the carry flag, I ended up using a larger integer type. 
(This was for a custom multiple-precision unsigned integer.)



 Of course, now that I think about it, if anything ends up using 
MMX for its registers, the rules do change a little. MMX doesn't 
overflow if I remember right; it just caps/truncates to the 
max/min, since it's intended more for multimedia. If the 
compiler uses those registers, I can't tell you what would happen.
May 05, 2012
Re: Integer overflow and underflow semantics
On 05-05-2012 10:23, Era Scarecrow wrote:
> On Saturday, 5 May 2012 at 07:10:28 UTC, Alex Rønne Petersen wrote:
>
>> Right, but the question was whether the language guarantees what I
>> described. C and C++ don't, and IMO, we should strive to fix that.
>
> I can't see why it wouldn't, unless the compiler adds in checks and
> changes its behavior or the assembler does its own quirky magic. The
> bit patterns of how they end up are pretty fixed; it's just how we
> interpret them.

It all depends. GCC (and thus GDC) can target very exotic architectures 
where any given assumption may not, for whatever reason, hold true. This 
is a language design issue more than a "how does architecture X or 
compiler Y work" issue.

An interesting problem with undefined behavior for integer overflow and 
underflow in C/C++ is that optimizers are basically free to do anything 
with regards to them, and often do whatever is more convenient for them.

>
> I know that before, when I needed to detect overflow and simulate the
> carry flag, I ended up using a larger integer type. (This was for a
> custom multiple-precision unsigned integer.)
>
>
>
> Of course, now that I think about it, if anything ends up using MMX for
> its registers, the rules do change a little. MMX doesn't overflow if I
> remember right; it just caps/truncates to the max/min, since it's
> intended more for multimedia. If the compiler uses those registers, I
> can't tell you what would happen.

Right, but it's not so much about what *would* happen as much as it is 
about what *should* happen. ;)

-- 
- Alex
May 05, 2012
Re: Integer overflow and underflow semantics
On 5 May 2012 11:42, Alex Rønne Petersen <xtzgzorex@gmail.com> wrote:

> On 05-05-2012 10:23, Era Scarecrow wrote:
>
>> On Saturday, 5 May 2012 at 07:10:28 UTC, Alex Rønne Petersen wrote:
>>
>>> Right, but the question was whether the language guarantees what I
>>> described. C and C++ don't, and IMO, we should strive to fix that.
>>
>> I can't see why it wouldn't, unless the compiler adds in checks and
>> changes its behavior or the assembler does its own quirky magic. The
>> bit patterns of how they end up are pretty fixed; it's just how we
>> interpret them.
>>
>
> It all depends. GCC (and thus GDC) can target very exotic architectures
> where any assumptions may not, for whatever reason, hold true. This is a
> language design issue more than it's a "how does architecture X or compiler
> Y work" issue.
>
> An interesting problem with undefined behavior for integer overflow and
> underflow in C/C++ is that optimizers are basically free to do anything
> with regards to them, and often do whatever is more convenient for them.


With regard to code-gen on such colourful architectures, would stating a
defined behaviour for overflow/underflow affect the common case where an
over/underflow did not occur?
Short of an architecture that causes hardware exception on over/underflow,
I suspect that it would interfere with the common case (additional code
generated around every add/sub/etc to handle the overflow behaviour), and
on such an architecture, every single numerical integer operation would
become inefficient.

I believe this is why C doesn't define the behaviour: C is still 
effectively a macro language and shouldn't produce 'unexpected' 
inefficient code ('unexpected' from the perspective of the architecture 
you're targeting).

I would personally rather see it remain undefined like C, but with
convention approved by common/conventional architectures where cross
platform porting is likely to occur.
May 05, 2012
Re: Integer overflow and underflow semantics
On 05-05-2012 12:25, Manu wrote:
> On 5 May 2012 11:42, Alex Rønne Petersen <xtzgzorex@gmail.com
> <mailto:xtzgzorex@gmail.com>> wrote:
>
>     On 05-05-2012 10:23, Era Scarecrow wrote:
>
>         On Saturday, 5 May 2012 at 07:10:28 UTC, Alex Rønne Petersen wrote:
>
>             Right, but the question was whether the language guarantees
>             what I
>             described. C and C++ don't, and IMO, we should strive to fix
>             that.
>
>
>         I can't see why it wouldn't, unless the compiler adds in checks
>         and changes its behavior or the assembler does its own quirky
>         magic. The bit patterns of how they end up are pretty fixed;
>         it's just how we interpret them.
>
>
>     It all depends. GCC (and thus GDC) can target very exotic
>     architectures where any assumptions may not, for whatever reason,
>     hold true. This is a language design issue more than it's a "how
>     does architecture X or compiler Y work" issue.
>
>     An interesting problem with undefined behavior for integer overflow
>     and underflow in C/C++ is that optimizers are basically free to do
>     anything with regards to them, and often do whatever is more
>     convenient for them.
>
>
> With regard to code-gen on such colourful architectures, would stating a
> defined behaviour for overflow/underflow affect the common case where an
> over/underflow did not occur?
> Short of an architecture that causes hardware exception on
> over/underflow, I suspect that it would interfere with the common case
> (additional code generated around every add/sub/etc to handle the
> overflow behaviour), and on such an architecture, every single numerical
> integer operation would become inefficient.

I don't know of a single architecture (out of the ones on 
dlang.org/version.html and many others) that doesn't signal 
overflow/underflow somehow (or obey the rules I described in the OP).

>
> I believe this is why C doesn't define the behaviour, because C is still
> effectively a macro language, and shouldn't produce 'unexpected'
> inefficient code. ('unexpected' from the perspective of the architecture
> you're targeting)

Right, but C was designed many, many years ago. :)

>
> I would personally rather see it remain undefined like C, but with
> convention approved by common/conventional architectures where cross
> platform porting is likely to occur.

Do a grep for "\.min" and "\.max" in Phobos and you'll notice a lot of 
places actually depend on the current x86 behavior (wrapping on overflow 
and underflow). I don't think we can afford to make the same mistake C 
and C++ did.

While I did say that this is a language design issue, it's also a matter 
of pragmatism - does any real world architecture that D could possibly 
target actually *not* obey the aforementioned rules? I don't know of any 
that doesn't.

-- 
- Alex
May 05, 2012
Re: Integer overflow and underflow semantics
Alex Rønne Petersen:

> I don't think the language really makes it clear whether 
> overflows and underflows are well-defined. Do we guarantee that 
> for any integral type T, T.max + 1 == T.min and T.min - 1 == 
> T.max?
>
> This is relevant in particular for GDC and LDC since they 
> target a lot of weird architectures.

In a good systems language I'd like to see something better than 
what's present in C#. So I'd like the language to offer the 
programmer a choice of 3 or 4 different semantics for integral 
operations:

1) A shared standard semantics that wraps around, as in Java;
2) A wrapping semantics that adapts to whatever is fastest on the 
CPU, as in C;
3) Standard wraparound for unsigned values and run-time errors 
when a signed value overflows (or goes out of its range);
4) Run-time errors when any signed or unsigned value overflows 
(or goes out of its range), as in Ada.

Bye,
bearophile
May 05, 2012
Re: Integer overflow and underflow semantics
On 05-05-2012 13:06, bearophile wrote:
> Alex Rønne Petersen:
>
>> I don't think the language really makes it clear whether overflows and
>> underflows are well-defined. Do we guarantee that for any integral
>> type T, T.max + 1 == T.min and T.min - 1 == T.max?
>>
>> This is relevant in particular for GDC and LDC since they target a lot
>> of weird architectures.
>
> In a good system language I'd like to see something better than what's
> present in C#. So I'd like the language to offer the programmer the
> choice of 3 or 4 different semantics in integral operations:
>
> 1) A shared standard semantics that wraps around, as in Java;
> 2) A wrapping semantics that adapts to whatever is fastest on the CPU,
> as in C;
> 3) Standard wraparound for unsigned values and run-time errors when a
> signed value overflows (or goes out of its range);
> 4) Run-time errors when any signed or unsigned value overflows (or
> goes out of its range), as in Ada.
>
> Bye,
> bearophile

Good point. The wrapping rule described in my OP should be the default 
IMO. Then perhaps the other modes could be activated with an @attribute 
of sorts?

-- 
- Alex
May 06, 2012
Re: Integer overflow and underflow semantics
On 05-05-2012 06:57, Alex Rønne Petersen wrote:
> Hi,
>
> I don't think the language really makes it clear whether overflows and
> underflows are well-defined. Do we guarantee that for any integral type
> T, T.max + 1 == T.min and T.min - 1 == T.max?
>
> This is relevant in particular for GDC and LDC since they target a lot
> of weird architectures.
>

Can anyone give a definitive answer to this or at least confirm that it 
is an open issue?

-- 
- Alex