Thread overview
Integer overflow and underflow semantics
May 05, 2012
Era Scarecrow
May 05, 2012
Era Scarecrow
May 05, 2012
Manu
May 06, 2012
Mehrdad
May 06, 2012
Era Scarecrow
May 05, 2012
bearophile
May 18, 2012
Paul D. Anderson
May 18, 2012
Paul D. Anderson
May 21, 2012
renoX
May 07, 2012
Don Clugston
May 09, 2012
renoX
May 18, 2012
akaz
May 18, 2012
akaz
May 18, 2012
Andrew Wiley
May 05, 2012
Hi,

I don't think the language really makes it clear whether overflows and underflows are well-defined. Do we guarantee that for any integral type T, T.max + 1 == T.min and T.min - 1 == T.max?

This is relevant in particular for GDC and LDC since they target a lot of weird architectures.

-- 
- Alex
May 05, 2012
On Saturday, 5 May 2012 at 04:57:48 UTC, Alex Rønne Petersen wrote:
> Hi,
>
> I don't think the language really makes it clear whether overflows and underflows are well-defined. Do we guarantee that for any integral type T, T.max + 1 == T.min and T.min - 1 == T.max?
>
> This is relevant in particular for GDC and LDC since they target a lot of weird architectures.

 Any system that implements a carry flag likely works exactly like that. The carry flag is set or cleared after a math operation, letting you extend your integers to any width you want, as on, say, the 6502.

 Might look something like this (been a while, but you should get the idea):

 clc             ;clear carry before the first word
loop_here:
 mov ax, [si]    ;load next word of the source
 adc [di], ax    ;add with carry into the destination
 inc si
 inc si          ;advance both pointers by one word (2 bytes)
 inc di
 inc di
 dec cx          ;dec sets the zero flag, so no cmp needed
 jnz loop_here   ;(inc/dec leave the carry flag alone)

 ;if carry after this point then...
 jc overflow_warning

May 05, 2012
On 05-05-2012 07:32, Era Scarecrow wrote:
> On Saturday, 5 May 2012 at 04:57:48 UTC, Alex Rønne Petersen wrote:
>> Hi,
>>
>> I don't think the language really makes it clear whether overflows and
>> underflows are well-defined. Do we guarantee that for any integral
>> type T, T.max + 1 == T.min and T.min - 1 == T.max?
>>
>> This is relevant in particular for GDC and LDC since they target a lot
>> of weird architectures.
>
> Any system that implements a carry flag likely works exactly like that.
> The carry flag is set or cleared after a math operation, letting you
> extend your integers to any width you want, as on, say, the 6502.
>
> Might look something like this (been a while, but you should get the idea):
>
> clc             ;clear carry before the first word
> loop_here:
> mov ax, [si]    ;load next word of the source
> adc [di], ax    ;add with carry into the destination
> inc si
> inc si          ;advance both pointers by one word (2 bytes)
> inc di
> inc di
> dec cx          ;dec sets the zero flag, so no cmp needed
> jnz loop_here   ;(inc/dec leave the carry flag alone)
>
> ;if carry after this point then...
> jc overflow_warning
>

Right, but the question was whether the language guarantees what I described. C and C++ don't, and IMO, we should strive to fix that.

-- 
- Alex
May 05, 2012
On Saturday, 5 May 2012 at 07:10:28 UTC, Alex Rønne Petersen wrote:

> Right, but the question was whether the language guarantees what I described. C and C++ don't, and IMO, we should strive to fix that.

 I can't see why it wouldn't, unless the compiler adds in checks and changes its behavior, or the assembler does its own quirky magic. The resulting bit patterns are pretty much fixed; it's just a matter of how we interpret them.

 I know that before, when I needed to detect overflow and simulate the carry flag, I ended up using a larger integer type. (This was for a custom multiple-precision unsigned integer type.)



 Of course, now that I think about it, if anything ends up using MMX registers, the rules do change a little. MMX saturating operations don't wrap on overflow, if I remember right; they just clamp to the max/min, since they're intended more for multimedia. If the compiler uses those, I can't tell you what would happen.
May 05, 2012
On 05-05-2012 10:23, Era Scarecrow wrote:
> On Saturday, 5 May 2012 at 07:10:28 UTC, Alex Rønne Petersen wrote:
>
>> Right, but the question was whether the language guarantees what I
>> described. C and C++ don't, and IMO, we should strive to fix that.
>
> I can't see why it wouldn't, unless the compiler adds in checks and
> changes its behavior, or the assembler does its own quirky magic. The
> resulting bit patterns are pretty much fixed; it's just a matter of how
> we interpret them.

It all depends. GCC (and thus GDC) can target very exotic architectures where any assumptions may not, for whatever reason, hold true. This is a language design issue more than it's a "how does architecture X or compiler Y work" issue.

An interesting problem with undefined behavior for integer overflow and underflow in C/C++ is that optimizers are basically free to do anything with regards to them, and often do whatever is more convenient for them.

>
> I know that before, when I needed to detect overflow and simulate the
> carry flag, I ended up using a larger integer type. (This was for a
> custom multiple-precision unsigned integer type.)
>
>
>
> Of course, now that I think about it, if anything ends up using MMX
> registers, the rules do change a little. MMX saturating operations
> don't wrap on overflow, if I remember right; they just clamp to the
> max/min, since they're intended more for multimedia. If the compiler
> uses those, I can't tell you what would happen.

Right, but it's not so much about what *would* happen as much as it is about what *should* happen. ;)

-- 
- Alex
May 05, 2012
On 5 May 2012 11:42, Alex Rønne Petersen <xtzgzorex@gmail.com> wrote:

> On 05-05-2012 10:23, Era Scarecrow wrote:
>
>> On Saturday, 5 May 2012 at 07:10:28 UTC, Alex Rønne Petersen wrote:
>>
>> Right, but the question was whether the language guarantees what I
>> described. C and C++ don't, and IMO, we should strive to fix that.
>>
>>
>> I can't see why it wouldn't, unless the compiler adds in checks and changes its behavior, or the assembler does its own quirky magic. The resulting bit patterns are pretty much fixed; it's just a matter of how we interpret them.
>>
>
> It all depends. GCC (and thus GDC) can target very exotic architectures where any assumptions may not, for whatever reason, hold true. This is a language design issue more than it's a "how does architecture X or compiler Y work" issue.
>
> An interesting problem with undefined behavior for integer overflow and underflow in C/C++ is that optimizers are basically free to do anything with regards to them, and often do whatever is more convenient for them.


With regard to code-gen on such colourful architectures, would stating a
defined behaviour for overflow/underflow affect the common case where an
over/underflow did not occur?
Short of an architecture that causes a hardware exception on
over/underflow, I suspect that it would interfere with the common case
(additional code generated around every add/sub/etc. to handle the
overflow behaviour), and on such an architecture every single integer
operation would become inefficient.

I believe this is why C doesn't define the behaviour, because C is still effectively a macro language, and shouldn't produce 'unexpected' inefficient code. ('unexpected' from the perspective of the architecture you're targeting)

I would personally rather see it remain undefined like C, but with convention approved by common/conventional architectures where cross platform porting is likely to occur.


May 05, 2012
On 05-05-2012 12:25, Manu wrote:
> On 5 May 2012 11:42, Alex Rønne Petersen <xtzgzorex@gmail.com> wrote:
>
>     On 05-05-2012 10:23, Era Scarecrow wrote:
>
>         On Saturday, 5 May 2012 at 07:10:28 UTC, Alex Rønne Petersen wrote:
>
>             Right, but the question was whether the language guarantees
>             what I
>             described. C and C++ don't, and IMO, we should strive to fix
>             that.
>
>
>         I can't see why it wouldn't, unless the compiler adds in checks
>         and changes its behavior, or the assembler does its own quirky
>         magic. The resulting bit patterns are pretty much fixed; it's
>         just a matter of how we interpret them.
>
>
>     It all depends. GCC (and thus GDC) can target very exotic
>     architectures where any assumptions may not, for whatever reason,
>     hold true. This is a language design issue more than it's a "how
>     does architecture X or compiler Y work" issue.
>
>     An interesting problem with undefined behavior for integer overflow
>     and underflow in C/C++ is that optimizers are basically free to do
>     anything with regards to them, and often do whatever is more
>     convenient for them.
>
>
> With regard to code-gen on such colourful architectures, would stating a
> defined behaviour for overflow/underflow affect the common case where an
> over/underflow did not occur?
> Short of an architecture that causes a hardware exception on
> over/underflow, I suspect that it would interfere with the common case
> (additional code generated around every add/sub/etc. to handle the
> overflow behaviour), and on such an architecture every single integer
> operation would become inefficient.

I don't know of a single architecture (out of the ones on dlang.org/version.html and many others) that doesn't signal overflow/underflow somehow (or obey the rules I described in the OP).

>
> I believe this is why C doesn't define the behaviour, because C is still
> effectively a macro language, and shouldn't produce 'unexpected'
> inefficient code. ('unexpected' from the perspective of the architecture
> you're targeting)

Right, but C was designed many, many years ago. :)

>
> I would personally rather see it remain undefined like C, but with
> convention approved by common/conventional architectures where cross
> platform porting is likely to occur.

Do a grep for "\.min" and "\.max" in Phobos and you'll notice a lot of places actually depend on the current x86 behavior (wrapping on overflow and underflow). I don't think we can afford to make the same mistake C and C++ did.

While I did say that this is a language design issue, it's also a matter of pragmatism - does any real world architecture that D could possibly target actually *not* obey the aforementioned rules? I don't know of any that doesn't.

-- 
- Alex
May 05, 2012
Alex Rønne Petersen:

> I don't think the language really makes it clear whether overflows and underflows are well-defined. Do we guarantee that for any integral type T, T.max + 1 == T.min and T.min - 1 == T.max?
>
> This is relevant in particular for GDC and LDC since they target a lot of weird architectures.

In a good systems language I'd like to see something better than what's present in C#. So I'd like the language to offer the programmer a choice of 3 or 4 different semantics for integral operations:

1) A single standard wrap-on-overflow semantics, as in Java;
2) A wrap-on-overflow semantics that adapts to whatever is fastest on the CPU, as in C;
3) Standard wrapping for unsigned values, and run-time errors when a signed value overflows (or goes out of its range);
4) Run-time errors whenever any signed or unsigned value overflows (or goes out of its range), as in Ada.

Bye,
bearophile
May 05, 2012
On 05-05-2012 13:06, bearophile wrote:
> Alex Rønne Petersen:
>
>> I don't think the language really makes it clear whether overflows and
>> underflows are well-defined. Do we guarantee that for any integral
>> type T, T.max + 1 == T.min and T.min - 1 == T.max?
>>
>> This is relevant in particular for GDC and LDC since they target a lot
>> of weird architectures.
>
> In a good systems language I'd like to see something better than what's
> present in C#. So I'd like the language to offer the programmer a
> choice of 3 or 4 different semantics for integral operations:
>
> 1) A single standard wrap-on-overflow semantics, as in Java;
> 2) A wrap-on-overflow semantics that adapts to whatever is fastest on
> the CPU, as in C;
> 3) Standard wrapping for unsigned values, and run-time errors when a
> signed value overflows (or goes out of its range);
> 4) Run-time errors whenever any signed or unsigned value overflows (or
> goes out of its range), as in Ada.
>
> Bye,
> bearophile

Good point. The wrapping rule described in my OP should be the default IMO. Then perhaps the other modes could be activated with an @attribute of sorts?

-- 
- Alex
May 06, 2012
On 05-05-2012 06:57, Alex Rønne Petersen wrote:
> Hi,
>
> I don't think the language really makes it clear whether overflows and
> underflows are well-defined. Do we guarantee that for any integral type
> T, T.max + 1 == T.min and T.min - 1 == T.max?
>
> This is relevant in particular for GDC and LDC since they target a lot
> of weird architectures.
>

Can anyone give a definitive answer to this or at least confirm that it is an open issue?

-- 
- Alex