July 21, 2014
On Monday, 21 July 2014 at 19:33:32 UTC, Artur Skawina via Digitalmars-d wrote:
> Disallowing integer overflow just at CT is not (sanely) possible
> in a language with D's CTFE capabilities. (Would result in code
> that compiles and works at runtime, but is not ctfe-able)

I'd like to see compile-time _constants_ be unbounded rational numbers with explicit truncation. It is when you assign the value to an in-memory location that you need to worry about bounds. The same goes for calculations that don't do division.
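
Roughly what I have in mind, as a sketch ('long' stands in for unbounded CT precision here, since D has no such type today):

   enum exact = 1024L * 1024 * 1024 * 1024;          // 2^40, kept exact at CT
   immutable int o = cast(int)(exact & 0xFFFF_FFFF); // explicit truncation (== 0)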

No need to copy the bad parts of C.
July 21, 2014
On 07/21/14 21:53, via Digitalmars-d wrote:
> On Monday, 21 July 2014 at 19:33:32 UTC, Artur Skawina via Digitalmars-d wrote:
>> Disallowing integer overflow just at CT is not (sanely) possible
>> in a language with D's CTFE capabilities. (Would result in code
>> that compiles and works at runtime, but is not ctfe-able)
> 
> I'd like to see compile-time _constants_ be unbounded rational numbers with explicit truncation. It is when you assign the value to an in-memory location that you need to worry about bounds. The same goes for calculations that don't do division.
> 
> No need to copy the bad parts of C.

Actually, C/C++ could get away with treating overflow during constant
folding as an error (or at least emitting a warning) because of the
lack of CTFE (and no templates in C's case). The code will either
compile or it won't.
For D that is not possible -- if an expression is valid at run-time
then it should be valid at compile-time (and obviously yield the same
value). Making this aspect of CT evaluation special would make CTFE
much less useful and add complexity to the language for very little gain.
Trying to handle just a subset of the problem would make things even
worse -- /some/ code would not be CTFE-able and /some/ overflows wouldn't
be caught.

   // 100_000 * 100_000 == 10_000_000_000, which wraps to 1_410_065_408
   // in a 32-bit int; CT and RT must agree on that wrapped value.
   int f(int a, int b) { return a*b; }
   enum v = f(100_000, 100_000);

artur
July 22, 2014
On Monday, 21 July 2014 at 14:32:38 UTC, bearophile wrote:
> Basile Burg:
>
>> If you still feel ok today then dont read this:
>> -----------------
>> module meh;
>>
>> import std.stdio;
>>
>> //https://stackoverflow.com/questions/24676375/why-does-int-i-1024-1024-1024-1024-compile-without-error
>>
>> // 1024 * 1024 * 1024 * 1024 == 2^40, which silently wraps to 0
>> // in a 32-bit int, yet this compiled without error or warning
>> static shared immutable int o = 1024 * 1024 * 1024 * 1024;
>>
>> void main(string[] args)
>> {
>>    writeln(o);  // prints 0
>> }
>> -------------------------------------------------------------
>
> See:
> https://issues.dlang.org/show_bug.cgi?id=4835
> https://github.com/D-Programming-Language/dmd/pull/1803
>
> Bye,
> bearophile

Oops... I've just missed an opportunity to keep my mouth shut... However, I'm glad to see that someone else noticed that the real issue is that it's a <<const>>.
July 22, 2014
On Monday, 21 July 2014 at 21:10:43 UTC, Artur Skawina via Digitalmars-d wrote:

> For D that is not possible -- if an expression is valid at run-time
> then it should be valid at compile-time (and obviously yield the same
> value). Making this aspect of CT evaluation special would make CTFE
> much less useful and add complexity to the language for very little gain.

CT and runtime give different results for floats.
Overflow in the end result without explicit truncation should be considered a bug. Bugs can yield different results.

Overflow checks on add/sub expressions mess up reordering optimizations. You only care about overflows in the end result.
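
For example (a sketch; 'sum4' is just an invented name):

   // With plain wrapping ints the compiler may reassociate this sum,
   // e.g. evaluate (a+b) + (c+d) pairwise for SIMD; intermediate
   // wrap-arounds cancel as long as the end result fits. With a check
   // on every add, the evaluation order becomes observable, so such
   // reordering is no longer a valid transformation.
   int sum4(int a, int b, int c, int d) { return a + b + c + d; }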

Exact, truncating, masking/wrapping or saturating math results should be explicit. (It is a flaw to use the same div operator for floats and ints.) It should be the programmer's responsibility to provide the proofs or turn on extra precision in debug mode.

Turning off reordering optimizations and adding checks ought to be the rare case for both ints and floats.

Ideally all ctfe would be done as real intervals with rational bounds, then checked against the specified precision of the end result (or numerically solving the whole expression to the specified precision).
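
A rough sketch of that (with 'long' bounds standing in for the rationals):

   struct Interval { long lo, hi; }  // rational bounds simplified to long
   Interval add(Interval a, Interval b) { return Interval(a.lo + b.lo, a.hi + b.hi); }
   // only the end result is checked against the destination type:
   bool fitsInt(Interval v) { return v.lo >= int.min && v.hi <= int.max; }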

> Trying to handle just a subset of the problem would make things even
> worse -- /some/ code would not be CTFE-able and /some/ overflows wouldn't
> be caught.
>
>    int f(int a, int b) { return a*b; }
>    enum v = f(100_000, 100_000);

NUMBER f(NUMBER a, NUMBER b) ...
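
I.e. one possible reading of that, sketched as a template:

   T f(T)(T a, T b) { return a * b; }
   enum v = f!long(100_000, 100_000);  // 10_000_000_000 -- no 32-bit wrap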


July 22, 2014
On 21 Jul 2014 22:10, "Artur Skawina via Digitalmars-d" <digitalmars-d@puremagic.com> wrote:
>
> On 07/21/14 21:53, via Digitalmars-d wrote:
> > On Monday, 21 July 2014 at 19:33:32 UTC, Artur Skawina via Digitalmars-d wrote:
> >> Disallowing integer overflow just at CT is not (sanely) possible
> >> in a language with D's CTFE capabilities. (Would result in code
> >> that compiles and works at runtime, but is not ctfe-able)
> >
> > I'd like to see compile-time _constants_ be unbounded rational numbers with explicit truncation. It is when you assign the value to an in-memory location that you need to worry about bounds. The same goes for calculations that don't do division.
> >
> > No need to copy the bad parts of C.
>
> Actually, C/C++ could get away with treating overflow during constant
> folding as an error (or at least emitting a warning) because of the
> lack of CTFE (and no templates in C's case). The code will either
> compile or it won't.
> For D that is not possible -- if an expression is valid at run-time
> then it should be valid at compile-time (and obviously yield the same
> value).

...most of the time.

CTFE is allowed to do things at an arbitrary precision in mid-flight when evaluating an expression.

Iain


July 22, 2014
On 07/22/14 05:12, via Digitalmars-d wrote:
> On Monday, 21 July 2014 at 21:10:43 UTC, Artur Skawina via Digitalmars-d wrote:
> 
>> For D that is not possible -- if an expression is valid at run-time
>> then it should be valid at compile-time (and obviously yield the same
>> value). Making this aspect of CT evaluation special would make CTFE
>> much less useful and add complexity to the language for very little gain.
> 
> CT and runtime give different results for floats.

Both CT and RT evaluation must yield correct results, where "correct"
means "as specified". If RT FP is allowed to use extra precision (or
is otherwise loosely specified) then this also applies to CT FP.
But integer overflow _is_ defined in D (unlike in e.g. C), so CT has to
obey the exact same rules as RT. Would you really like to use a language
in which 'enum x = (a+b)/2;' and 'immutable x = (a+b)/2;' result in
different values?... And functions containing such 'a+b' expressions,
which rely on wrapping arithmetic, are not usable at CT?...
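
Concretely (a sketch; 'wrapAvg' is an invented name):

   int wrapAvg(int a, int b) { return (a + b) / 2; } // relies on wrapping
   enum ct = wrapAvg(int.max, int.max); // int.max+int.max wraps to -2, so ct == -1
   // the same call at RT must also yield -1, or CT results become lies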

> Overflow in the end result without explicit truncation should be considered a bug. Bugs can yield different results.

Integer overflow is defined in D. It's not a bug. It can be relied upon.
(Well, now it can, after Iain recently fixed GDC ;) )

> Overflow checks on add/sub expressions mess up reordering optimizations. You only care about overflows in the end result.

This would be an argument _against_ introducing the checks.

> Exact, truncating, masking/wrapping or saturating math results should be explicit.

That's how it is in D - the arguments are only about the /default/, and in this case about /using a different default at CT and RT/. Using a non-wrapping default would be a bad idea (perf implications, both direct and indirect - overflow checking would make certain optimizations invalid), and using different evaluation modes for CT and RT would be, well, insane.

> Ideally all ctfe would be done as real intervals with rational bounds, then checked against the specified precision of the end result (or numerically solving the whole expression to the specified precision).

Not possible (for integers), unless you'd be ok with getting different
results at CT.

>> Trying to handle just a subset of the problem would make things even worse -- /some/ code would not be CTFE-able and /some/ overflows wouldn't be caught.
>>
>>    int f(int a, int b) { return a*b; }
>>    enum v = f(100_000, 100_000);
> 
> NUMBER f(NUMBER a, NUMBER b) ...

Not sure what you mean here. 'f' is a perfectly fine existing function, which is used at RT. It needs to be usable at CT as is. The power of D's CTFE comes from being able to execute normal D code and not having to use a different dialect.

artur
July 22, 2014
On 07/22/14 08:15, Iain Buclaw via Digitalmars-d wrote:
> On 21 Jul 2014 22:10, "Artur Skawina via Digitalmars-d" <digitalmars-d@puremagic.com> wrote:
>> For D that is not possible -- if an expression is valid at run-time then it should be valid at compile-time (and obviously yield the same value).
> 
> ...most of the time.
> 
> CTFE is allowed to do things at an arbitrary precision in mid-flight when evaluating an expression.

That will work for FP, where excess precision is allowed, but will not work for integer arithmetic. Consider code which uses hashing and hash-folding functions which rely on wrapping arithmetic. If you increase the precision then those functions will yield different values. Now a hash value calculated at CT is invalid at RT...
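
Consider e.g. an FNV-1a-style hash (a sketch):

   uint fnv1a(scope const(char)[] s)
   {
       uint h = 2_166_136_261;
       foreach (c; s)
       {
           h ^= c;
           h *= 16_777_619;   // deliberately wraps at 32 bits
       }
       return h;
   }

   enum ctHash = fnv1a("ctfe"); // must match fnv1a("ctfe") computed at RT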

artur
July 22, 2014
On Tuesday, 22 July 2014 at 11:40:08 UTC, Artur Skawina via Digitalmars-d wrote:
> obey the exact same rules as RT. Would you really like to use a language
> in which 'enum x = (a+b)/2;' and 'immutable x = (a+b)/2;' result in
> different values?...

With the exception of hash functions, the result will be wrong if you don't predict that the value is wrapping. If you do, I think you should make the masking explicit, e.g. specifying '(a+b)&0xffffffff' or something similar, which the optimizer can reduce to a single addition.
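
E.g. (a sketch; 'wrappingAdd' is an invented name):

   // compute in a wider type, truncate explicitly; the optimizer can
   // reduce this to a single 32-bit add on typical targets
   uint wrappingAdd(uint a, uint b)
   {
       return cast(uint)((cast(ulong)a + b) & 0xFFFF_FFFF);
   }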

> That's how it is in D - the arguments are only about the /default/, and in
> this case about /using a different default at CT and RT/. Using a non-wrapping
> default would be a bad idea (perf implications, both direct and

Yes, but there is a difference between saying "it is ok that it wraps on addition, but it shouldn't overflow before a store takes place" and "it should be masked to N bits or fail on overflow even though the end result is known to be correct". A systems-level language should encourage using the fastest opcode, so you shouldn't enforce 32-bit masking when the fastest register size is 64 bits, etc. It should also encourage reordering so you get to use efficient SIMDy instructions.

> Not possible (for integers), unless you'd be ok with getting different
> results at CT.

You don't get different results at compile time if you are explicit about wrapping.

>> NUMBER f(NUMBER a, NUMBER b) ...
>
> Not sure what you mean here. 'f' is a perfectly fine existing
> function, which is used at RT. It needs to be usable at CT as is.

D claims to focus on generic programming. So it should also encourage pure functions that can be specified for floats, ints and other numeric types that are subtypes of (true) reals in the same clean definition.

If you express the computation in a clean way before getting down to the actual (more limited) type, then the optimizer can sometimes pick an efficient sequence of instructions that may be a very fast approximation, provided you reduce the precision sufficiently in the end result.

To get there you need to differentiate between a truncating division and a non-truncating division etc.
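
E.g. (a sketch; the names are invented):

   long   truncDiv(long a, long b) { return a / b; }             // truncating
   double exactDiv(long a, long b) { return cast(double)a / b; } // non-truncating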

The philosophy behind generic programming and the requirements for efficient generic programming are quite different from the machine-level hand-optimizing philosophy of classic C, IMO.
July 22, 2014
On 22 July 2014 12:40, Artur Skawina via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> On 07/22/14 08:15, Iain Buclaw via Digitalmars-d wrote:
>> On 21 Jul 2014 22:10, "Artur Skawina via Digitalmars-d" <digitalmars-d@puremagic.com> wrote:
>>> For D that is not possible -- if an expression is valid at run-time then it should be valid at compile-time (and obviously yield the same value).
>>
>> ...most of the time.
>>
>> CTFE is allowed to do things at an arbitrary precision in mid-flight when evaluating an expression.
>
> That will work for FP, where excess precision is allowed, but will not work for integer arithmetic. Consider code which uses hashing and hash-folding functions which rely on wrapping arithmetic. If you increase the precision then those functions will yield different values. Now a hash value calculated at CT is invalid at RT...
>
> artur

I can still imagine such a case occurring when cross-compiling from a (non-existent) platform that does integer operations at 128 bits to x86, which at runtime is 64-bit.

This is just brushing on the idea of a potential porting bug rather than an actual problem.

Iain
July 22, 2014
On 07/22/14 17:31, via Digitalmars-d wrote:
> On Tuesday, 22 July 2014 at 11:40:08 UTC, Artur Skawina via Digitalmars-d wrote:
>> obey the exact same rules as RT. Would you really like to use a language
>> in which 'enum x = (a+b)/2;' and 'immutable x = (a+b)/2;' result in
>> different values?...
> 
> With the exception of hash functions, the result will be wrong if you don't predict that the value is wrapping. If you do, I think you should make the masking explicit, e.g. specifying '(a+b)&0xffffffff' or something similar, which the optimizer can reduce to a single addition.

D is defined as it is, with wrapping two's complement integer arithmetic and defined integer sizes.

My point is that the language must be consistent; adding special cases would create a language in which one expression yields several different results, depending on evaluation context. That would be a very significant regression, and would severely cripple the language. Maybe the harm done by that particular pull request wouldn't be catastrophic, but it would be a step in a very dangerous direction.

artur