July 22, 2014
On 07/22/14 18:39, Iain Buclaw via Digitalmars-d wrote:
> On 22 July 2014 12:40, Artur Skawina via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>> On 07/22/14 08:15, Iain Buclaw via Digitalmars-d wrote:
>>> On 21 Jul 2014 22:10, "Artur Skawina via Digitalmars-d" <digitalmars-d@puremagic.com> wrote:
>>>> For D that is not possible -- if an expression is valid at run-time then it should be valid at compile-time (and obviously yield the same value).
>>>
>>> ...most of the time.
>>>
>>> CTFE is allowed to do things at an arbitrary precision in mid-flight when evaluating an expression.
>>
>> That will work for FP, where excess precision is allowed, but will not work for integer arithmetic. Consider code which uses hashing and hash-folding functions which rely on wrapping arithmetic. If you increase the precision then those functions will yield different values. Now a hash value calculated at CT is invalid at RT...
> 
> I can still imagine such a possibility occurring when cross-compiling from a (hypothetical) platform that does integer operations at 128 bits to x86, which at runtime is 64-bit.

In D integer widths are well defined; exposing the larger range would not be possible.

   static assert (100_000^^2 != 100_000L^^2); // int wraps to 1_410_065_408; long yields 10_000_000_000

[Whether requiring specific integer widths was a good idea or not,
 redefining them /now/ is obviously not a practical option.]


artur
July 23, 2014
On Tuesday, 22 July 2014 at 21:06:09 UTC, Artur Skawina via Digitalmars-d wrote:
> D is defined as it is, with wrapping two's complement integer arithmetic
> and defined integer sizes.

Integer promotion is locked to 32 bits. That is a mistake. Why wrap everything below 32 bits at 32 on a 64-bit ALU? That's inconvenient and will lead to undetected bugs.
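
For illustration, a sketch of the promotion rule in question (the values are arbitrary):

    void main()
    {
        ushort a = 60_000, b = 60_000;
        auto p = a * b;                    // both operands promoted to int
        static assert(is(typeof(p) == int));
        assert(p == -694_967_296);         // 3_600_000_000 wrapped at 32 bits
    }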

I also think it is a mistake to lock to C rules, which were defined when multiplies were often done in software. In most modern ALUs an N-bit multiply yields a 2N-bit result. Why discard the high word?

With forced wrapping/masking, (a*b)>>32 is turned into ((a*b)&0xffffffff)>>32, which is zero, so you have to cast 'a' to 64 bits before the multiply, then downcast the result to 32 bits.
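
In code, that workaround looks something like this (a sketch; mulHigh is a hypothetical helper name):

    uint mulHigh(uint a, uint b)
    {
        // Widen one operand so the multiply happens in 64 bits,
        // then take the high word of the full 64-bit product.
        return cast(uint)((cast(ulong) a * b) >> 32);
    }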

> My point is that the language must be consistent; adding special
> cases would create a language in which one expression yields several
> different results, depending on evaluation context.

I understand this point, but I think code that would yield such errors is most likely buggy or underspecified.

Ola.
July 23, 2014
On Monday, 21 July 2014 at 21:10:43 UTC, Artur Skawina via Digitalmars-d wrote:
> On 07/21/14 21:53, via Digitalmars-d wrote:
>> On Monday, 21 July 2014 at 19:33:32 UTC, Artur Skawina via Digitalmars-d wrote:
>>> Disallowing integer overflow just at CT is not (sanely) possible
>>> in a language with D's CTFE capabilities. (Would result in code
>>> that compiles and works at runtime, but is not ctfe-able)
>> 
>> I'd like to see compile time _constants_ be unbounded rational numbers with explicit truncation. It is when you assign it to an in-memory location that you need to worry about bounds. The same goes for calculations that don't do division.
>> 
>> No need to copy the bad parts of C.
>
> Actually, C/C++ could get away with treating overflow during constant
> folding as an error (or at least emitting a warning) because of the
> lack of CTFE (and no templates in C's case). The code will either
> compile or it won't.
> For D that is not possible -- if an expression is valid at run-time
> then it should be valid at compile-time

Why do you think that? There are many cases where that is not true. Comparing pointers to two unrelated objects will work at runtime, but causes an error in CTFE. You can read global variables at runtime, not in CTFE. Etc.

The converse is true, though -- if it works at CTFE, it must work at runtime.


Disallowing integer overflow in CTFE could certainly be implemented. It's not a difficult experiment to run. It would be interesting to see how many instances of overflow are bugs, and how many are intentional.
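
For what it's worth, a sketch of what trapping overflow looks like in library form, using druntime's core.checkedint module (a later addition to druntime; checkedSquare is a hypothetical wrapper):

    import core.checkedint : muls;

    int checkedSquare(int a)
    {
        bool overflow = false;
        int r = muls(a, a, overflow);   // sets the flag instead of wrapping silently
        assert(!overflow, "integer overflow");
        return r;
    }

    // enum v = checkedSquare(100_000); // the assert would fire during CTFE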

July 23, 2014
On 7/21/2014 11:15 PM, Iain Buclaw via Digitalmars-d wrote:
> CTFE is allowed to do things at an arbitrary precision in mid-flight when
> evaluating an expression.

For floating point, yes. Not for integral arithmetic.

July 23, 2014
On 7/21/2014 2:10 PM, Artur Skawina via Digitalmars-d wrote:
> Actually, C/C++ could get away with treating overflow during constant
> folding as an error (or at least emitting a warning) because of the
> lack of CTFE (and no templates in C's case). The code will either
> compile or it won't.
> For D that is not possible -- if an expression is valid at run-time
> then it should be valid at compile-time (and obviously yield the same
> value). Making this aspect of CT evaluation special would make CTFE
> much less useful and add complexity to the language for very little gain.
> Trying to handle just a subset of the problem would make things even
> worse -- /some/ code would not be CTFE-able and /some/ overflows wouldn't
> be caught.
>
>     int f(int a, int b) { return a*b; }
>     enum v = f(100_000, 100_000);

One difficulty with breaking with C rules is we are working with optimizers and code generators developed for C. Coming up with different semantics for D may cause all sorts of problems.

July 23, 2014
On Tuesday, 22 July 2014 at 15:31:22 UTC, Ola Fosheim Grøstad wrote:
> On Tuesday, 22 July 2014 at 11:40:08 UTC, Artur Skawina via Digitalmars-d wrote:
>> obey the exact same rules as RT. Would you really like to use a language
>> in which 'enum x = (a+b)/2;' and 'immutable x = (a+b)/2;' result in
>> different values?...
>
> With the exception of hash functions, the result will be wrong if you don't predict that the value is wrapping. If you do, I think you should make the masking explicit, e.g. specifying '(a+b)&0xffffffff' or something similar, which the optimizer can reduce to a single addition.
>
>> That's how it is in D - the arguments are only about the /default/, and in
>> this case about /using a different default at CT and RT/. Using a non-wrapping
>> default would be a bad idea (perf implications, both direct and
>
> Yes, but there is a difference between saying "it is ok that it wraps on addition, but it shouldn't overflow before a store takes place" and "it should be masked to N bits or fail on overflow even though the end-result is known to be correct". A system level language should encourage using the fastest opcode, so you shouldn't enforce 32 bit masking when the fastest register size is 64 bit etc. It should also encourage reordering so you get to use efficient SIMDy instructions.
>
>> Not possible (for integers), unless you'd be ok with getting different
>> results at CT.
>
> You don't get different results at compile time if you are explicit about wrapping.
>
>>> NUMBER f(NUMBER a, NUMBER b) ...
>>
>> Not sure what you mean here. 'f' is a perfectly fine existing
>> function, which is used at RT. It needs to be usable at CT as is.
>
> D claims to focus on generic programming. So it should also encourage pure functions that can be specified for floats, ints, and other numeric types that are subtypes of (true) reals in the same clean definition.

I think it's a complete fantasy to think you can write generic code that will work for both floats and ints. The algorithms are completely different.

One of the simplest examples is that given float f;  int i;

(f + 1) and  (i +  1)  have totally different semantics.

There are no values of i for which i + 1 == i,
but if abs(f) > 1/real.epsilon, then f + 1 == f.

Likewise there is no value of i for which i != 0 && i+1 == 1,
but for any abs(f) < real.epsilon, f + 1 == 1.
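
A compilable sketch of those two fixed-point claims (using real to sidestep excess-precision surprises in the comparisons):

    void main()
    {
        real f = 2.0L / real.epsilon;  // beyond this magnitude, adding 1 is absorbed
        assert(f + 1 == f);

        real g = real.epsilon / 2;     // below this magnitude, the addend vanishes into 1
        assert(g + 1 == 1);

        int i = int.max;
        assert(i + 1 != i);            // never a fixed point: this wraps to int.min
    }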


> If you express the expression in a clean way to get down to the actual (more limited type) then the optimizer sometimes can pick an efficient sequence of instructions that might be a very fast approximation if you reduce the precision sufficiently in the end-result.

>
> To get there you need to differentiate between a truncating division and a non-truncating division etc.

Well, it's not a small number of differences. Almost every operation is different. Maybe all of them. I can't actually think of a single operation where the semantics are the same for integers and floating point.

Negation comes close, but even then you have the special cases -0.0 and -(-int.max - 1).
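
Both special cases, as a sketch:

    void main()
    {
        double z = -0.0;
        assert(z == 0.0);              // negative zero compares equal to zero...
        import std.math : signbit;
        assert(signbit(z) != 0);       // ...yet it is a distinct value

        int m = int.min;               // == -int.max - 1
        assert(-m == m);               // negating int.min wraps back to int.min
    }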


> The philosophy behind generic programming and the requirements for efficient generic programming are quite different from the machine-level hand-optimizing philosophy of classic C, IMO.

I think that unfortunately, it's a quest that is doomed to fail. Producing generic code that works for both floats and ints is a fool's errand.
July 23, 2014
On Tuesday, 22 July 2014 at 15:31:22 UTC, Ola Fosheim Grøstad wrote:
> A system level language should encourage using the fastest opcode, so you shouldn't enforce 32 bit masking when the fastest register size is 64 bit etc.

This is what C's int_fast32_t is for. Unfortunately it's not guaranteed to be the fastest, but you can use something similar.
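
A hedged D analogue (int_fast32 is a made-up alias; it picks the register-sized type on 64-bit targets, with the same caveat that "wider" is not always faster):

    static if (size_t.sizeof == 8)
        alias int_fast32 = long;    // 64-bit target: use the full register
    else
        alias int_fast32 = int;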
July 23, 2014
On 7/23/2014 12:49 AM, Don wrote:
> On Tuesday, 22 July 2014 at 15:31:22 UTC, Ola Fosheim Grøstad wrote:
>> D claims to focus on generic programming. So it should also encourage pure
>> functions that can be specified for floats, ints and other numeric types that
>> are subtypes of (true) reals in the same clean definition.
>
> I think it's a complete fantasy to think you can write generic code that will
> work for both floats and ints. The algorithms are completely different.
>
> One of the simplest examples is that given float f;  int i;
>
> (f + 1) and  (i +  1)  have totally different semantics.
>
> There are no values of i for which i + 1 == i,
> but if abs(f) > 1/real.epsilon, then f + 1 == f.
>
> Likewise there is no value of i for which i != 0 && i+1 == 1,
> but for any abs(f) < real.epsilon, f + 1 == 1.
>
>
>> If you express the expression in a clean way to get down to the actual (more
>> limited type) then the optimizer sometimes can pick an efficient sequence of
>> instructions that might be a very fast approximation if you reduce the
>> precision sufficiently in the end-result.
>
>>
>> To get there you need to differentiate between a truncating division and a
>> non-truncating division etc.
>
> Well, it's not a small number of differences. Almost every operation is
> different. Maybe all of them. I can't actually think of a single operation where
> the semantics are the same for integers and floating point.
>
> Negation comes close, but even then you have the special cases -0.0 and
> -(-int.max - 1).
>
>
>> The philosophy behind generic programming and the requirements for efficient
>> generic programming are quite different from the machine-level
>> hand-optimizing philosophy of classic C, IMO.
>
> I think that unfortunately, it's a quest that is doomed to fail. Producing
> generic code that works for both floats and ints is a fool's errand.

I quoted you on https://github.com/D-Programming-Language/phobos/pull/2366 !
July 23, 2014
On Wednesday, 23 July 2014 at 07:49:28 UTC, Don wrote:
> I think it's a complete fantasy to think you can write generic code that will work for both floats and ints. The algorithms are completely different.

Not really a valid line of reasoning.

Bool < uints < ints < fixed point < floats < interval arithmetic

You can make the same argument about all these types. Moreover, any float can be accurately represented as a rational number.
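
For instance, any finite float is mantissa * 2^exponent, so extracting the rational is mechanical (a sketch; num is an illustrative name):

    import std.math : frexp;

    void main()
    {
        float f = 0.1f;
        int exp;
        real m = frexp(f, exp);                 // f == m * 2.0L^^exp, 0.5 <= |m| < 1
        long num = cast(long)(m * (1L << 24));  // the 24-bit float mantissa as an integer
        // f is exactly num / 2.0L^^(24 - exp): a ratio of two integers
    }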

> (f + 1) and  (i +  1)  have totally different semantics.

Not if you view floats as a single sample on a real interval. You can compute this interval at CT and sample a float from it.

If you are speaking of iterative methods, sure, it might not converge. But that is not unique to floats; it happens with ints vs uints too.

> Well, it's not a small number of differences. Almost every operation is different. Maybe all of them. I can't actually think of a single operation where the semantics are the same for integers and floating point.

Double can emulate 32-bit ints. Fixed point is essentially subnormal floats with a limited exponent. Fixed point IS integer math. All int types are fixed point. If you find a clean way to support transparent use of fixed point, you probably also resolve the issues with floats.
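
To make the first claim concrete (a sketch; the product must stay below 2^^53 for the double to remain exact):

    void main()
    {
        int a = 100_000, b = 100_000;
        double p = cast(double) a * b;            // exact: 10_000_000_000 < 2^^53
        assert(cast(long) p == 10_000_000_000L);  // the 32-bit wrap never happens
    }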

> I think that unfortunately, it's a quest that is doomed to fail. Producing generic code that works for both floats and ints is a fool's errand.

Of course not. Not if the semantic analysis deals with precision and value ranges. Not trivial, but not impossible either.
July 24, 2014
On 07/23/14 09:16, Don via Digitalmars-d wrote:
> On Monday, 21 July 2014 at 21:10:43 UTC, Artur Skawina via Digitalmars-d wrote:

>> Actually, C/C++ could get away with treating overflow during constant
>> folding as an error (or at least emitting a warning) because of the
>> lack of CTFE (and no templates in C's case). The code will either
>> compile or it won't.
>> For D that is not possible -- if an expression is valid at run-time
>> then it should be valid at compile-time
> 
> Why do you think that? There are many cases where that is not true. Comparing pointers to two unrelated objects will work at runtime, but causes an error in CTFE. You can read global variables at runtime, not in CTFE. Etc.

Obviously, any allowed operation must still yield a meaningful result. The CTFE restrictions you list could actually be relaxed. E.g. __gshared object pointers could be (more) exposed; static immutable objects already are exposed, and it would be possible to give read-only access to mutable ones too. (Not that I'm suggesting doing that.)

But the context was _integer arithmetic_ expressions -- those need to
work at CT exactly like they do at RT. Anything else would mean that
CTFE would be crippled; either some (sub-)programs wouldn't be usable
at CT, or they would give different results. A compromise that would
disallow just some "obviously" wrong expressions is not a real option,
because D is too powerful (the "wrong" code could itself come from CTFE).
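
To illustrate that last point -- a sketch of how the "wrong" expression can itself be manufactured at CT, out of reach of any syntactic check:

    enum expr = "100_000 * 100_000"; // source text built at compile time
    enum v = mixin(expr);            // the overflow only appears here, inside CTFE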

Sure, it would be /technically/ possible to trap on overflow at CT, but the result wouldn't be sane. And it would make the language even more complex. Just imagine all the extra posts asking why something doesn't work at CT. And if overflow is allowed in certain situations, then also all the complaints about the compiler not catching some overflow...

artur