December 12, 2012
On Wed, Dec 12, 2012 at 02:15:24AM +0100, bearophile wrote:
> H. S. Teoh:
> 
> >Just because you specify a certain compiler switch, it can cause unrelated breakage in some obscure library somewhere, that assumes modular arithmetic with C/C++ semantics.
> 
> The idea was about two switches, one for signed integrals, and the other for both signed and unsigned. But from other posts I guess Walter doesn't think this is a viable possibility.

Two switches is even worse than one. The problem is that existing code assumes certain kinds of behaviour from int, uint, etc. Such code may exist in common libraries imported by your code (directly or indirectly). Now you compile your code with a switch (or two switches) that modifies the behaviour of int, and things start to break. Even worse, if you only use the switches on certain critical source files, then you may end up with incompatible behaviour of the same library code in the same executable (e.g. a template got instantiated once with the switches enabled, once without). It leads to all kinds of inconsistencies and subtle breakages that totally outweigh whatever benefits it may have had.


> So the solutions I see now are to stop using D for some kinds of more important programs, or to use some kind of safeInt, and then work with the compiler writers to allow user-defined structs to be usable as naturally as possible as ints (and possibly efficiently).

It's not too late to add a new native type (or types) to the language that support this kind of checking. I see that as the best solution to this issue. Don't mess with the existing types, because too much already depends on them. Add a new type that has the desired behaviour.

You may have a hard time convincing Walter to put it in, though.


> Regarding safeInt I think today there is no way to write it efficiently in D, because the overflow flags are not accessible from D, and if you use inlined asm, you lose inlining in DMD. This is just one of the problems. The other problems are syntax incompatibilities of user-defined structs compared to built-in ints. Other problems are the probable lack of high-level optimizations done on such user defined type.
[...]

These are implementation issues that we can work on improving. For one thing, I'd love to see D get closer to the point where the distinction between built-in types and user-defined types is gone. We may never actually reach that point, but the closer we get, the better. This will let us solve a lot of things, like drop-in replacements for AA's, etc., that are a bit ugly to do today.

One thing I've always thought about is a way for user-defined types to specify sub-expression optimizations that the compiler can apply. Basically, if I implement, say, a Matrix class, then I should be able to tell the compiler that certain Matrix expressions, say A*B+A*C, can be factored into A*(B+C), and have the optimizer automatically do this for me based on what is defined in the type. Or specify that write("a");writeln("b"); can be replaced by writeln("ab");. But I haven't come up with a good generic framework for actually making this implementable yet.


T

-- 
I don't trust computers, I've spent too long programming to think that they can get anything right. -- James Miller
December 12, 2012
On 12/11/2012 5:15 PM, bearophile wrote:
> Regarding safeInt I think today there is no way to write it efficiently in D,
> because the overflow flags are not accessible from D, and if you use inlined
> asm, you lose inlining in DMD. This is just one of the problems.

The way to deal with this is to examine the implementation of CheckedInt, and design a couple of compiler intrinsics to use in its implementation that will eliminate the asm code. (This is how the high level vector library Manu is implementing is done.)
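
A minimal sketch of the intrinsic-based approach, written in C++ because the thread shows no D implementation. It assumes the GCC/Clang builtins `__builtin_add_overflow` and `__builtin_mul_overflow` as stand-ins for the intrinsics proposed here. The `CheckedInt` type below is hypothetical, not the D type under discussion, and throwing on overflow is only one possible policy.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical CheckedInt: a 32-bit signed integer whose + and * report
// overflow instead of silently wrapping. No inline asm is needed because
// the compiler builtins (GCC/Clang specific) expose the overflow result.
struct CheckedInt {
    std::int32_t value;

    friend CheckedInt operator+(CheckedInt a, CheckedInt b) {
        std::int32_t r;
        // The builtin returns true when the mathematical result does not fit in r.
        if (__builtin_add_overflow(a.value, b.value, &r))
            throw std::overflow_error("CheckedInt: addition overflowed");
        return CheckedInt{r};
    }

    friend CheckedInt operator*(CheckedInt a, CheckedInt b) {
        std::int32_t r;
        if (__builtin_mul_overflow(a.value, b.value, &r))
            throw std::overflow_error("CheckedInt: multiplication overflowed");
        return CheckedInt{r};
    }
};
```

Because the optimizer understands the builtins, a wrapper like this can still be inlined, which is the property the asm-based version loses.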


> The other problems are syntax incompatibilities of user-defined structs compared to
> built-in ints.

This is not an issue.

> Other problems are the probable lack of high-level optimizations
> done on such user defined type.

Using intrinsics deals with this issue nicely, as the optimizer knows about them.

> We are very far from a good solution to such problems.

No, we are not.

The problem, as I see it, is nobody actually cares about this. Why would I say something so provocative? Because I've seen D programmers go to herculean lengths to get around problems they are having in the language. These efforts make a strong case that they need better language support (UDAs are a primo example of this). I see nobody bothering to write a CheckedInt type and seeing how far they can push it, even though writing such a type is not a significant challenge; it's a bread-and-butter job.

Also, as I said before, there is a SafeInt class in C++. So far as I can tell, nobody uses it.

Want to prove me wrong? Implement such a user defined type, and demonstrate user interest in it.

(Also note the HalfFloat class I implemented for Manu, as a demonstration of how a user defined type can implement a floating point type that is unknown to the compiler.)
December 12, 2012
On 12/11/2012 5:05 PM, bearophile wrote:
> Walter Bright:
>
>> ML has been around for 30-40 years, and has failed to catch on.
>
> OCaml, Haskell, F#, and so on are all languages derived more or less directly
> from ML, that share many of its ideas. Has Haskell caught on? :-)

Haskell is the language that everyone loves to learn and talk about, and few actually use.

And it's significantly slower than D, in unfixable ways.

December 12, 2012
Walter Bright:

> The way to deal with this is to examine the implementation of CheckedInt, and design a couple of compiler intrinsics to use in its implementation that will eliminate the asm code.

OK, good. I didn't think of this option.


> Using intrinsics deals with this issue nicely, as the optimizer knows about them.

OK.


> The problem, as I see it, is nobody actually cares about this.

Maybe you are right. I think I have never said there are a lot of people caring about this in D :-)

Bye,
bearophile
December 12, 2012
On Wed, Dec 12, 2012 at 8:14 AM, Walter Bright <newshound2@digitalmars.com> wrote:

> (This is how the high level vector library Manu is implementing is done.)
>

Greetings

Where can I learn more about this library Manu is developing?

regards
- Puneet


December 12, 2012
On Wednesday, 12 December 2012 at 04:42:57 UTC, d coder wrote:
> On Wed, Dec 12, 2012 at 8:14 AM, Walter Bright
> <newshound2@digitalmars.com> wrote:
>
>> (This is how the high level vector library Manu is implementing is done.)
>>
>
> Greetings
>
> Where can I learn more about this library Manu is developing?
>
> regards
> - Puneet

The code is at https://github.com/TurkeyMan/phobos

It doesn't have anything to do with checked integers, though - Walter was just using it as an example of an approach that we could also use with checked integers.
December 12, 2012
On 12/11/2012 8:42 PM, d coder wrote:
> Where can I learn more about this library Manu is developing?

Manu posts here; reply to him!

December 12, 2012
> The problem, as I see it, is nobody actually cares about this. Why would I say something so provocative? Because I've seen D programmers go to herculean lengths to get around problems they are having in the language. These efforts make a strong case that they need better language support (UDAs are a primo example of this). I see nobody bothering to write a CheckedInt type and seeing how far they can push it, even though writing such a type is not a significant challenge; it's a bread-and-butter job.

I disagree with the analysis. I do want overflow detection, yet I would not use a CheckedInt in D for the same reason I do not usually use one in C++: without compiler support, it is too expensive to detect overflow. In my C++ I have a lot of math to do, and I'm using C++ because it's faster than C# which I would otherwise prefer. Constantly checking for overflow without hardware support would kill most of the performance advantage, so I don't do it.

I do use "clipped conversion" though: e.g. ClippedConvert<short>(40000)==32767. I can afford the overhead in this case because I don't do type conversions as often as addition, bit shifts, etc.
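
The post only shows the call `ClippedConvert<short>(40000)==32767`, not the function itself. The following is a hypothetical reconstruction of such a saturating conversion, restricted (as an assumption made for simplicity) to integer types whose values all fit in `long long`.

```cpp
#include <limits>
#include <type_traits>

// Hypothetical ClippedConvert: converts x to To, clamping to To's range
// instead of wrapping. Only the name and the example call come from the
// thread; the implementation is an illustrative guess.
template <typename To, typename From>
To ClippedConvert(From x) {
    static_assert(std::is_integral<To>::value && std::is_integral<From>::value,
                  "ClippedConvert sketch handles integer types only");
    static_assert(sizeof(To) < sizeof(long long),
                  "target type must be narrower than long long");
    static_assert(std::is_signed<From>::value || sizeof(From) < sizeof(long long),
                  "source values must be representable as long long");

    // Widen everything to long long so the comparisons are exact.
    const long long wide = static_cast<long long>(x);
    const long long lo = static_cast<long long>(std::numeric_limits<To>::min());
    const long long hi = static_cast<long long>(std::numeric_limits<To>::max());

    if (wide < lo) return std::numeric_limits<To>::min();
    if (wide > hi) return std::numeric_limits<To>::max();
    return static_cast<To>(wide);
}
```

The cost is two comparisons per conversion, which is affordable when conversions are rare compared with additions and shifts.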

The C# solution is not good enough either. C# throws exceptions on overflow, which is convenient but is bad for performance if it happens regularly; it can also make a debugger almost unusable. Some sort of mechanism that works like an exception, but faster, would probably be better. Consider:

result = a * b + c * d;

If a * b overflows, there is probably no point in executing c * d so it may as well jump straight to a handler; on the other hand, the exception mechanism is costly, especially if the debugger is hooked in and causes a context switch every single time it happens. So... I dunno. What's the best semantic for an overflow detector?
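
One possible semantic, sketched here only as an illustration, is to report overflow through a return value and bail out at the first failure, so that no exception machinery is involved. The helper name `checked_mul_add` is hypothetical, and the code assumes the GCC/Clang overflow builtins.

```cpp
#include <cstdint>

// Hypothetical helper: computes a*b + c*d into result.
// Returns false at the first overflow; result is meaningful only when
// the function returns true.
bool checked_mul_add(std::int32_t a, std::int32_t b,
                     std::int32_t c, std::int32_t d,
                     std::int32_t& result) {
    std::int32_t ab;
    std::int32_t cd;
    if (__builtin_mul_overflow(a, b, &ab))
        return false;  // a*b overflowed, so c*d is never evaluated
    if (__builtin_mul_overflow(c, d, &cd))
        return false;
    // The final addition can overflow too, even if both products fit.
    return !__builtin_add_overflow(ab, cd, &result);
}
```

The caller decides what an overflow means (retry with a wider type, clamp, or report an error), at the price of one branch per operation.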
December 12, 2012
On 12/11/2012 9:51 PM, David Piepgrass wrote:
>> The problem, as I see it, is nobody actually cares about this. Why would I say
>> something so provocative? Because I've seen D programmers go to herculean
>> lengths to get around problems they are having in the language. These efforts
>> make a strong case that they need better language support (UDAs are a primo
>> example of this). I see nobody bothering to write a CheckedInt type and seeing
>> how far they can push it, even though writing such a type is not a significant
>> challenge; it's a bread-and-butter job.
>
> I disagree with the analysis. I do want overflow detection, yet I would not use
> a CheckedInt in D for the same reason I do not usually use one in C++: without
> compiler support, it is too expensive to detect overflow. In my C++ I have a lot
> of math to do, and I'm using C++ because it's faster than C# which I would
> otherwise prefer. Constantly checking for overflow without hardware support
> would kill most of the performance advantage, so I don't do it.

You're not going to get performance with overflow checking even with the best compiler support. For example, much arithmetic code is generated for the x86 using addressing mode instructions, like:

    LEA EAX,16[8*EBX][ECX]  for 16+8*b+c

The LEA instruction does no overflow checking. If you wanted it, the best code would be:

    MOV EAX,16
    IMUL EBX,8
    JO overflow
    ADD EAX,EBX
    JO overflow
    ADD EAX,ECX
    JO overflow

Which is considerably less efficient. (The LEA is designed to run in one cycle.) Plus, more registers are often modified, which impedes good register allocation.
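
The same contrast can be sketched at the source level. This is C++ with the GCC/Clang overflow builtins, and both function names are hypothetical. The unchecked version is a single expression of the kind the post says can be emitted as one LEA; the checked version needs a test after every step.

```cpp
#include <cstdint>

// Unchecked 16 + 8*b + c. Unsigned types are used so that wraparound is
// well defined in C++; the result silently wraps modulo 2^32.
std::uint32_t lea_style_unchecked(std::uint32_t b, std::uint32_t c) {
    return 16u + 8u * b + c;
}

// Checked 16 + 8*b + c. Each operation is followed by its own overflow
// test, mirroring the IMUL/JO, ADD/JO, ADD/JO sequence above.
// out is meaningful only when the function returns true.
bool checked_16_8b_c(std::int32_t b, std::int32_t c, std::int32_t& out) {
    std::int32_t t;
    if (__builtin_mul_overflow(b, 8, &t))
        return false;
    if (__builtin_add_overflow(t, 16, &t))
        return false;
    return !__builtin_add_overflow(t, c, &out);
}
```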

This is why performance languages do not do overflow checking, and why C# only does it under duress. It is not a conspiracy of pig-headed language developers :-)


> I do use "clipped conversion" though: e.g. ClippedConvert<short>(40000)==32767.
> I can afford the overhead in this case because I don't do type conversions as
> often as addition, bit shifts, etc.

You can't have both performant code and overflow detection.


> The C# solution is not good enough either. C# throws exceptions on overflow,
> which is convenient but is bad for performance if it happens regularly; it can
> also make a debugger almost unusable. Some sort of mechanism that works like an
> exception, but faster, would probably be better. Consider:
>
> result = a * b + c * d;
>
> If a * b overflows, there is probably no point in executing c * d so it may as
> well jump straight to a handler; on the other hand, the exception mechanism is
> costly, especially if the debugger is hooked in and causes a context switch
> every single time it happens. So... I dunno. What's the best semantic for an
> overflow detector?

If you desire overflows to be programming errors, then you want an abort, not a thrown exception. I am perplexed by your desire to continue execution when overflows happen regularly.
December 12, 2012
David Piepgrass:

> I do want overflow detection, yet I would not use a CheckedInt in D for the same reason I do not usually use one in C++: without compiler support, it is too expensive to detect overflow.

Here I have listed several problems in a library-defined SafeInt, but Walter has expressed willingness to introduce intrinsics, to give some compiler support, so it's a start of a solution:

http://forum.dlang.org/thread/jhkbsghxjmdrxoxaevzm@forum.dlang.org

Bye,
bearophile