December 12, 2012
Hi :D I saw your D programming language on Google, it looks cool! How different is it from the C programming language?
December 12, 2012
> I implement, say, a Matrix class, then I should be able to tell the
> compiler that certain Matrix expressions, say A*B+A*C, can be factored
> into A*(B+C), and have the optimizer automatically do this for me based
> on what is defined in the type. Or specify that write("a");writeln("b");
> can be replaced by writeln("ab");. But I haven't come up with a good
> generic framework for actually making this implementable yet.

Yeah, it's not that easy; Nimrod uses a hygienic macro system with term-rewriting rules plus side-effect and alias analysis for that ;-).

http://build.nimrod-code.org/docs/trmacros.html

http://forum.nimrod-code.org/t/70

December 12, 2012
On Wednesday, 12 December 2012 at 00:43:39 UTC, H. S. Teoh wrote:
> On Wed, Dec 12, 2012 at 01:26:08AM +0100, foobar wrote:
>> On Wednesday, 12 December 2012 at 00:06:53 UTC, bearophile wrote:
>> >foobar:
>> >
>> >>I would enforce overflow and underflow checking semantics.<
>> >
>> >Plus one or two switches to disable such checking, if/when someone
>> >wants it, to regain the C performance. (Plus some syntax way to
>> >disable/enable such checking in a small piece of code).
>> >
>> >Maybe someday Walter will change his mind about this topic :-)
>
> I don't agree that compiler switches should change language semantics.
> Just because you specify a certain compiler switch, it can cause
> unrelated breakage in some obscure library somewhere, that assumes
> modular arithmetic with C/C++ semantics. And this breakage will in all
> likelihood go *unnoticed* until your software is running on the
> customer's site and then it crashes horribly. And good luck debugging
> that, because the breakage can be very subtle, plus it's *not* in your
> own code, but in some obscure library code that you're not familiar
> with.
>
> I think a much better approach is to introduce a new type (or new types)
> that *does* have the requisite bounds checking and static analysis.
> That's what a type system is for.
>
>
> [...]
>> Yeah, of course, that's why I said the C# semantics are _way_
>> better. (That's a self quote)
>> 
>> btw, here's the link for SML which does not use tagged ints -
>> http://www.standardml.org/Basis/word.html#Word8:STR:SPEC
>> 
>> "Instances of the signature WORD provide a type of unsigned integer
>> with modular arithmetic and logical operations and conversion
>> operations. They are also meant to give efficient access to the
>> primitive machine word types of the underlying hardware, and support
>> bit-level operations on integers. They are not meant to be a
>> ``larger'' int. "
>
> It's kinda too late for D to rename int to word, say, but it's not too
> late to introduce a new checked int type, say 'number' or something like
> that (you can probably think of a better name).
>
> In fact, Andrei describes a CheckedInt type that uses operator
> overloading, etc., to implement an in-library solution to bounds checks.
> You can probably expand that into a workable lightweight int
> replacement. By wrapping an int in a struct with custom operators, you
> can pretty much have an int-sized type (with value semantics, just like
> "native" ints, no less!) that does what you want, instead of the usual
> C/C++ int semantics.
>
>
> T

I didn't say D should change the implementation of integers; in fact I said the exact opposite - that it's probably too late to change the semantics for D. Had D been designed from scratch then yes, I would have advocated for a different design: either the C# one or, as you suggest, going even further and having two distinct types (as in SML), which is even better. But by no means am I suggesting to change D's semantics _now_. Sadly, it's likely too late, and we can only try to paper over it with additional library types. This isn't a perfect solution, since the compiler has builtin knowledge about int and does optimizations that will be lost with a library type.
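The library-type approach Teoh describes above can be sketched with operator overloading. A minimal illustration (in C++ rather than D, for brevity; the type and names here are hypothetical, not Andrei's actual CheckedInt):

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical minimal checked-int wrapper: an int-sized value type
// whose addition throws on overflow instead of silently wrapping.
struct CheckedInt {
    int32_t v;

    CheckedInt operator+(CheckedInt o) const {
        // Do the arithmetic in 64 bits, then verify the result fits.
        int64_t wide = int64_t(v) + int64_t(o.v);
        if (wide > std::numeric_limits<int32_t>::max() ||
            wide < std::numeric_limits<int32_t>::min())
            throw std::overflow_error("integer overflow");
        return CheckedInt{int32_t(wide)};
    }
};
```

It keeps value semantics like a native int; subtraction, multiplication, etc. would follow the same pattern. Whether the optimizer treats such a wrapper as well as a builtin int is exactly the open question in this thread.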
December 12, 2012
On 12/12/2012 2:33 AM, foobar wrote:
> This isn't a perfect solutions
> since the compiler has builtin knowledge about int and does optimizations that
> will be lost with a library type.

See my reply to bearophile about that.
December 12, 2012
On Wednesday, 12 December 2012 at 00:51:19 UTC, Walter Bright wrote:
> On 12/11/2012 3:44 PM, foobar wrote:
>> Thanks for proving my point. after all , you are a C++ developer, aren't you? :)
>
> No, I'm an assembler programmer. I know how the machine works, and C, C++, and D map onto that, quite deliberately. It's one reason why D supports the vector types directly.
>
>
>> Seriously though, it _is_ a trick and a code smell.
>
> Not to me. There is no trick or "smell" to anyone familiar with how computers work.
>
>
>> I'm fully aware that computers used 2's complement. I'm also am aware of the
>> fact that the type has an "unsigned" label all over it. You see it right there
>> in that 'u' prefix of 'int'. An unsigned type should semantically entail _no
>> sign_ in its operations. You are calling a cat a dog and arguing that dogs barf?
>> Yeah, I completely agree with that notion, except, we are still talking about _a
>> cat_.
>
> Andrei and I have endlessly talked about this (he argued your side). The inevitable result is that signed and unsigned types *are* conflated in D, and have to be, otherwise many things stop working.
>
> For example, p[x]. What type is x?
>
> Integer signedness in D is not really a property of the data, it is only how one happens to interpret the data in a specific context.
>
>
>> To answer you question, yes, I would enforce overflow and underflow checking
>> semantics. Any negative result assigned to an unsigned type _is_ a logic error.
>> you can claim that:
>> uint a = -1;
>> is perfectly safe and has a well defined meaning (well, for C programmers that
>> is), but what about:
>> uint a = b - c;
>> what if that calculation results in a negative number? What should the compiler
>> do? well, there are _two_ equally possible solutions:
>> a. The overflow was intended as in the mask = -1 case; or
>> b. The overflow is a _bug_.
>>
>> The user should be made aware of this and should make the decision how to handle
>> this. This should _not_ be implicitly handled by the compiler and allow bugs go
>> unnoticed.
>>
>> I think C# solved this _way_ better than C/D.
>
> C# has overflow checking off by default. It is enabled by either using a checked { } block, or with a compiler switch. I don't see that as "solving" the issue in any elegant or natural way, it's more of a clumsy hack.
>
> But also consider that C# does not allow pointer arithmetic, or array slicing. Both of these rely on wraparound 2's complement arithmetic.
>
>
>> Another data point would be (S)ML
>> which is a compiled language which requires _explicit conversions_ and has a
>> very strong typing system. Its programs are compiled to efficient native
>> executables and the strong typing allows both the compiler and the programmer
>> better reasoning of the code. Thus programs are more correct and can be
>> optimized by the compiler. In fact, several languages are implemented in ML
>> because of its higher guaranties.
>
> ML has been around for 30-40 years, and has failed to catch on.

This is precisely my point: signed and unsigned types are conflated *in D*.
Other languages, namely ML, chose a different design.
ML has two distinct types: word and int. word is for binary data and int is for integer numbers. Words provide efficient access to the machine representation and have no overflow checks; ints represent numbers and do carry overflow checks. You can convert between the two, and the compiler/run-time can carry special knowledge about such conversions in order to provide better optimization.
In ML, array indexing is done with an int, since an index conceptually _is_ a number.

Btw, SML was standardized in '97. I'll also dispute the claim that it hasn't caught on - it has spawned many derived languages, and together they form a family arguably as large as the C family. It has influenced many languages, and it and its derivatives are in active use. One example that comes to mind: the reference implementation of the future version of JavaScript is written in ML. So no, not forgotten, but rather alive and kicking.
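The word/int split described above can be sketched outside of ML as two distinct wrapper types with an explicit, checked conversion between them (C++ used for illustration; `Word`, `Int`, `wsub` and `toInt` are hypothetical names, not SML's actual API):

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Sketch of the ML-style split: Word is raw machine bits with modular
// arithmetic and no checks; Int is a number, and conversions into it
// are explicit and checked.
struct Word { uint32_t bits; };
struct Int  { int32_t  n; };

// Modular subtraction: 0 - 1 wraps to 0xFFFFFFFF, by design.
Word wsub(Word a, Word b) { return Word{a.bits - b.bits}; }

// Explicit, checked conversion from bits to a number.
Int toInt(Word w) {
    if (w.bits > uint32_t(std::numeric_limits<int32_t>::max()))
        throw std::overflow_error("Word does not fit in Int");
    return Int{int32_t(w.bits)};
}
```

The point of the design is that wraparound never happens by accident on the number type: to get modular behaviour you must be holding a Word, and to treat the bits as a number you must convert explicitly.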

December 12, 2012
On Wednesday, 12 December 2012 at 10:35:26 UTC, Walter Bright wrote:
> On 12/12/2012 2:33 AM, foobar wrote:
>> This isn't a perfect solutions
>> since the compiler has builtin knowledge about int and does optimizations that
>> will be lost with a library type.
>
> See my reply to bearophile about that.

Yeah, just saw that :)
So basically you're suggesting implementing Integer and Word library types using compiler intrinsics, as a way to migrate to better, ML-compatible semantics. This is a possible solution, if it can be proven to work.

Regarding performance and overflow checking, the example you give is x86-specific. What about other platforms? For example, ARM is very popular nowadays in the mobile world, and there are many more smart-phones out there than there are PCs. Does the same issue exist there, and if not (I suspect not, but really have no idea), should D be geared towards current platforms or future ones?
December 12, 2012
foobar:

> So basically you're suggesting to implement Integer and Word library types using compiler intrinsics as a way to migrate to better ML compatible semantics.

I think there were no references to ML in that part of Walter's answer.


> Regarding performance and overflow checking, the example you give is x86 specific. What about other platforms? For example ARM is very popular nowadays in the mobile world and there are many more smart-phones out there than there are PCs. Is the same issue exists and if not (I suspect not, but really have no idea) should D be geared towards current platforms or future ones?

Currently DMD (and, to some extent, D itself) is firmly oriented toward x86, with a moderate orientation toward 64 bit. Manu has asked for more attention toward ARM, but (as Andrei has said) maybe finishing const/immutable/shared is more important right now.

Bye,
bearophile
December 12, 2012
> Arithmetic in computers is different from the math you learned in school. It's 2's complement, and it's best to always keep that in mind when writing programs.

From http://embed.cs.utah.edu/ioc/

" Examples of undefined integer overflows we have reported:

    An SQLite bug
    Some problems in SafeInt
    GNU MPC
    PHP
    Firefox
    GCC
    PostgreSQL
    LLVM
    Python

We also reported bugs to BIND and OpenSSL. Most of the SPEC CPU 2006 benchmarks contain undefined overflows."

So how does D improve on C's model? If signed integers are required to wrap around in D (no undefined behaviour), you also prevent some otherwise possible optimizations (there is a reason it's still undefined behaviour in C).
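To make the trade-off concrete: because signed overflow is undefined in C, a compiler may fold a test like `x + 1 > x` to `true`; mandating wraparound forbids that folding. One middle road is explicit checked primitives, sketched here with GCC/Clang's non-standard `__builtin_add_overflow` intrinsic (C++ for illustration; `checkedAdd` is a hypothetical wrapper name):

```cpp
#include <climits>  // INT_MAX, for callers probing the boundary

// Checked addition built on a compiler intrinsic (GCC/Clang only).
// Returns true and stores the sum if it fits in an int; returns
// false on overflow (the intrinsic stores the wrapped value).
bool checkedAdd(int a, int b, int* out) {
    return !__builtin_add_overflow(a, b, out);
}
```

This keeps ordinary arithmetic at full speed and pays for the check only where the programmer asks for it; sanitizer flags such as `-fsanitize=signed-integer-overflow` offer the blanket-checking alternative, at some run-time cost.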
December 12, 2012
Araq:

> So how does D improve on C's model?

There is some range analysis on shorter integral values. But overall it shares the same troubles.


> If signed integers are required to wrap around in D (no undefined behaviour),

I think the D spec doesn't require signed integers to wrap around (so overflow is undefined behaviour).

Bye,
bearophile
December 12, 2012
The machine/hardware has an explicitly defined register size and knows nothing about sign or data type. The fastest operations are unsigned ones that fit the register size.

For example, in your case, an algorithm coded with chained if-checks may become unusable because it will be slow.

And about C#'s checked: http://msdn.microsoft.com/ru-ru/library/74b4xzyw.aspx
By default it applies only to constant expressions; for expressions evaluated at run time it must be enabled explicitly.

I think this checking should be left to the developer, handled through a library or a compiler switch.