January 07, 2012
On 1/7/12 10:10 AM, Don wrote:
> Sorry Andrei, I have to disagree with that in the strongest possible
> terms. I would have mentioned AAs as a very strong argument in the
> opposite direction!
>
> Moving AAs from a built-in to a library type has been an unmitigated
> disaster from the implementation side. And it has so far brought us
> *nothing* in return. Not "hardly anything", but *NOTHING*.

It would be premature to draw conclusions from this, as the conversion is incomplete.

> I don't even
> have any idea of what good could possibly come from it.

Using static calls for hashing and comparisons instead of indirect calls comes to mind.


Andrei
January 07, 2012
On 1/7/12 10:19 AM, Adam D. Ruppe wrote:
> On Saturday, 7 January 2012 at 16:10:32 UTC, Don wrote:
>> Sorry Andrei, I have to disagree with that in the strongest possible
>> terms. I would have mentioned AAs as a very strong argument in the
>> opposite direction!
>
> Amen. AAs are *still* broken from this change.

Well they are broken because the change has not been carried to completion.

I think that baking AAs into the compiler is poor programming language design. There are absolutely no ifs and buts about it, and the matter is obvious enough to me, and sufficiently internalized, that I find it difficult to argue.


Andrei
January 07, 2012
On Saturday, 7 January 2012 at 16:55:00 UTC, Andrei Alexandrescu wrote:
> Well they are broken because the change has not been carried to completion.

Here's my position: if we get a library implementation that
works better than the compiler implementation, let's do it. If they
are equal in use, or the library is worse only in minor syntax or
some other trivial matter, let's go with the library, since that's
indeed nicer.

But, if the library one doesn't work as well as the compiler
implementation, whether due to design, bugs, or any other practical
consideration, let's not break things until the library impl
catches up.

If that's going to take several years, we also have to weigh the
benefit of having it now rather than later.
January 07, 2012
On 01/07/12 17:10, Don wrote:
> On 07.01.2012 04:18, Andrei Alexandrescu wrote:
>> Also consider how the hard-coding of associative arrays in
>> an awkward interface inside the runtime has stifled efficient implementations, progress, and innovation in that area. Still a lot of work needed there, too, to essentially undo a bad decision.
> 
> Sorry Andrei, I have to disagree with that in the strongest possible terms. I would have mentioned AAs as a very strong argument in the opposite direction!
> 
> Moving AAs from a built-in to a library type has been an unmitigated disaster from the implementation side. And it has so far brought us *nothing* in return. Not "hardly anything", but *NOTHING*. I don't even have any idea of what good could possibly come from it. Note that you CANNOT have multiple implementations on a given platform, or you'll get linker errors! So I think there is more pain to come from it.
> It seems to have been motivated by religious reasons and nothing more.
> Why should anyone believe the same argument again?
> 

Reminded me of this: "static immutable string[string] aa = [ "a": "b" ];" isn't currently possible (AA literals are non-const expressions); could this work w/o compiler support?

artur
January 07, 2012
On 1/7/2012 8:10 AM, Don wrote:
> Moving AAs from a built-in to a library type has been an unmitigated disaster
> from the implementation side. And it has so far brought us *nothing* in return.
> Not "hardly anything", but *NOTHING*. I don't even have any idea of what good
> could possibly come from it. Note that you CANNOT have multiple implementations
> on a given platform, or you'll get linker errors! So I think there is more pain
> to come from it.
> It seems to have been motivated by religious reasons and nothing more.
> Why should anyone believe the same argument again?


Having a pluggable interface so the implementation can be changed is all right, as long as the binary API does not change.

If the binary API changes, then of course, two different libraries cannot be linked together. I strongly oppose any changes which would lead to a balkanization of D libraries.

(Consider the disaster C++ has had forever with everyone inventing their own string type. That ensured zero interoperability between C++ libraries, a situation that persists even 10 years after C++ finally acquired a standard string library.)
January 07, 2012
On 1/7/12 12:48 PM, Walter Bright wrote:
> On 1/7/2012 8:10 AM, Don wrote:
>> Moving AAs from a built-in to a library type has been an unmitigated
>> disaster
>> from the implementation side. And it has so far brought us *nothing*
>> in return.
>> Not "hardly anything", but *NOTHING*. I don't even have any idea of
>> what good
>> could possibly come from it. Note that you CANNOT have multiple
>> implementations
>> on a given platform, or you'll get linker errors! So I think there is
>> more pain
>> to come from it.
>> It seems to have been motivated by religious reasons and nothing more.
>> Why should anyone believe the same argument again?
>
>
> Having a pluggable interface so the implementation can be changed is all
> right, as long as the binary API does not change.
> If the binary API changes, then of course, two different libraries
> cannot be linked together. I strongly oppose any changes which would
> lead to a balkanization of D libraries.

In my opinion this statement is thoroughly wrong and backwards. I also think it reflects a misunderstanding of what my stance is. Allow me to clarify how I see the situation.

Currently built-in hash table use generates special-cased calls to non-template functions implemented surreptitiously in druntime. The underlying theory, also sustained by the statement quoted above, is that we are interested in supporting linking together object files and libraries BUILT WITH DISTINCT MAJOR RELEASES OF DRUNTIME.

There is zero interest in that. ZERO. No language even attempts to do so. Runtimes that are incompatible with their previous versions are common, frequent, and well understood as an issue.

In an ideal world, built-in hash tables should work in a very simple manner. The compiler lowers all special hashtable syntax - in a manner that's MINIMAL, SIMPLE, and CLEAR - into D code that resolves to use of object.di (not some random user-defined library!). From then on, druntime code takes over. It could choose to use templates, dynamic type info, whatever. It's NOT the concern of the compiler. The compiler has NO BUSINESS taking library code and hardwiring it in for no good reason.

This setup allows static and dynamic linking of libraries, as long as the runtimes they were built with are compatible. This is expected, by design, and a good thing.

> (Consider the disaster C++ has had forever with everyone inventing their
> own string type. That insured zero interoperability between C++
> libraries, a situation that persists even for 10 years after C++ finally
> acquired a standard string library.)

It is exactly this kind of canned statement and prejudice that we must avoid. It unfairly singles out C++ when there also exist incompatible libraries in C, Java, Python, you name it.

Also, the last time the claim that everyone invented their own string type could have been credibly aired was around 2004.

What's built inside the compiler is like axioms in math, and what's library is like theorems supported by the axioms. A good language, just like a good mathematical system, has few axioms and many theorems. That means the system is coherent and expressive. Hardwiring stuff in the language definition is almost always a failure of the expressive power of the language. Sometimes it's fine to just admit it and hardwire inside the compiler e.g. the prior knowledge that "+" on int does modulo addition. But almost always it's NOT, and definitely not in the context of a complex data structure like a hash table. I also think that adding a hecatomb of built-in types and functions has smells, though to a good extent I concede to the necessity of it.

We should start from what the user wants to accomplish. Then figure how to express that within the language. And only lastly, when needed, change the language to mandate lowering constructs to the MINIMUM EXTENT POSSIBLE into constructs that can be handled within the existing language. This approach has been immensely successful virtually whenever we applied it: foreach for ranges (though there's work left to do there), operator overloading, and too little with hashes. Lately I see a sort of getting lazy and skipping the second pass entirely. Need something? Yeah, what the hell, we'll put it in the language.

I am a bit worried about the increasing radicalization of the discussion here, but recent statements come in frontal collision with my core principles, which I think stand on solid evidential ground. I am appealing for building consensus and staying principled instead of reaching for the cheap solution. If we do the latter, it's quite likely we'll regret it later.


Andrei
January 08, 2012
On 1/7/2012 1:28 PM, Andrei Alexandrescu wrote:
>> Having a pluggable interface so the implementation can be changed is all
>> right, as long as the binary API does not change.
>> If the binary API changes, then of course, two different libraries
>> cannot be linked together. I strongly oppose any changes which would
>> lead to a balkanization of D libraries.
>
> In my opinion this statement is thoroughly wrong and backwards. I also think it
> reflects a misunderstanding of what my stance is. Allow me to clarify how I see
> the situation.
>
> Currently built-in hash table use generates special-cased calls to non-template
> functions implemented surreptitiously in druntime. The underlying theory, also
> sustained by the statement quoted above, is that we are interested in supporting
> linking together object files and libraries BUILT WITH DISTINCT MAJOR RELEASES
> OF DRUNTIME.
>
> There is zero interest for that. ZERO. No language even attempts to do so.
> Runtimes that are not compatible with their previous versions are common,
> frequent, and well understood as an issue.

We've agreed on this before; perhaps I misstated it here. I am not talking about changing druntime. I'm talking about someone providing their own hash table implementation with a different binary API than the one in druntime, such that code from their library cannot be linked with any other code that uses the regular hashtable.

A different implementation of hashtable would be fine, as long as it is binary compatible. We did this when we switched from a binary tree collision resolution to a linear one, and the switchover went without a hitch because it did not require even a recompile of existing binaries.


> In an ideal world, built-in hash tables should work in a very simple manner. The
> compiler lowers all special hashtable syntax - in a manner that's MINIMAL,
> SIMPLE, and CLEAR - into D code that resolves to use of object.di (not some
> random user-defined library!). From then on, druntime code takes over. It could
> choose to use templates, dynamic type info, whatever. It's NOT the concern of
> the compiler. The compiler has NO BUSINESS taking library code and hardwiring it
> in for no good reason.

That was already true of the hashtables - it's just that the interface to them was through a set of fixed function calls, rather than a template interface. To the compiler, the hashtables were a completely opaque void*. The compiler had zero knowledge of how they actually were implemented inside the runtime.

Changing it to a template implementation enables a more efficient interface, as inlining, etc., can be done instead of the slow opApply() interface. The downside of that is it becomes a bit perilous, as the binary API is not so flexible anymore.


>> (Consider the disaster C++ has had forever with everyone inventing their
>> own string type. That insured zero interoperability between C++
>> libraries, a situation that persists even for 10 years after C++ finally
>> acquired a standard string library.)
>
> It is exactly this kind of canned statement and prejudice that we must avoid. It
> unfairly singles out C++ when there also exist incompatible libraries in C,
> Java, Python, you name it.

Of course, but strings are a fundamental data type, and so it was worse with C++. I don't agree that my opinion on it is prejudicial or unfair, because many times I was stuck having to glue together disparate code that used differing string classes. Often, that was the only incompatibility, but it permeated the library interfaces.

> Also, the last time the claim that everywhere invented their own string type
> could have been credibly aired was around 2004.

Sure, people rarely (never?) do their own C++ string classes anymore, but that old code and those old libraries are still around, and are actively maintained.

http://msdn.microsoft.com/en-us/library/ms174288.aspx

Notice that's for Visual Studio C++ 2010.

The string problem was a mistake I was determined not to make with D.

I have agreed with you and still agree with the notion of using lowering instead of custom code. Also, keep in mind that the hashtable design was done long before D even had templates. It was "lowered" to what D had at the time - function calls and opApply.



> What's built inside the compiler is like axioms in math, and what's library is
> like theorems supported by the axioms. A good language, just like a good
> mathematical system, has few axioms and many theorems. That means the system is
> coherent and expressive. Hardwiring stuff in the language definition is almost
> always a failure of the expressive power of the language.

True.

> Sometimes it's fine to
> just admit it and hardwire inside the compiler e.g. the prior knowledge that "+"
> on int does modulo addition.

Right, I understand that the abstraction abilities of D are not good enough to produce a credible 'int' type, or 'float', etc., hence they are wired in.

> But most always it's NOT, and definitely not in the
> context of a complex data structure like a hash table. I also think that adding
> a hecatomb of built-in types and functions has smells, though to a good extent I
> concede to the necessity of it.

I want to reiterate that I don't think there is a way with the current compiler technology to make a library SIMD type that will perform as well as a builtin one, and those who use SIMD tend to be extremely demanding of performance.

(One could make a semantic equivalent, but not a performance equivalent.)


> We should start from what the user wants to accomplish. Then figure how to
> express that within the language. And only lastly, when needed, change the
> language to mandate lowering constructs to the MINIMUM EXTENT POSSIBLE into
> constructs that can be handled within the existing language. This approach has
> been immensely successful virtually whenever we applied it: foreach for ranges
> (though there's work left to do there), operator overloading, and too little
> with hashes. Lately I see a sort of getting lazy and skipping the second pass
> entirely. Need something? Yeah, what the hell, we'll put it in the language.

I don't think that is entirely fair in regards to the SIMD stuff. It reminds me of the time after I'd spent a couple of years at Caltech, where every class was essentially a math class. My sister asked me for help with her high school trig homework, and I just glanced at it and wrote down all the answers. She said she was supposed to show the steps involved, but I was so used to doing it that to me there was only one step.

So while it may seem I'm skipping steps with the SIMD, I have been thinking about it for years off and on, and I have a fair experience with what needs to be done to generate good code.



>
> I am a bit worried about the increasing radicalization of the discussion here,
> but recent statements come in frontal collision with my core principles, which I
> think stand on solid evidential ground. I am appealing for building consensus
> and staying principled instead of reaching for the cheap solution. If we do the
> latter, it's quite likely we'll regret it later.
>
>
> Andrei

January 08, 2012
Walter:

> I don't think that is entirely fair in regards to the SIMD stuff. It reminds me of after I spent a couple years at Caltech, where every class was essentially a math class.

I think that in several (but not all) fields of science and technology your limits are often determined by how much mathematics you know, in depth and especially in variety :-) Unfortunately a lot of people don't seem able or willing to learn it...

Bye,
bearophile
January 08, 2012
On 7/01/12 9:28 PM, Andrei Alexandrescu wrote:
> What's built inside the compiler is like axioms in math, and what's
> library is like theorems supported by the axioms. A good language, just
> like a good mathematical system, has few axioms and many theorems. That
> means the system is coherent and expressive. Hardwiring stuff in the
> language definition is almost always a failure of the expressive power
> of the language.

Yes, but when it comes to register allocation and platform specific instruction selection, that really is the job of the compiler. It is not something that can be done in a library (without rewriting the compiler in the language, which defeats the purpose of having a language in the first place).

I agree that the language should add the minimum number of features to support what we want, although in this case (due to how platform-specific the solutions are) I think it simply requires a lot of work in the compiler.


> We should start from what the user wants to accomplish. Then figure how
> to express that within the language. And only lastly, when needed,
> change the language to mandate lowering constructs to the MINIMUM EXTENT
> POSSIBLE into constructs that can be handled within the existing
> language.

I agree.

Essentially, we need at least:

- Some type (or types) that map directly to SIMD registers.
- The type must be separate from static arrays (aligned or not).
- Automatic register allocation, just like other primitive types.
- Automatic instruction scheduling.
- Ability to specify what instructions to use.

I agree with Manu that we should just have a single type like __m128 in MSVC. The other types and their conversions should be solvable in a library with something like strong typedefs.

As the *sole* reason for this enhancement is performance, the compiler absolutely must have all the information it needs to produce optimal code.


> I am a bit worried about the increasing radicalization of the discussion
> here, but recent statements come in frontal collision with my core
> principles, which I think stand on solid evidential ground. I am
> appealing for building consensus and staying principled instead of
> reaching for the cheap solution. If we do the latter, it's quite likely
> we'll regret it later.

We also need to be pragmatic. There is no point defining a perfect, modular, clean solution to the problem if it is going to take years to realize. In a few years, the problem may not exist anymore. This is especially true when it comes to hardware issues like the one we are discussing here.


January 08, 2012
On 8/01/12 12:14 AM, Walter Bright wrote:
> On 1/7/2012 1:28 PM, Andrei Alexandrescu wrote:
>> But most always it's NOT, and definitely not in the
>> context of a complex data structure like a hash table. I also think
>> that adding
>> a hecatomb of built-in types and functions has smells, though to a
>> good extent I
>> concede to the necessity of it.
>
> I want to reiterate that I don't think there is a way with the current
> compiler technology to make a library SIMD type that will perform as
> well as a builtin one, and those who use SIMD tend to be extremely
> demanding of performance.

Considering that the entire purpose of SIMD is performance, I think the demand is reasonable :-)