September 29, 2020
On Tuesday, 29 September 2020 at 03:14:34 UTC, Andrei Alexandrescu wrote:
> Or take the foreach statement. Through painful trial and error, somebody figured out all possible shapes of foreach, and defined `each` to support most:
>
> https://github.com/dlang/phobos/blob/master/std/algorithm/iteration.d#L904
>
> What should have been a simple forwarding problem took 190 lines that could be characterized as very involved. And mind you, it doesn't capture all cases because per https://github.com/dlang/phobos/blob/master/std/algorithm/iteration.d#L1073:
>
> // opApply with >2 parameters. count the delegate args.
> // only works if it is not templated (otherwise we cannot count the args)
>
> I know that stuff (which will probably end on my forehead) because I went and attempted to improve things a bit in https://github.com/dlang/phobos/pull/7638/files#diff-0d6463fc6f41c5fb774300832e3135f5R805, which attempts to simplify matters by reducing the foreach cases to seven shapes.
>
> To paraphrase Alan Perlis: "If your foreach has seven possible shapes, you probably missed some."
>
> Over time, things did get better. We became adept of things such as lowering, and we require precision in DIPs.

I think this is a mischaracterization of the problem. I'm not really a fan of how ranges are implemented, especially how UFCS is used to chain functions together. When a chain grows to 10+ lines it becomes almost impossible to profile with tools, let alone understand: you'd need to know how each function is implemented, how it expands, and how it ends up being executed. Some things are lazy, some aren't; e.g., needing `.array` after a `.map`. This is where I feel there shouldn't even be an `each` to begin with; just use a for loop.
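As an illustration of that laziness split, here's a minimal sketch (variable names are my own) showing that `map` produces a lazy range and `.array` is what forces evaluation:

```d
import std.algorithm.iteration : map;
import std.array : array;

void main()
{
    // map does no work yet; it returns a lazy MapResult range
    auto lazyDoubled = [1, 2, 3].map!(x => x * 2);

    // .array walks the range and allocates an int[]
    int[] eager = lazyDoubled.array;
    assert(eager == [2, 4, 6]);
}
```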

Now, that's my opinion. I deal with low-level code that needs to be optimized, and those UFCS chains feel akin to scripting languages that usually don't need to worry about optimization. For short chains I don't think it's that big of a deal: it's usually easy enough to figure out what is going on, and not enough is happening for too much to go wrong.

> And mind you, it doesn't capture all cases because per https://github.com/dlang/phobos/blob/master/std/algorithm/iteration.d#L1073:
>
> // opApply with >2 parameters. count the delegate args.
> // only works if it is not templated (otherwise we cannot count the args)

If you use a templated `opApply()` you lose type inference (as you do with regular templates); it's a limitation of the feature. You'd have the same problem with overloads of `opApply()` whose delegates take the same number of arguments: the compiler won't know which one to pick. If you want to support both you'd just need one other case, but honestly, a templated `opApply` is just bad practice (because it loses type inference) and doesn't need to be supported. The only reason I've ever wanted a templated `opApply()` is for automatic attribute inference, but that's not worth losing the type inference you get with `foreach`. There was a suggestion for adding a new feature, or rather altering the behavior to allow for both, but I don't think it was well received.
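A small sketch of the inference difference (hypothetical types, names mine): with a concrete `opApply` signature, `foreach` can read the element type off the delegate, while a templated `opApply` forces the caller to spell it out:

```d
struct Plain
{
    // Concrete delegate type: foreach infers the element type from it.
    int opApply(scope int delegate(ref int) dg)
    {
        int x = 42;
        return dg(x);
    }
}

struct Templated
{
    // Templated: the compiler cannot count or infer the delegate's args.
    int opApply(DG)(scope DG dg)
    {
        int x = 42;
        return dg(x);
    }
}

void main()
{
    foreach (v; Plain())         // `v` inferred as int
        assert(v == 42);

    foreach (int v; Templated()) // the type must be written explicitly
        assert(v == 42);
}
```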


> I know that stuff (which will probably end on my forehead) because I went and attempted to improve things a bit in https://github.com/dlang/phobos/pull/7638/files#diff-0d6463fc6f41c5fb774300832e3135f5R805, which attempts to simplify matters by reducing the foreach cases to seven shapes.

You're going about simplifying the code the wrong way. Why not add an `opApply()` to the internal container types (array, associative array, etc.)? It shouldn't be that difficult. Cases 4, 5, 6, and 7 would then collapse into one.

Cases 1, 2, and 3 aren't really different ways of using foreach; they are just conveniences for ranges. That tuple expansion is odd, though: it doesn't seem to be documented, and it's a feature that `foreach` itself doesn't have (unlike the index as a second parameter). It doesn't need to be there, but someone probably relies on it somewhere by now.
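For reference, the tuple expansion in question: `each` unpacks a range of tuples into multiple lambda parameters (a small sketch):

```d
import std.algorithm.iteration : each;
import std.range : zip;

void main()
{
    int sum;
    // Each element of zip(...) is a Tuple!(int, int); `each` expands it
    // into the two lambda parameters a and b.
    zip([1, 2], [10, 20]).each!((a, b) { sum += a + b; });
    assert(sum == 33);
}
```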

So really it's just two cases, with an additional case to support an index counter. That fits the bill: case 1, `foreach`; case 2, Phobos ranges.

It's not as bad as you make it out to be, for this case.
September 29, 2020
On Tuesday, 29 September 2020 at 09:07:39 UTC, claptrap wrote:
> How do you decide what is an essential language primitive? Or what is necessary syntactic sugar? I mean syntactic sugar almost by definition isnt necessary, it's just window dressing.

The for loop and the while loop are syntactic sugar in C-like languages.
So clearly syntactic sugar is more than window dressing.
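As a concrete illustration, a `for` loop and its hand-desugared `while` form (a sketch in D, which follows the C lowering):

```d
void main()
{
    int sumFor = 0;
    for (int i = 0; i < 4; ++i)
        sumFor += i;

    // The same loop with the sugar removed: init, while (cond), body, increment.
    int sumWhile = 0;
    {
        int i = 0;
        while (i < 4)
        {
            sumWhile += i;
            ++i;
        }
    }
    assert(sumFor == 6 && sumFor == sumWhile);
}
```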

The usual way to define a minimal imperative language is to have concepts like "block" and "restart block" (which covers both looping and continue) and conditionals.

You can essentially do away with most concepts until you just have a closure with coroutine capabilities plus one conditional instruction. Two such tiny imperative languages are Self and Beta. So you basically merge the concepts of a block and an object as well. A constructor/destructor pair is then just a single coroutine that has suspended.

However, the minimal core language tends to be an on-paper construct for non-research languages: theoretical. The compiler will almost certainly have many more concepts than the minimal language in order to get better compilation times.

September 29, 2020
On Tuesday, 29 September 2020 at 10:03:12 UTC, Ola Fosheim Grøstad wrote:
> conditional instruction. Two such tiny imperative languages are Self and Beta. So you basically merge the concept of a block and an object as well.. A constructor/destructor pair is just a single coroutine that has suspended.

That last sentence is hypothetical. You could construct a language that consists only of concurrent closures. Would it be slow? Only if the optimizer isn't advanced enough; but in reality compiler authors depend on heuristics (common programming patterns), so it is quite difficult to get such languages to perform well in practice.

Compilers are sadly not very good at high level optimization... Which is why languages ship with pre-canned constructs, libraries and runtimes...


September 29, 2020
On Tuesday, 29 September 2020 at 03:14:34 UTC, Andrei Alexandrescu wrote:
> On 9/28/20 6:46 PM, Bruce Carneal wrote:

[..]
>
> Years ago, I was in a panel with Simon Peyton-Jones, the creator of Haskell. Nicest guy around, not to mention an amazing researcher. This question was asked: "What is the most important principle of programming language design?"
>
> His answer was so powerful, it made me literally forget everybody else's answer, including my own. (And there were no slouches in that panel, save for me: Martin Odersky, Guido van Rossum, Bertrand Meyer.) He said the following:
>
> "The most important principle in a language design is to define a small core of essential primitives. All necessary syntactic sugar lowers to core constructs. Do everything else in libraries."
>
Sorry not to be completely convinced of the pertinence of that statement. For the language designer it may well be true; for the users of the language I'm not completely sold. There are two languages that imho fit that description very well; they are both very powerful and extremely elegant, but boy do they suck when you just want to get shit done: Lisp and Forth. And (I know, one should not start a sentence with "and") as has been mentioned often in this forum, it is this sheer flexibility that stands in the way of wider adoption, as there are so many ways of doing things, but none of them completely right.
Your AA example is a good one. If AAs weren't in the language, we would have the same issue as in any other language that defines them in libraries: several different, incompatible, restricted implementations.

Just my uninformed 2 cents.
September 29, 2020
On Tuesday, 29 September 2020 at 12:02:21 UTC, Patrick Schluter wrote:
> Sorry not to be completely convinced of the pertinence of that statement. For the language designer it may well be true; for the users of the language I'm not completely sold. There are two languages that imho fit that description very well; they are both very powerful and extremely elegant, but boy do they suck when you just want to get shit done: Lisp and Forth.

Most research languages try to fit that description, but they rarely reach production. In essence, though, you are right. The key problem isn't functional but syntactical. When I wrote in Beta I could do everything as easily as in most imperative languages, but everything looked the same. You end up looking at a landscape of pebbles in various colours, which makes the landscape less legible. Compare that to a landscape with roads, lakes and mountains.

Template programming suffers from the same problem. You can create fancy semantics, but the syntax becomes very hard on the user. It leads to less legible code, as code that represents different concepts doesn't stand out.

You can say the same about some C code bases.

So template programming is fun, but it will suck until there is a language that finds a good way to provide syntax extensions.

> Your AA example is a good one. If it wasn't in the language, it would be the same issue with AA as in any other language that defines it in libraries: several different incompatible restricted implementations.

There are at least two ways to generalize AAs: you can either have a language mechanism for serialization, or mandate that all key types provide a hash function and identity comparison (equality). I don't see how a compiler built-in is qualitatively different from a standard library implementation in the general case.
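The second approach is effectively what D's built-in AAs rely on today: a user-defined key type supplies `toHash` and `opEquals`, and any hash-table implementation can use them. A minimal sketch (the `Point` type is my own invention):

```d
struct Point
{
    int x, y;

    // AA keys need a @safe/@trusted, nothrow, const toHash ...
    size_t toHash() const @safe nothrow
    {
        return x * 31 + y;
    }

    // ... and a matching equality test.
    bool opEquals(const Point rhs) const @safe nothrow
    {
        return x == rhs.x && y == rhs.y;
    }
}

void main()
{
    string[Point] names;           // built-in AA using the hooks above
    names[Point(1, 2)] = "corner";
    assert(Point(1, 2) in names);
}
```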

September 29, 2020
On Tuesday, 29 September 2020 at 03:14:34 UTC, Andrei Alexandrescu wrote:
> I'd rather live without that DIP.
>
> So I'm interested in exploring reification mainly - overwhelmingly - because it already works in the existing language. To the extent it has difficulties, it's because of undue limitations in CTFE and introspection. And these are problems we must fix anyway. So it's all a virtuous circle. Bringing myself to write `reify` and `dereify` around my types is a very small price to pay for all that.

Alright mate,
Let's assume we had the minimal language.
Very small core.
All features are library.

You end up with LISP.
And what about optimization? Optimization, in case you don't know, is nothing but a semantics-preserving transform: it rests on proving that the observed semantics are actually invariant under the rewrite.
Let's take:
int mul(
  int x,
  int y
) {
  int result = 0;
  for(int i = 0; i < x; i++)
  {
    result += y;
  }
  return result;
}

A sufficiently advanced optimizer should be able to turn most calls to `mul` into a single `imul` instruction.

If that is the case, there is no need to burden the language with a `*` operator, and you could make the parser much simpler.

And indeed, LDC with -O3 generates the following sequence for the function:

	imull	%esi, %edi // y *= x
	xorl	%eax, %eax // result = 0
	testl	%esi, %esi // if (y)
	cmovgl	%edi, %eax // if (x > 0) result = x;
	retq               // return result;

There are a few superfluous checks, for example the loop-entrance condition (x > 0).
So how does this compare if we write a function that only does `*`?

int mul(
  int x,
  int y
) {
  return x*y;
}

The code we get, from the same compiler and the same compiler flags (-O3) is this:

	movl	%edi, %eax // result = x;
	imull	%esi, %eax // result *= y;
	retq               // return result;

This removes one test and one xor; furthermore, one move becomes unconditional.
(That reduces code size and the load on the branch predictor.)


September 29, 2020
On Tuesday, 29 September 2020 at 03:14:34 UTC, Andrei Alexandrescu wrote:
> On 9/28/20 6:46 PM, Bruce Carneal wrote:
>> [...]
>
> Years ago, I was in a panel with Simon Peyton-Jones, the creator of Haskell. Nicest guy around, not to mention an amazing researcher. This question was asked: "What is the most important principle of programming language design?"
>
> His answer was so powerful, it made me literally forget everybody else's answer, including my own. (And there were no slouches in that panel, save for me: Martin Odersky, Guido van Rossum, Bertrand Meyer.) He said the following:
>
> "The most important principle in a language design is to define a small core of essential primitives. All necessary syntactic sugar lowers to core constructs. Do everything else in libraries."
>
> (He didn't use the actual term "lowering", which is familiar to our community, but rather something equivalent such as "reduces".)
>
> That kind of killed the panel, in a good way. Because most other questions on programming language design and implementation simply made his point shine quietly. Oh yes, if you had a small core and build on it you could do this easily. And that. And the other. In a wonderfully meta way, all questions got instantly lowered to simpler versions of themselves. I will never forget that experience.
>
> D breaks that principle in several places. It has had a cavalier attitude to using magic tricks in the compiler to get things going, at the expense of fuzzy corners and odd limit cases. Look at hashtables. Nobody can create an equivalent user-defined type, and worse, nobody knows exactly why. (I recall vaguely it has something to do, among many other things, with qualified keys that are statically-sized arrays. Hacks in the compiler make those work, but D's own type system rejects the equivalent code. So quite literally D's type system cannot verify its own capabilities.)
>
> Or take implicit conversions. They aren't fully documented, and the only way to figure things out is to read most of the 7193 lines of https://github.com/dlang/dmd/blob/master/src/dmd/mtype.d. That's not a small core with a little sugar on top.
>
> Or take the foreach statement. Through painful trial and error, somebody figured out all possible shapes of foreach, and defined `each` to support most:
>
> https://github.com/dlang/phobos/blob/master/std/algorithm/iteration.d#L904
>
> What should have been a simple forwarding problem took 190 lines that could be characterized as very involved. And mind you, it doesn't capture all cases because per https://github.com/dlang/phobos/blob/master/std/algorithm/iteration.d#L1073:
>
> // opApply with >2 parameters. count the delegate args.
> // only works if it is not templated (otherwise we cannot count the args)
>
> I know that stuff (which will probably end on my forehead) because I went and attempted to improve things a bit in https://github.com/dlang/phobos/pull/7638/files#diff-0d6463fc6f41c5fb774300832e3135f5R805, which attempts to simplify matters by reducing the foreach cases to seven shapes.
>
> To paraphrase Alan Perlis: "If your foreach has seven possible shapes, you probably missed some."
>
> Over time, things did get better. We became adept of things such as lowering, and we require precision in DIPs.
>
> I very strongly believe that this complexity, if unchecked, will kill the D language. It will sink under its own weight. We need a precise, clear definition of the language, we need a principled approach to defining and extending features. The only right way to extend D at this point is to remove the odd failure modes created by said cavalier approach to doing things. Why can't I access typeid during compilation, when I can access other pointers? Turns out typeid is a palimpsest that has been written over so many times, even Walter cannot understand it. He told me plainly to forget about trying to use typeid and write equivalent code from scratch instead. That code has achieved lifetime employment.
>
> And no compiler tricks. I am very, very opposed to Walter's penchant to reach into his bag of tricks whenever a difficult problem comes about. Stop messing with the language! My discussions with him on such matters are like trying to talk a gambler out of the casino.
>
> That is a long way to say I am overwhelmingly in favor of in-language solutions as opposed to yet another addition to the language. To me, if I have an in-language solution, it's game, set, and match. No need for language changes, thank you very much. An in-language solution doesn't only mean no addition is needed, but more importantly it means that the language has sufficient power to offer D coders means to solve that and many other problems.
>
> I almost skipped a heartbeat when Adam mentioned - mid-sentence! - just yet another little language addition to go with that DIP for interpolated strings. It would make things nicer. You know what? I think I'd rather live without that DIP.
>
> So I'm interested in exploring reification mainly - overwhelmingly - because it already works in the existing language. To the extent it has difficulties, it's because of undue limitations in CTFE and introspection. And these are problems we must fix anyway. So it's all a virtuous circle. Bringing myself to write `reify` and `dereify` around my types is a very small price to pay for all that.

My thanks for the effort apparent in your composition.  The main lesson I take from it is that language primitives should be chosen with as much light as we can bring to bear.

I believe that language additions can be used to rebase the language on a better set of primitives.  I believe that there is more to learn about metaprogramming and the language features that support it, and that the D platform may be the best place to learn it.  Finally, I believe that a postulated ideal of "a small set of primitives" will always look better than any actual set derived from experience on the frontier.

Perhaps because of your strong stand against additions generally, you've not taken a serious look at type functions in particular.  I say this because I don't have another way to understand your writings on the reify/dereify design.

If you elect to investigate type functions, I would very much like to hear your opinion.  I see type functions as an "it just works", "rebasing" addition that lets us displace inappropriate use of templates now, and that may illuminate additional simplification opportunities with use.

If you do not elect to investigate type functions, I expect we'll next interact on a different topic.

Again, my thanks for the thoughtful composition.

September 29, 2020
On Tuesday, 29 September 2020 at 12:38:42 UTC, Stefan Koch wrote:
> [continuing]

So we had this:

> int mul(
>   int x,
>   int y
> ) {
>   int result = 0;
>   for(int i = 0; i < x; i++)
>   {
>     result += y;
>   }
>   return result;
> }


And the code generated was

	imull	%esi, %edi // y *= x
	xorl	%eax, %eax // result = 0
	testl	%esi, %esi // if (y)
	cmovgl	%edi, %eax // if (x > 0) result = x;
	retq               // return result;

At first glance these instructions seem superfluous, but we have to understand that the compiler had no way of knowing that we actually wanted a mere multiply.

For example there is the loop entrance check.
Why is it there?

We can prove its necessity quite easily: assume we did not have it. Then we would be allowed to enter the loop body even when (x == 0), so we would execute `result += y;` once, which would make mul(0, 32) return 32. And that would be a violation of the semantics.

So there's no way to get rid of the check.
And this code will always perform worse than an intrinsic 'mul' which the compiler knows about.
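The semantics the check protects can be pinned down with a quick sanity test of the loop version from above:

```d
int mul(int x, int y)
{
    int result = 0;
    for (int i = 0; i < x; i++)
        result += y;
    return result;
}

void main()
{
    assert(mul(0, 32) == 0);  // the loop-entrance check guarantees this
    assert(mul(3, 4) == 12);
    // Note the asymmetry the optimizer must also respect:
    assert(mul(-1, 5) == 0);  // negative x never enters the loop
}
```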
September 29, 2020
On 9/29/20 8:02 AM, Patrick Schluter wrote:
> Your AA example is a good one. If it wasn't in the language, it would be the same issue with AA as in any other language that defines it in libraries: several different incompatible restricted implementations.

AA is not that far from being a library type.

If you discount the magic casting that AAs can do (which isn't all that useful), there is one (yes, ONE) thing AAs do that a library type cannot: implicitly adding an element in certain cases. The reason AAs can do it is that the compiler calls a different indexing hook (it's not called opIndex, but some extern(C) runtime function) when it's in this mode, but only for AAs.

In fact, AAs are mostly already a library type, thanks to a lot of incremental effort over the years to lower the implementation to templates. Except for this one thing.
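The "implicit adding" case in question is the magic behind expressions like these (a small illustration; the runtime hook it lowers to is internal):

```d
void main()
{
    int[string] counts;

    // A plain read of a missing key throws a RangeError at runtime,
    // but a read-modify-write implicitly inserts the .init value first:
    counts["d"]++;        // inserts "d" with 0, then increments
    counts["lang"] += 5;  // same: inserts 0, then adds 5

    assert(counts["d"] == 1);
    assert(counts["lang"] == 5);
}
```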

-Steve
September 29, 2020
On Tuesday, 29 September 2020 at 10:03:12 UTC, Ola Fosheim Grøstad wrote:
> On Tuesday, 29 September 2020 at 09:07:39 UTC, claptrap wrote:
>> How do you decide what is an essential language primitive? Or what is necessary syntactic sugar? I mean syntactic sugar almost by definition isnt necessary, it's just window dressing.
>
> The for loop and the while loop is syntactic sugar in C-like languages.
> So clearly syntactic sugar is more than window dressing.

OK, yeah, bad analogy. My point is that saying "only what syntactic sugar is necessary" doesn't really help, because what does "necessary" mean? Taken literally, you don't need for and while loops because you have if and goto. But pretty much every language has them, so obviously there's more to it.

It's a great quote, and you can see the wisdom in it, but there's a lot lurking under the surface of the words "essential" and "necessary".