May 29, 2020
On Friday, 29 May 2020 at 16:57:38 UTC, Bruce Carneal wrote:
> On Friday, 29 May 2020 at 16:04:06 UTC, Stefan Koch wrote:
>> On Friday, 29 May 2020 at 15:17:55 UTC, Ben Jones wrote:
>>> [...]
>>
>> Those details would fill a few pages I am afraid.
>>
>> I'd like to talk about what has to happen rather than how it happens.
>> That way it's a bit like reading the summary of a horror movie rather than watching the whole thing.
>
> I would love to see a blog on this.
>
> I would also love to see a follow on blog, if you have time, on the performance gains you're realizing in the area.  Your recent posts in other threads are very exciting.

Any questions or suggestions are appreciated.
May 29, 2020
On Friday, 29 May 2020 at 11:13:17 UTC, Stefan Koch wrote:
> However:
> We have not covered how names are resolved in scopes;
> Nor the intricacies of how to pick the right overload for a template.
> And which implicit conversions happen for that;
> Neither have we talked about deferred resolution of templates or
> which steps semantic analysis has to take.

Don't forget to discuss template memoization. As I'm sure you're well aware, it has massive performance consequences, both good and bad, as well as being semantically essential to some uses of templates.
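
As a hypothetical illustration of that caching behavior (the names here are mine, not from the thread), pragma(msg) makes the memoization directly observable: it fires once per distinct argument, and a repeated instantiation is served from the cache.

```d
import std.stdio;

// Each distinct argument builds one instance; repeating an argument
// reuses the cached instance, so the pragma(msg) fires only once per n.
template Twice(int n)
{
    pragma(msg, "instantiating Twice with ", n);
    enum Twice = n * 2;
}

void main()
{
    writeln(Twice!21); // instantiates: compile-time message is emitted
    writeln(Twice!21); // memoized: no second message
    writeln(Twice!5);  // new argument: instantiated (and cached) again
}
```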

Greenspun's tenth rule of programming states: "Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp." D very nearly fulfills this law (a tongue-in-cheek version of the "inner-platform effect") through its template system, except that it's probably not fair to call it "informally-specified".

I argue that the root causes of the terrible compile-time performance, and especially memory consumption, of D's template system are forced memoization of intermediate results, and the lack of the tail recursion optimization. Who ever heard of a pure functional language without tail recursion optimization, or *any* programming language with inescapable memoization?

There is no good reason to assign a long, permanent symbol to each and every partially sorted scrap of an AliasSeq that is generated in the midst of a compile-time sort, instead of just garbage collecting them. But my understanding is currently those intermediate results just stick around forever, massively raising the asymptotic memory complexity of otherwise benign algorithms.
May 29, 2020
On Friday, 29 May 2020 at 19:46:04 UTC, tsbockman wrote:
> On Friday, 29 May 2020 at 11:13:17 UTC, Stefan Koch wrote:
>> However:
>> We have not covered how names are resolved in scopes;
>> Nor the intricacies of how to pick the right overload for a template.
>> And which implicit conversions happen for that;
>> Neither have we talked about deferred resolution of templates or
>> which steps semantic analysis has to take.
>
> Don't forget to discuss template memoization. As I'm sure you're well aware, it has massive performance consequences, both good and bad, as well as being semantically essential to some uses of templates.
>
> Greenspun's tenth rule of programming states: "Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp." D very nearly fulfills this law (a tongue-in-cheek version of the "inner-platform effect") through its template system, except that it's probably not fair to call it "informally-specified".
>
> I argue that the root causes of the terrible compile-time performance, and especially memory consumption, of D's template system are forced memoization of intermediate results, and the lack of the tail recursion optimization. Who ever heard of a pure functional language without tail recursion optimization, or *any* programming language with inescapable memoization?
>
> There is no good reason to assign a long, permanent symbol to each and every partially sorted scrap of an AliasSeq that is generated in the midst of a compile-time sort, instead of just garbage collecting them. But my understanding is currently those intermediate results just stick around forever, massively raising the asymptotic memory complexity of otherwise benign algorithms.

Without memoization some things would be completely impractical to do.
Then again, ignoring the implementation in dmd right now:
there is no fundamental reason why anything has to be cached in a purely functional system,
since computing it again should yield the same values.

As it stands the cache only does good.
It is not the slow part.
Creating tons of AST nodes, however, is.

For practical reasons dmd does not free intermediate results; that would be the case
even if the template instances were not cached.
May 29, 2020
On Friday, 29 May 2020 at 19:17:26 UTC, Stefan Koch wrote:
> On Friday, 29 May 2020 at 16:57:38 UTC, Bruce Carneal wrote:
>> On Friday, 29 May 2020 at 16:04:06 UTC, Stefan Koch wrote:
>>> On Friday, 29 May 2020 at 15:17:55 UTC, Ben Jones wrote:
>>>> [...]
>>>
>>> Those details would fill a few pages I am afraid.
>>>
>>> I'd like to talk about what has to happen rather than how it happens.
>>> That way it's a bit like reading the summary of a horror movie rather than watching the whole thing.
>>
>> I would love to see a blog on this.
>>
>> I would also love to see a follow on blog, if you have time, on the performance gains you're realizing in the area.  Your recent posts in other threads are very exciting.
>
> Any questions or suggestions are appreciated.

I!"boo" -> { alias I = "boo"; }.I?

Why do we still have to alias a = Alias!"boo"?
May 29, 2020
On Friday, 29 May 2020 at 21:18:42 UTC, Max Samukha wrote:
> On Friday, 29 May 2020 at 19:17:26 UTC, Stefan Koch wrote:
>> On Friday, 29 May 2020 at 16:57:38 UTC, Bruce Carneal wrote:
>>> On Friday, 29 May 2020 at 16:04:06 UTC, Stefan Koch wrote:
>>>> On Friday, 29 May 2020 at 15:17:55 UTC, Ben Jones wrote:
>>>>> [...]
>>>>
>>>> Those details would fill a few pages I am afraid.
>>>>
>>>> I'd like to talk about what has to happen rather than how it happens.
>>>> That way it's a bit like reading the summary of a horror movie rather than watching the whole thing.
>>>
>>> I would love to see a blog on this.
>>>
>>> I would also love to see a follow on blog, if you have time, on the performance gains you're realizing in the area.  Your recent posts in other threads are very exciting.
>>
>> Any questions or suggestions are appreciated.
>
> I!"boo" -> { alias I = "boo"; }.I?
>
> Why do we still have to alias a = Alias!"boo"?

You are correct.
That is what will happen, and the instance of I will alias to the string "boo".

I might be wrong, but I believe the reason we have to do this is a parser issue.
The parser expects a type after alias {ident} = .
A template instance will parse as a type, while the string "boo" does not.
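
A sketch of the workaround being discussed, modeled on std.meta.Alias (the simplified single-overload definition here is mine):

```d
// `alias s = "boo";` is rejected: the grammar wants a symbol or type
// on the right-hand side of =, not an arbitrary value expression.
// Wrapping the value in an eponymous template sidesteps that, because
// a template instantiation does parse in that position.
alias Alias(alias a) = a;  // simplified shape of std.meta.Alias

alias s = Alias!"boo";     // OK: s now refers to the string literal
static assert(s == "boo");
```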
May 29, 2020
On Friday, 29 May 2020 at 19:57:26 UTC, Stefan Koch wrote:
> Without memoization some things would be completely impractical to do.

Yes, but does that mean that *everything* needs to be memoized? In other programming languages (and that's all template D really is, a programming language), memoization is used as needed, not *all the time*.

> Then again ignoring the implementation in dmd right now;
> There is no fundamental reason why anything has to be cached in a purely functional system.
> since computing it again should yield the same values.
>
> As it stands the cache does only do good.
> It is not the slow part.
> Creating tons of ast nodes is however.

A single instantiation of std.meta.staticSort with a few hundred items requires well over 100 MB of memory which will not be freed until the compiler exits. It seems like it's not a problem only because it's impossible to write meta-programs that process non-trivial amounts of data, so people just don't.

(I would test with more items, but the compiler crashes due to hitting the template recursion limit - which is quite silly, given that the algorithm used, merge sort, only requires O(log(n)) non-tail levels in other programming languages.)
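
For contrast, a minimal sketch (my own illustration, not code from the thread) of the same job done as an ordinary CTFE function, where the intermediate arrays are plain values rather than permanently memoized template instances:

```d
// Sorting via a regular function evaluated at compile time:
// no template instance is created per intermediate result.
int[] ctfeSort(int[] a) pure
{
    import std.algorithm.sorting : sort;
    auto b = a.dup;
    b.sort();
    return b;
}

enum sorted = ctfeSort([3, 1, 2]);
static assert(sorted == [1, 2, 3]);
```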

> For practical reasons dmd does not free intermediate results, that would be the case
> even if the template instances were not cached.

DMD does try to free intermediate results with the "-lowmem" option:

    https://dlang.org/dmd-linux.html#switch-lowmem

Of course, the usefulness of this option is severely limited due to the forced memoization.
May 29, 2020
On Friday, 29 May 2020 at 19:17:26 UTC, Stefan Koch wrote:
> On Friday, 29 May 2020 at 16:57:38 UTC, Bruce Carneal wrote:
>> On Friday, 29 May 2020 at 16:04:06 UTC, Stefan Koch wrote:
>>> On Friday, 29 May 2020 at 15:17:55 UTC, Ben Jones wrote:
>>>> [...]
>>>
>>> Those details would fill a few pages I am afraid.
>>>
>>> I'd like to talk about what has to happen rather than how it happens.
>>> That way it's a bit like reading the summary of a horror movie rather than watching the whole thing.
>>
>> I would love to see a blog on this.
>>
>> I would also love to see a follow on blog, if you have time, on the performance gains you're realizing in the area.  Your recent posts in other threads are very exciting.
>
> Any questions or suggestions are appreciated.

I'd also love to see a blog post, but more in depth. It was very easy to follow; I just wanted more of it.

Maybe a few blog posts that lead into a talk for the virtual DConf later in the year? Compiler internals are one of the most interesting topics, IMO.

May 29, 2020
On Friday, 29 May 2020 at 21:38:20 UTC, tsbockman wrote:
> On Friday, 29 May 2020 at 19:57:26 UTC, Stefan Koch wrote:
>> [...]
>
> Yes, but does that mean that *everything* needs to be memoized? In other programming languages (and that's all template D really is, a programming language), memoization is used as needed, not *all the time*.
>
>> [...]
>
> A single instantiation of std.meta.staticSort with a few hundred items requires well over 100 MB of memory which will not be freed until the compiler exits. It seems like it's not a problem only because it's impossible to write meta-programs that process non-trivial amounts of data, so people just don't.
>
> (I would test with more items, but the compiler crashes due to hitting the template recursion limit - which is quite silly, given that the algorithm used, merge sort, only requires O(log(n)) non-tail levels in other programming languages.)
>
>> [...]
>
> DMD does try to free intermediate results with the "-lowmem" option:
>
>     https://dlang.org/dmd-linux.html#switch-lowmem
>
> Of course, the usefulness of this option is severely limited due to the forced memoization.

I haven't seen staticSort so far.
I shall have a look at it.

The weka fork of ldc has a dynamically configurable recursion limit.
And dmd can be trivially patched to take it as a runtime option as well.
Just in case you do want to see what happens if you hit the _real_ limit ;)
May 29, 2020
On Friday, 29 May 2020 at 21:53:33 UTC, Stefan Koch wrote:
> I haven't seen staticSort so far.
> I shall have a look at it.

Your type function project, newCTFE, and the ability for the compiler to discard intermediate results in CTFE would, combined, enable meta algorithms that scale reasonably, I think. The ability to mark templates (or maybe specific instantiations?) @noMemo to disable memoization would be another approach.

Anyway, here's some test code I wrote:

import std.meta, std.random, std.stdio;

immutable(int[]) randomItems(int n) pure @trusted nothrow {
    assert(__ctfe);

    auto rand = Xorshift(n);
    int[] ret = new int[n];
    for(size_t x = 0; x < n; ++x) {
        ret[x] = rand.front;
        rand.popFront();
    }

    return cast(immutable) ret;
}

struct S(int _x) { enum int x = _x; }  // wraps each int value in a distinct type
enum int X(S) = S.x;                   // unwraps: reads the value back out
template less(A, B) { enum bool less = A.x < B.x; }  // comparison predicate for staticSort
alias boom = staticSort!(less, staticMap!(S, aliasSeqOf!(randomItems(384))));

void main()
{
    writeln([ staticMap!(X, boom) ]);
}

> The weka fork of ldc has a dynamically configure-able recursion limit.
> And dmd can be trivially patched to take it as a runtime option as well.
> Just in case you do want to see what happens if you hit the _real_ limit ;)

I have actually hit the real limit by accident before; that's how I discovered that it's possible to crash my operating system just by burning too much memory in user space. It's also why I now have 32 GB of RAM on my workstation. :-)

For anyone else having trouble with out-of-memory related crashes on Linux, I recommend replacing dmd in your PATH with a script like this to prevent the OS from crashing:

    (using ulimit -v $MAX_MEMORY_IN_KB; /path/to/dmd $@)

May 29, 2020
On Friday, 29 May 2020 at 22:31:54 UTC, tsbockman wrote:
> For anyone else having trouble with out-of-memory related crashes on Linux, I recommend replacing dmd in your PATH with a script like this to prevent the OS from crashing:
>
>     (using ulimit -v $MAX_MEMORY_IN_KB; /path/to/dmd $@)

Oops, that should be:

#! /bin/bash
(ulimit -v $MAX_MEMORY_IN_KB; /path/to/dmd "$@")
