June 10, 2013
Very interesting stuff. Couple of thoughts:

1. Does the windows unittest build an executable with all unittests, as it used to? Then it should probably switch to the Posix approach (one executable per tested module).

2. It seems a clean cut of templates is emerging:

(a) For strictly type-related stuff, prefer templates.

(b) For all other compile-time work, prefer CTFE.

It used to be a matter of preference, but I think practical reasons suggest that straight CTFE should be the natural and recommended approach to doing a variety of things during compilation.

3. It seems a new tactical approach to writing libraries is emerging: in template-heavy libraries, most or all top-level imports should be pushed down into the templates using them. That way the cost of importing e.g. levenshteinDistance() if all you need is copy() is just the cost of parsing (small). Only with actual use of templates will the necessary modules be imported. I will discuss it along with a related enhancement request in a DIP.


Andrei

On 6/10/13 5:25 AM, Martin Nowak wrote:
> On 06/10/2013 08:29 AM, Don Clugston wrote:
>> Yeah, my point was that for N tests, we get O(N^^2) templates
>> instantiated.
> How is that?
>
>  > Wow! (How many of those are unique, rather than reusing an existing
> instantiation?)
>
> The number of instances is also a huge performance issue when searching
> for existing instantiations, because search is done linear and comparing
> two TemplateInstance uses the costly arrayObjectMatch on the template
> arguments. So in average instantiating the same template N times uses
> O(N^^2 / 2) comparisons.
> I once improved that by computing the mangling before the search and
> using it as a key.
> That brought down the compilation time of std.range with unittests from
> 10 to 2 seconds.
> There is at least one blocker though
> http://d.puremagic.com/issues/show_bug.cgi?id=7469.
>
>  > Error: out of memory
>  > This has happened many times before, and we dealt with it by reducing
> the number of modules we compiled into each object file. We once had 30
> modules per obj file. Then fifteen. Then five. But now we're at one,
> that workaround can no longer be used.
>
> I ran into some issue when preparing the phobos unittests for shared
> libraries.
> If you compile the whole library at once with unittests enabled it
> consumes ~6GB of RAM.
> In order to find a good partition I took up the idea of clustering (
> http://d.puremagic.com/issues/show_bug.cgi?id=9673#c2,
> https://gist.github.com/dawgfoto/5747405). Well, at least clustering by
> imports did only find a few useful partitions because almost every
> module ends up importing std.range or std.algorithm. It might be
> interesting to do some analysis for the template instances used in each
> module.
>
> _______________________________________________
> dmd-internals mailing list
> dmd-internals@puremagic.com
> http://lists.puremagic.com/mailman/listinfo/dmd-internals
_______________________________________________
dmd-internals mailing list
dmd-internals@puremagic.com
http://lists.puremagic.com/mailman/listinfo/dmd-internals

June 10, 2013
On 6/9/2013 11:29 PM, Don Clugston wrote:
>
>
> I'm not sure. That's the number of calls to the constructor of TemplateInstance. I don't understand the code well enough to know
> if it can eventually gets merged with an existing TemplateInstance.
> If so, then perhaps there's something we could do to prevent them from getting created in the first place if they are duplicates.
>

The template instantiation code is short-circuited if the instantiation already exists. (It's a bad bug if this is broken.)
_______________________________________________
dmd-internals mailing list
dmd-internals@puremagic.com
http://lists.puremagic.com/mailman/listinfo/dmd-internals

June 10, 2013
On Monday, June 10, 2013 12:15:58 Don Clugston wrote:
> If we could split those things out into a 'leaf' module, probably most of the import graph complexity would disappear. I'm sure most of the cyclic imports are false dependencies.

The possibility of splitting out all traits (not just those in std.traits) into a separate module hierarchy has been brought up and discussed to some extent, but we haven't really decided on anything yet. But putting all of the traits in a separate hierarchy would allow us to hugely reduce the dependency hierarchy, because in a lot of cases, they're the reason that a module pulls in as many dependencies as it does, and when the dependency is for a trait used in a template constraint, the import can't be localized.

So, if we want to reduce module interdependence, we really probably should look at splitting out the traits as well as possibly some key items that we know most everything ends up needing (like the range primitives for arrays or the formatting and conversion stuff).

- Jonathan M Davis
_______________________________________________
dmd-internals mailing list
dmd-internals@puremagic.com
http://lists.puremagic.com/mailman/listinfo/dmd-internals

June 10, 2013
On 6/10/2013 2:25 AM, Martin Nowak wrote:
>
>
> The number of instances is also a huge performance issue when searching for existing instantiations, because search is done linear and comparing two TemplateInstance uses the costly arrayObjectMatch on the template arguments. So in average instantiating the same template N times uses O(N^^2 / 2) comparisons.

One idea to deal with that is to find a way to order the list, so searching could be binary search time instead of linear.
_______________________________________________
dmd-internals mailing list
dmd-internals@puremagic.com
http://lists.puremagic.com/mailman/listinfo/dmd-internals

June 11, 2013
Can't we solve this by exploding the modules into packages and only importing the sub-modules?  This fixes the ball-of-mud problem.

If we break up std.algorithm we break up the memory consumption...
 although IIRC the huge number of templates comes from exhaustive
instantiation in several test cases.


On Tue, Jun 11, 2013 at 6:01 AM, Walter Bright <walter@digitalmars.com>wrote:

>
> On 6/10/2013 2:25 AM, Martin Nowak wrote:
>
>>
>>
>> The number of instances is also a huge performance issue when searching for existing instantiations, because search is done linear and comparing two TemplateInstance uses the costly arrayObjectMatch on the template arguments. So in average instantiating the same template N times uses O(N^^2 / 2) comparisons.
>>
>
> One idea to deal with that is to find a way to order the list, so searching could be binary search time instead of linear.
>
> ______________________________**_________________
> dmd-internals mailing list
> dmd-internals@puremagic.com
> http://lists.puremagic.com/**mailman/listinfo/dmd-internals<http://lists.puremagic.com/mailman/listinfo/dmd-internals>
>


June 10, 2013
Don -- did you try to patch the code with https://github.com/D-Programming-Language/phobos/pull/1339?

Thanks,

Andrei

On 6/9/13 4:32 PM, Don Clugston wrote:
>  > Well, impossible is a rather strong word, considering that the win32
> auto-tester seems to be doing it's job successfully.
>
> It is impossible on my system. It will soon be impossible on all systems.
>  From what I've seen, the autotester is not "doing its job
> successfully". It's been failing intermittently over the past week. I
> believe it is right on the edge, right now. It only takes a very small
> change to the compiler to push it over.
> Depending on the system you're on, it can go off the cliff a bit
> earlier, but it's still inevitable.
> There's nothing new, in the sense that it's been getting steadily worse
> for years. We've to split it the Win32 build, multiple times. That
> solution no longer works.
>
>  > There's approximately 10M of source for druntime+phobos and some how
> it can't fit in a couple _gigs_ of ram?  And looking at std.algorithm,
> that's a mere 342k of source code.  Yes, it grows, but 4 orders of
> magnitude?
>
>
> Yeah. 4 orders of magnitude. Our codebase at sociomantic is a bit larger
> than Phobos + druntime, but it compiles in just a few seconds.
> The problem is, that because of templates, the memory consumption isn't
> linear with source size.
>
> dmd -unittest -o- std/algorithm
>
> instantiates 344150 templates. Yes 344K. More than a third of a million.
> More than the number of lines of source in the module.
> And yet there are only 1305 asserts in that module -- the tests are not
> particularly comprehensive.
>
> Improved memory management would not change the number of template
> instantiations that happen, or even how much memory is used, it would
> just change how good it is at collecting and reclaiming the memory.
>
> The irony with this is, I've run into this problem while trying to
> reduce the memory usage of CTFE.
>
> And I'm stuck. The autotester reports passes on all systems except
> win32. On Win32,  after five minutes, it reports:
>
> unittest
> std.exception.ErrnoException@std\stdio.d(1390):  (No error)
> ----------------
> 0x004AA84F in pure @safe int std.exception.errnoEnforce!(int,"std\stdio.d", 1390u).errnoEnforce(int, lazy immutable(char)[])
> 0x00A9E8DE in void std.stdio.File.writeln!(immutable(char)[], immutable(uint)).writeln(immutable(char)[], immutable(uint))
> 0x00AB8918 in void std.parallelism.__modtest()
> 0x00B064BB in int rt.minfo.moduleinfos_apply(scope int delegate(ref object.ModuleInfo*)).int __foreachbody555(ref rt.sections_win32.SectionGroup)
> 0x00B03665 in _d_run_main
> 0x00AF8AB8 in main
> 0x76BFD2E9 in BaseThreadInitThunk
> 0x76E91603 in RtlInitializeExceptionChain
> 0x76E915D6 in RtlInitializeExceptionChain
> 'qwertyuiop09813478'  is not recognized as an internal or external command,
> operable program or batch file.
>
> and I can't even compile Phobos on my win32 system, so I can't reduce this.
> For me, this is a blocker. I really don't know what to do.
>
>
> @David:
>  > […] Every import has a cost, and that cost is far from zero.
>  > Are imports really the primary cause for this? Might want to
> (double)check this before jumping to conclusions.
>
> No, the imports are the cause of the "big ball of mud" design of Phobos,
> but they're not the cause of the memory usage. It's the sheer number of
> template instantiations which seems to be the primary problem.
>
>
>
> On 8 June 2013 20:32, Brad Roberts <braddr@puremagic.com
> <mailto:braddr@puremagic.com>> wrote:
>
>     Well, impossible is a rather strong word, considering that the win32
>     auto-tester seems to be doing it's job successfully.
>
>     I _do_ consider it a compiler memory management issue.  There's no
>     reason that the entirety of phobos (or pretty much any app) ought to
>     be compilable in one shot.  There's approximately 10M of source for
>     druntime+phobos and some how it can't fit in a couple _gigs_ of ram?
>       And looking at std.algorithm, that's a mere 342k of source code.
>       Yes, it grows, but 4 orders of magnitude?
>
>     As to what to do, how much memory do you have and are you using the
>     snn.lib update that Walter released a few years ago that fixed the
>     memory allocator in it to not suck so bad?
>
>        md5: 9357508e541067ea34056dade4381d__d8 dmc/dm/lib/snn.lib
>
>
>     On 6/8/13 11:12 AM, Don Clugston wrote:
>
>         With win32, I can no longer run unittests:
>         ------------------
>         dmd -O -w -d -property -L/co -c -unittest -ofunittest5.obj
>         std\algorithm.d
>         Error: out of memory
>         ------------------
>         This has happened many times before, and we dealt with it by
>         reducing the number of modules we
>         compiled into each object file.  We once had 30 modules per obj
>         file. Then fifteen. Then five. But
>         now we're at one, that workaround can no longer be used.
>         The idea that we can continue to throw billions of templates and
>         imports into every Phobos module,
>         is leading us towards catastrophe.
>
>         Sure, you can say, the compiler should improve its memory
>         management. But I don't think that's
>         really the problem. If it was better, then compiler might not
>         run out of memory, but it would run
>         unusably slowly. I think most people have no idea of just how
>         many templates the compiler is being
>         asked to instantiate.
>         Every import has a cost, and that cost is far from zero.
>
>         I'm stuck, and I don't know what to do.
>
>
>         _________________________________________________
>         dmd-internals mailing list
>         dmd-internals@puremagic.com <mailto:dmd-internals@puremagic.com>
>         http://lists.puremagic.com/__mailman/listinfo/dmd-internals
>         <http://lists.puremagic.com/mailman/listinfo/dmd-internals>
>
>
>     _________________________________________________
>     dmd-internals mailing list
>     dmd-internals@puremagic.com <mailto:dmd-internals@puremagic.com>
>     http://lists.puremagic.com/__mailman/listinfo/dmd-internals
>     <http://lists.puremagic.com/mailman/listinfo/dmd-internals>
>
>
>
>
> _______________________________________________
> dmd-internals mailing list
> dmd-internals@puremagic.com
> http://lists.puremagic.com/mailman/listinfo/dmd-internals
_______________________________________________
dmd-internals mailing list
dmd-internals@puremagic.com
http://lists.puremagic.com/mailman/listinfo/dmd-internals

June 10, 2013
On 6/10/13 2:29 AM, Don Clugston wrote:
> It seems hard to believe there would be enough types to instantiate
> isNarrowString thousands of different times.

This would be a good starting point in diagnosing the matter. True, isRandomString is instantiated with many types because it's part of template constraints that ultimately fail. But then, thousands sounds a bit much.

Andrei

_______________________________________________
dmd-internals mailing list
dmd-internals@puremagic.com
http://lists.puremagic.com/mailman/listinfo/dmd-internals

June 10, 2013
On Monday, June 10, 2013 21:24:50 Andrei Alexandrescu wrote:
> On 6/10/13 2:29 AM, Don Clugston wrote:
> > It seems hard to believe there would be enough types to instantiate isNarrowString thousands of different times.
> 
> This would be a good starting point in diagnosing the matter. True, isRandomString is instantiated with many types because it's part of template constraints that ultimately fail. But then, thousands sounds a bit much.

Given how many ranges wrap other ranges and how that sort of thing is going to show up in template constraints and static ifs all over the place, I'm not sure that thousands is all that unlikely when you start unit testing - especially when you're trying to test every instantiation path (as we should be doing). Hopefully, there's some sort of bug here or something that we can optimize, but it wouldn't entirely surprise me if all of those instantiations were legit.

- Jonathan M Davis
_______________________________________________
dmd-internals mailing list
dmd-internals@puremagic.com
http://lists.puremagic.com/mailman/listinfo/dmd-internals

June 10, 2013
On 6/10/2013 2:25 AM, Martin Nowak wrote:
>
> The number of instances is also a huge performance issue when searching for existing instantiations, because search is done linear and comparing two TemplateInstance uses the costly arrayObjectMatch on the template arguments.

Once I finish with 2.063.2, I'll do some profiling and see about improving that.

I think arrayObjectMatch could back a hashed lookup. With the heavy template usage D is relying on, this will be a necessity.
_______________________________________________
dmd-internals mailing list
dmd-internals@puremagic.com
http://lists.puremagic.com/mailman/listinfo/dmd-internals

June 11, 2013
One area where template bloat can be reduced is using cannonical symbol for alias parater forwarding. Let me explain with a sample of code :

template A(alias foo) {}
template B(alias foo) {
    alias afoo = A!foo;
}

B!symbol;

In this case, A is instanciated with B!symbol.foo as alias parameter, not symbol. This cause a huge amunt of useless template instanciation when compiling SDC (and can somtime cause infinite template instanciation that crash the compiler).


2013/6/10 Don Clugston <dclugston@gmail.com>

>
> On 10 June 2013 02:28, Walter Bright <walter@digitalmars.com> wrote:
>
>>
>> On 6/9/2013 1:32 PM, Don Clugston wrote:
>>
>>>
>>> Yeah. 4 orders of magnitude. Our codebase at sociomantic is a bit larger
>>> than Phobos + druntime, but it compiles in just a few seconds.
>>> The problem is, that because of templates, the memory consumption isn't
>>> linear with source size.
>>>
>>> dmd -unittest -o- std/algorithm
>>>
>>> instantiates 344150 templates. Yes 344K. More than a third of a million.
>>>
>>
>> Wow! (How many of those are unique, rather than reusing an existing
>> instantiation?)
>
>
> I'm not sure. That's the number of calls to the constructor of
> TemplateInstance. I don't understand the code well enough to know
> if it can eventually gets merged with an existing TemplateInstance.
> If so, then perhaps there's something we could do to prevent them from
> getting created in the first place if they are duplicates.
> Certainly there are a huge number of instantiations of things like:
> hasLength, isNarrowString, isForwardRange
> It seems hard to believe there would be enough types to instantiate
> isNarrowString thousands of different times.
>
>
>  More than the number of lines of source in the module.
>> And yet there are only 1305 asserts in that module -- the tests are not particularly comprehensive.
>>
>>
> -cov shows 96% coverage for std.algorithm
>
>
> Yeah, my point was that for N tests, we get O(N^^2) templates instantiated.
>
>
> _______________________________________________
> dmd-internals mailing list
> dmd-internals@puremagic.com
> http://lists.puremagic.com/mailman/listinfo/dmd-internals
>