Thread overview
Compiler scalability. Question inspired by OOM errors seen by Crystal language
Aug 30, 2017
Pradeep Gowda
Aug 30, 2017
Nicholas Wilson
Aug 30, 2017
H. S. Teoh
Aug 31, 2017
Stefan Koch
Aug 31, 2017
H. S. Teoh
Aug 31, 2017
Stefan Koch
Aug 30, 2017
H. S. Teoh
August 30, 2017
I'm referring to this thread about Crystal --  https://lobste.rs/s/dyitr0/its_fun_program_crystal_is_it_scalable

> Crystal is strongly typed, but overwhelmingly uses type inference, rather than explicit types. Because Crystal aims to be spiritually—and frequently literally—compatible with Ruby, that’s a problem: to accomplish that, Crystal relies on sometimes-nullable types with implicit structure and implicit unions, such that, frequently, the only way to even begin type inference is to load the entire program’s AST into RAM all at once and then start your massive type inference pass. What you’re seeing in this thread is how a “simple” fix to a YAML parser error reporting hit that problem, causing Crystal to use a critical amount too much RAM and OOM.

How does D compare in this regard, especially in cases where `auto` storage class specifiers are used liberally throughout the code base?
August 30, 2017
On Wednesday, 30 August 2017 at 01:30:30 UTC, Pradeep Gowda wrote:
> I'm referring to this thread about Crystal --  https://lobste.rs/s/dyitr0/its_fun_program_crystal_is_it_scalable
>
>> Crystal is strongly typed, but overwhelmingly uses type inference, rather than explicit types. Because Crystal aims to be spiritually—and frequently literally—compatible with Ruby, that’s a problem: to accomplish that, Crystal relies on sometimes-nullable types with implicit structure and implicit unions, such that, frequently, the only way to even begin type inference is to load the entire program’s AST into RAM all at once and then start your massive type inference pass. What you’re seeing in this thread is how a “simple” fix to a YAML parser error reporting hit that problem, causing Crystal to use a critical amount too much RAM and OOM.
>
> How does D compare in this regard, especially in cases where `auto` storage class specifiers are used liberally throughout the code base?

Auto is not the problem: you can easily figure out the return type of a function that returns a primitive type, and aggregates have to be specified explicitly.

The problem with D is the memory-hogging nature of CTFE and the sheer number of templates that get instantiated when compiling big codebases. Symbol length is also a problem, but that eats your disk space, not your RAM.
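For illustration, a minimal sketch of how cheap that kind of inference is (function names are hypothetical):

```d
// Hypothetical sketch: the return type of `add` is inferred locally
// as `int` from its single return expression; no whole-program
// analysis is needed.
auto add(int a, int b)
{
    return a + b; // typeof(a + b) is int, so `add` returns int
}

void main()
{
    static assert(is(typeof(add(1, 2)) == int)); // resolved at compile time
    assert(add(1, 2) == 3);
}
```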
August 30, 2017
On Wednesday, 30 August 2017 at 02:53:42 UTC, Nicholas Wilson wrote:
> On Wednesday, 30 August 2017 at 01:30:30 UTC, Pradeep Gowda wrote:
>> I'm referring to this thread about Crystal --  https://lobste.rs/s/dyitr0/its_fun_program_crystal_is_it_scalable
>>
>>> Crystal is strongly typed, but overwhelmingly uses type inference, rather than explicit types. Because Crystal aims to be spiritually—and frequently literally—compatible with Ruby, that’s a problem: to accomplish that, Crystal relies on sometimes-nullable types with implicit structure and implicit unions, such that, frequently, the only way to even begin type inference is to load the entire program’s AST into RAM all at once and then start your massive type inference pass. What you’re seeing in this thread is how a “simple” fix to a YAML parser error reporting hit that problem, causing Crystal to use a critical amount too much RAM and OOM.
>>
>> How does D compare in this regard, especially in cases where `auto` storage class specifiers are used liberally throughout the code base?
>
> Auto is not the problem: you can easily figure out the return type of a function that returns a primitive type, and aggregates have to be specified explicitly.
>
> The problem with D is the memory-hogging nature of CTFE and the sheer number of templates that get instantiated when compiling big codebases. Symbol length is also a problem, but that eats your disk space, not your RAM.

Symbols do eat your RAM, because the compiler has to generate them in RAM before writing them to a binary; also, don't forget `pragma(msg, symbol.mangleof)` - the mangled name of each symbol is available for use in CTFE. However, the mangling problems should finally be put to rest, thanks to Rainer Schuetze, now that
https://github.com/dlang/dmd/pull/5855 / https://github.com/dlang/dmd/pull/6998 have been merged.
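A minimal sketch of that (assuming a plain function `square`; D exposes the mangled name as a compile-time string via the `.mangleof` property):

```d
// The mangled name of a symbol is a compile-time string, so it
// occupies compiler memory and can be consumed by CTFE code.
int square(int x) { return x * x; }

enum mangled = square.mangleof; // compile-time, platform-specific string
pragma(msg, mangled);           // emitted during compilation
static assert(mangled.length > 0);
```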
August 30, 2017
On Wednesday, 30 August 2017 at 01:30:30 UTC, Pradeep Gowda wrote:
> I'm referring to this thread about Crystal --  https://lobste.rs/s/dyitr0/its_fun_program_crystal_is_it_scalable
>
>> Crystal is strongly typed, but overwhelmingly uses type inference, rather than explicit types. Because Crystal aims to be spiritually—and frequently literally—compatible with Ruby, that’s a problem: to accomplish that, Crystal relies on sometimes-nullable types with implicit structure and implicit unions, such that, frequently, the only way to even begin type inference is to load the entire program’s AST into RAM all at once and then start your massive type inference pass. What you’re seeing in this thread is how a “simple” fix to a YAML parser error reporting hit that problem, causing Crystal to use a critical amount too much RAM and OOM.
>
> How does D compare in this regard, especially in cases where `auto` storage class specifiers are used liberally throughout the code base?

D supports separate compilation by design, i.e., producing the final executable doesn't require the compiler to load into memory all the source files corresponding to all the object files being linked.
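In practice that might look like the following (a hedged sketch; package names and paths are hypothetical):

```shell
# Compile each package to an object file separately; the compiler only
# needs that package's sources (plus imported declarations) at a time.
dmd -c -ofpkga.o src/pkga/*.d
dmd -c -ofpkgb.o src/pkgb/*.d

# Link the separately compiled objects into the final executable.
dmd -ofapp src/app.d pkga.o pkgb.o
```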
August 30, 2017
On Wed, Aug 30, 2017 at 02:53:42AM +0000, Nicholas Wilson via Digitalmars-d wrote: [...]
> The problem with D is the memory-hogging nature of CTFE and the sheer number of templates that get instantiated when compiling big codebases. Symbol length is also a problem, but that eats your disk space, not your RAM.

The symbol length problem has been greatly reduced by Rainer's recent mangling PR, which is now in git master and will be in the next release. There is still room for further reductions, but they will be comparatively minor. Rainer's fix has eliminated the bulk of the symbol length problem.

The CTFE problem will be fixed soon, once Stefan Koch finishes his newCTFE engine.

Templates are still an area where a lot of improvement can be made. But Stefan has indicated interest in visiting this area in the compiler after the newCTFE project is done.  So eventually we'll get there.


T

-- 
Famous last words: I *think* this will work...
August 30, 2017
On Wed, Aug 30, 2017 at 08:01:17AM +0000, via Digitalmars-d wrote: [...]
> D supports separate compilation by design, i.e., producing the final executable doesn't require the compiler to load into memory all the source files corresponding to all the object files being linked.

Yes, I think the general recommendation for very large codebases is to compile one package at a time, i.e., one subdir at a time.

Nevertheless, compiler memory usage remains an issue on low-memory systems.  My old work PC simply cannot handle running dmd without grinding to a halt because there's just not enough RAM to go around (only 500MB total).  It's generally not a problem for modern PCs with at least a few GB of memory.  Walter chose compilation speed over memory efficiency, so that's just the way it is.  In theory, one could turn on GC in the compiler so that it will work on low-memory systems, but I'm not sure if such a change will be accepted into dmd.


T

-- 
Questions are the beginning of intelligence, but the fear of God is the beginning of wisdom.
August 30, 2017
On Wednesday, 30 August 2017 at 16:39:32 UTC, H. S. Teoh wrote:
> On Wed, Aug 30, 2017 at 08:01:17AM +0000, via Digitalmars-d wrote: [...]
>> D supports separate compilation by design, i.e., producing the final executable doesn't require the compiler to load into memory all the source files corresponding to all the object files being linked.
>
> Yes, I think the general recommendation for very large codebases is to compile one package at a time, i.e., one subdir at a time.

Yep, that avoids parsing the same file over and over again (as would be the case in C++, with commonly used header files re-parsed for each translation unit), yet provides enough granularity for parallelism and also somewhat limits the memory problem.

> Nevertheless, compiler memory usage remains an issue on low-memory systems.  My old work PC simply cannot handle running dmd without grinding to a halt because there's just not enough RAM to go around (only 500MB total).  It's generally not a problem for modern PCs with at least a few GB of memory.  Walter chose compilation speed over memory efficiency, so that's just the way it is.  In theory, one could turn on GC in the compiler so that it will work on low-memory systems, but

(Nods)

> I'm not sure if such a change will be accepted into dmd.
>
>
> T

We could always make that a compiler switch. I'm not sure if a substantial benefit could be gained during parsing and semantic analysis (e.g. consider staticIota - each instantiation needs to be kept in memory, since you never know when it will be needed again), but certainly once you convert the AST to the backend IR, you could immediately drop all memory allocated by the frontend.
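To illustrate the staticIota point, here is a simplified recursive sketch (not the actual Phobos implementation): every distinct `(begin, end)` pair creates an instantiation the compiler must keep around, since it cannot know whether that instantiation will be requested again later.

```d
import std.meta : AliasSeq;

// Simplified sketch: each (begin, end) pair below is a separate
// template instantiation retained in compiler memory.
template staticIota(int begin, int end)
{
    static if (begin >= end)
        alias staticIota = AliasSeq!();
    else
        alias staticIota = AliasSeq!(begin, staticIota!(begin + 1, end));
}

// staticIota!(0, 4) transitively instantiates (1,4), (2,4), (3,4), (4,4).
static assert([staticIota!(0, 4)] == [0, 1, 2, 3]);
```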
August 31, 2017
On Wednesday, 30 August 2017 at 16:34:13 UTC, H. S. Teoh wrote:
>
> The CTFE problem will be fixed soon, once Stefan Koch finishes his newCTFE engine.
>
> Templates are still an area where a lot of improvement can be made. But Stefan has indicated interest in visiting this area in the compiler after the newCTFE project is done.  So eventually we'll get there.
>
>
> T

Yes, that is true.
However, since I've just taken a full-time job, I cannot say when it will be finished.

As for templates, I have chosen to work around the issue entirely and integrate type-manipulation and introspection abilities with CTFE (which is inherently more scalable than templates).
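A rough sketch of the difference in mechanism (illustrative only; this is not the newCTFE design itself): the CTFE version is one ordinary function evaluated like regular code, while the template version allocates one instantiation per recursion step.

```d
// CTFE: a single ordinary function; compile-time evaluation creates
// no per-step template instantiations.
int sumTo(int n)
{
    int total = 0;
    foreach (i; 0 .. n + 1)
        total += i;
    return total;
}

enum viaCtfe = sumTo(100); // enum initializer forces CTFE

// Template recursion: SumTo!100, SumTo!99, ... are each separate
// instantiations the compiler allocates and keeps.
template SumTo(int n)
{
    static if (n == 0)
        enum SumTo = 0;
    else
        enum SumTo = n + SumTo!(n - 1);
}

static assert(viaCtfe == 5050 && SumTo!100 == 5050);
```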
August 31, 2017
On Thu, Aug 31, 2017 at 04:36:41PM +0000, Stefan Koch via Digitalmars-d wrote:
> On Wednesday, 30 August 2017 at 16:34:13 UTC, H. S. Teoh wrote:
> > 
> > The CTFE problem will be fixed soon, once Stefan Koch finishes his newCTFE engine.
> > 
> > Templates are still an area where a lot of improvement can be made. But Stefan has indicated interest in visiting this area in the compiler after the newCTFE project is done.  So eventually we'll get there.
> > 
> > 
> > T
> 
> Yes that is true.
> However since I've just taken a full-time job I cannot say when it
> will be finished.
> 
> As for templates, I have chosen to work around the issue entirely and integrate type-manipulation and introspection abilities with CTFE (which is inherently more scalable than templates).

I think D's conception of templates, which IMO goes beyond C++'s (conceptually, anyway; the current implementation is fundamentally not too different), can be implemented in a much better way that makes it more scalable.

C++ sees templates as code stencils to be instantiated with the given template arguments, i.e., copy-n-paste the stencil and substitute template parameters with the given arguments.

D's templates for the most part still retain that, and certainly the current implementation essentially just follows the C++ model. But IMO there's an important conceptual difference: D sees template parameters not so much as parameters for code stencils to be copy-n-pasted, but as *compile-time* parameters to a function / set of declarations.  I.e., they are parameters that are known at compile time rather than at runtime.

The distinction is subtle but IMO important, because it allows us to break out of the C++ mold of copy-n-paste-this-stencil that is one of the causes of template scalability problems (pass too many different template arguments, and you get massive code bloat, one copy per instantiation).  By treating template arguments as arguments known at compile-time instead, the compiler can get smarter with when to copy-n-paste a declaration, and when to just evaluate the template with the argument without actually instantiating anything (no copying of AST nodes, etc.).

A common example is a templated linked-list container, where much of the linked-list pointer manipulation code is actually independent of the type of the contained data.  The compiler really only needs to generate distinct copies of the code when the data itself is being manipulated; all the other code can be shared between template instantiations.
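A hypothetical sketch of that separation (type and method names are made up): the traversal code below never touches the payload type, so in principle one compiled copy could serve every instantiation.

```d
struct List(T)
{
    static struct Node
    {
        Node* next;
        T value;
    }

    Node* head;

    // Depends on T: constructs a node holding the payload.
    void prepend(T value)
    {
        head = new Node(head, value);
    }

    // Type-independent in spirit: only walks next pointers, so a
    // smarter compiler could share this code across instantiations.
    size_t length() const
    {
        size_t n = 0;
        for (auto p = head; p !is null; p = p.next)
            ++n;
        return n;
    }
}

void main()
{
    List!int xs;
    xs.prepend(1);
    xs.prepend(2);
    assert(xs.length == 2);
}
```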

This is just one example off the top of my head; I'm sure there are plenty of others that we can come up with once we break free of the C++ mold of thinking of templates as "copy-n-paste except substitute X with Y".  Another example that just occurred to me is the various recursive templates in std.meta for manipulating AliasSeq's.  There is actually no need for the compiler to instantiate anything at all, except perhaps for the final result. By treating the AliasSeq as an in-compiler data type with compile-time operations performed on it, the compiler ought to be able to evaluate the template without needing to instantiate anything past the first level of recursion.
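For example, a simplified staticMap-style recursion (not the actual std.meta code) instantiates itself once per element of the AliasSeq, even though the computation is conceptually a single pass over a compile-time list:

```d
import std.meta : AliasSeq;

// Simplified staticMap sketch: one instantiation per list element,
// even though the work is conceptually a single loop.
template Map(alias F, Args...)
{
    static if (Args.length == 0)
        alias Map = AliasSeq!();
    else
        alias Map = AliasSeq!(F!(Args[0]), Map!(F, Args[1 .. $]));
}

alias AddPointer(T) = T*; // hypothetical element-wise operation

static assert(is(Map!(AddPointer, int, char)[0] == int*));
static assert(is(Map!(AddPointer, int, char)[1] == char*));
```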


T

-- 
"How are you doing?" "Doing what?"
August 31, 2017
On Thursday, 31 August 2017 at 17:39:35 UTC, H. S. Teoh wrote:

>
> This is just one example off the top of my head; I'm sure there are plenty of others that we can come up with once we break free of the C++ mold of thinking of templates as "copy-n-paste except substitute X with Y".  Another example that just occurred to me is the various recursive templates in std.meta for manipulating AliasSeq's.  There is actually no need for the compiler to instantiate anything at all, except perhaps for the final result. By treating the AliasSeq as an in-compiler data type with compile-time operations performed on it, the compiler ought to be able to evaluate the template without needing to instantiate anything past the first level of recursion.
>
>
> T

That is a good example, and it is one where we could maybe deduce what is going on.
However, classifying a template (or rather, deducing which instantiations are unused) actually requires us to instantiate the template,
because we cannot predict what it might evaluate to given specific parameters.
For this to work we would have to produce all possible instantiation behaviors of a template and deduplicate that set.
That set happens to be infinite, in most cases.
Even predicting whether the set of instantiations is finite is essentially an instance of the halting problem.

I've worked on this for some time until I realized that I was fighting without gaining ground.

It would end up as a myriad of special cases and would be impossible to maintain.

Considering that the current template system, which does not try anything smart, is already fiendishly complicated, I'd rather not introduce an even more complicated template-instance-shape predictor that would probably work only on the simplest of cases and is not guaranteed to bring wins that would justify the time it would take to implement.

TL;DR

The change to optimize certain groups of templates will require MASSIVE amounts of work and is unlikely to pay for itself.