December 16, 2011
On Friday, 16 December 2011 at 21:28:03 UTC, Steven Schveighoffer wrote:
> In short, dlls will solve the problem, let's work on that instead of shuffling around code.

I wouldn't want to cripple either - put all the reflection
info in the dll, but keep it sufficiently decoupled so the
linker can strip it out when statically linking.

The effort in decoupling most of the code isn't great.
December 16, 2011
On 16.12.2011 at 22:45, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> On 12/16/11 3:38 PM, Trass3r wrote:
>> A related issue is phobos being an intermodule dependency monster.
>> A simple hello world pulls in almost 30 modules!
>> And std.stdio is supposed to be just a simple wrapper around C FILE.
>
> In fact it doesn't (after yesterday's commit). The std code in hello, world is a minuscule 3KB. The rest of 218KB is druntime.

Yep, the 30 modules is a measure I took before that commit.

> Once we solve the static constructor issue, function-level linking should take care of pulling only the minimum needed.

Also by pulling in I just meant the imports.
But the planned lazy semantic analysis should improve the situation.
December 16, 2011
Andrei Alexandrescu Wrote:

> On 12/16/11 3:38 PM, Trass3r wrote:
> > A related issue is phobos being an intermodule dependency monster.
> > A simple hello world pulls in almost 30 modules!
> > And std.stdio is supposed to be just a simple wrapper around C FILE.
> 
> In fact it doesn't (after yesterday's commit). The std code in hello, world is a minuscule 3KB. The rest of 218KB is druntime.
> 
> Once we solve the static constructor issue, function-level linking should take care of pulling only the minimum needed.
> 
> One interesting fact is that a lot of issues that I tended to take non-critically ("templates cause bloat", "intermodule dependencies cause bloat", "static linking creates large programs") looked a whole lot differently when I looked closer at causes and effects.
> 
> 
> Andrei

http://wiki.freepascal.org/Size_Matters

Otherwise a great language that never managed to shed its "bloated" reputation. Many people stopped using it because of that, including me. I guess people do not like bloat when programming systems stuff.

December 16, 2011
On 12/16/11 3:28 PM, Steven Schveighoffer wrote:
> On Fri, 16 Dec 2011 14:48:33 -0500, Andrei Alexandrescu
> <SeeWebsiteForEmail@erdani.org> wrote:
>
>> On 12/16/11 1:23 PM, Steven Schveighoffer wrote:
>>> I disagree with this assessment. It's good to know the cause of the
>>> problem, but let's look at the root issue -- reflection. The only reason
>>> to include class information for classes not being referenced is to be
>>> able to construct/use classes at runtime instead of at compile time. But
>>> if you look at D's runtime reflection capabilities, they are quite poor.
>>> You can only construct a class at runtime if it has a zero-arg
>>> constructor.
>>>
>>> So essentially, we are paying the penalty of having runtime reflection
>>> in terms of bloat, but get very very little benefit.
>>
>> I'd almost agree, but the code shown doesn't use Object.factory(). So
>> that shouldn't be linked in, and shouldn't pull vtables.
>
> You cannot know until link time whether factory is used when compiling
> individual files. By then it's probably too late to exclude them.

I'm not an expert in linkers, but my understanding is that linkers naturally remove unused object files. That, coupled with dmd's ability to break compilation output into many pseudo-object files, would take care of the matter. Truth be told, once you link in Object.factory(), bam - all classes are linked.

> The point is that you can instantiate unreferenced classes simply by
> calling them out by name.

Yah, but you must call a function to do that.

> I'm not pushing for runtime reflection, all I'm saying is, I don't think
> it's worth changing how the library is written to work around something
> because the *compiler* is incorrectly implemented/designed.
>
> So why don't we just leave the code size situation as-is? 500kb is not a
> terribly significant amount, but dlls are on the horizon (Walter has
> publicly said so). Then size becomes a moot point.
>
> If we get reflection, then you will find that having excluded all the
> runtime information when not used is going to hamper D's reflection
> capability, and we'll probably have to start putting it back in anyway.
>
> In short, dlls will solve the problem, let's work on that instead of
> shuffling around code.

I think there are more issues with static this() than simply executable size, as discussed. Also, adding dynamic linking capability does not mean we give up on static linking. A lot of programs use static linking by choice, and for good reasons.


Andrei


December 16, 2011
On 12/16/2011 10:53 PM, Trass3r wrote:
> On 16.12.2011 at 22:45, Andrei Alexandrescu
> <SeeWebsiteForEmail@erdani.org> wrote:
>
>> On 12/16/11 3:38 PM, Trass3r wrote:
>>> A related issue is phobos being an intermodule dependency monster.
>>> A simple hello world pulls in almost 30 modules!
>>> And std.stdio is supposed to be just a simple wrapper around C FILE.
>>
>> In fact it doesn't (after yesterday's commit). The std code in hello,
>> world is a minuscule 3KB. The rest of 218KB is druntime.
>
> Yep, the 30 modules is a measure I took before that commit.
>
>> Once we solve the static constructor issue, function-level linking
>> should take care of pulling only the minimum needed.
>
> Also by pulling in I just meant the imports.
> But the planned lazy semantic analysis should improve the situation.

I think it is already lazy?
---
module a;

void foo(){
    imanundefinedsymbolandcauseacompileerror();
}
---
---
module b;
import a;

void main(){
    foo();
}
---

$ dmd -c b # compiles fine

December 16, 2011
On Friday, 16 December 2011 at 21:45:43 UTC, Andrei Alexandrescu wrote:
> Once we solve the static constructor issue, function-level linking should take care of pulling only the minimum needed.

This sounds fantastic.

> One interesting fact is that a lot of issues that I tended to take non-critically ("templates cause bloat", "intermodule dependencies cause bloat", "static linking creates large programs") looked a whole lot differently when I looked closer at causes and effects.

I'd be careful not to overgeneralize from this though; templates
do have the potential to bloat things up, etc. Though
static linking has and always shall rock.


(For bloated templates, I had a monster of one in web.d:
refactoring some of it into regular functions shrank the
binary by about three megabytes. It shaved two seconds off the
compile time too! Note this binary is my work project, so your
results may vary with my library.

It was basically inlining several kilobytes of the same stuff
into hundreds of different functions... 10 kb * 300 functions
= lots of code.)
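
To illustrate the kind of refactoring described above, here is a minimal sketch. It is not the actual web.d code; `render` and `wrapField` are hypothetical names. The idea is to keep only the type-dependent part inside the template and move the shared bulk into an ordinary function that is compiled once.

```d
import std.conv : to;

// Non-template helper (hypothetical): compiled once, however many
// types the template below is instantiated with.
string wrapField(string name, string value)
{
    return "<field name=\"" ~ name ~ "\">" ~ value ~ "</field>";
}

// Thin template (hypothetical): only the conversion to string is
// duplicated per instantiation; the formatting lives in wrapField.
string render(T)(string name, T value)
{
    return wrapField(name, to!string(value));
}

void main()
{
    assert(render("age", 42) == `<field name="age">42</field>`);
    assert(render("ok", true) == `<field name="ok">true</field>`);
}
```

How much this saves depends on how large the shared part is and how many instantiations exist, so the numbers above should not be expected to carry over.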
December 16, 2011
On Fri, 16 Dec 2011 17:00:45 -0500, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> On 12/16/11 3:28 PM, Steven Schveighoffer wrote:
>> On Fri, 16 Dec 2011 14:48:33 -0500, Andrei Alexandrescu
>> <SeeWebsiteForEmail@erdani.org> wrote:
>>
>>> On 12/16/11 1:23 PM, Steven Schveighoffer wrote:
>>>> I disagree with this assessment. It's good to know the cause of the
>>>> problem, but let's look at the root issue -- reflection. The only reason
>>>> to include class information for classes not being referenced is to be
>>>> able to construct/use classes at runtime instead of at compile time. But
>>>> if you look at D's runtime reflection capabilities, they are quite poor.
>>>> You can only construct a class at runtime if it has a zero-arg
>>>> constructor.
>>>>
>>>> So essentially, we are paying the penalty of having runtime reflection
>>>> in terms of bloat, but get very very little benefit.
>>>
>>> I'd almost agree, but the code shown doesn't use Object.factory(). So
>>> that shouldn't be linked in, and shouldn't pull vtables.
>>
>> You cannot know until link time whether factory is used when compiling
>> individual files. By then it's probably too late to exclude them.
>
> I'm not an expert in linkers, but my understanding is that linkers naturally remove unused object files. That, coupled with dmd's ability to break compilation output into many pseudo-object files, would take care of the matter. Truth be told, once you link in Object.factory(), bam - all classes are linked.

Factory doesn't directly reference classes, it does so through the moduleinfo tree/array (not sure what it is).  So the way it works is, the linker includes the module info because it's defined as static data, which includes the vtable functions, and factory can instantiate non-referenced classes because of this fact, not the other way around.
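
As a rough sketch of the mechanism described above (the class `Greeter` is hypothetical, and this assumes a druntime where `ModuleInfo` supports `foreach` and exposes `localClasses`), the following program never names `Greeter` in a `new` expression or a cast, yet it can still find and instantiate it through the module info:

```d
import std.stdio : writeln;

// Hypothetical class: nothing below refers to it by symbol,
// only by its fully qualified name as a string.
class Greeter
{
    override string toString() { return "hello from Greeter"; }
}

void main()
{
    // Walk the per-module records kept in the binary as static data
    // and list the classes each module declares.
    foreach (m; ModuleInfo)
        foreach (c; m.localClasses)
            if (c.name == __MODULE__ ~ ".Greeter")
                writeln("found class info for ", c.name);

    // Object.factory looks the name up in that same data and calls the
    // zero-argument constructor; it returns null when nothing matches.
    Object o = Object.factory(__MODULE__ ~ ".Greeter");
    assert(o !is null);
    assert(o.toString() == "hello from Greeter");
}
```

Because the lookup goes through data rather than through a symbol reference, the linker has no way to tell that `Greeter` is unused just by checking whether `Object.factory` is called.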

>> I'm not pushing for runtime reflection, all I'm saying is, I don't think
>> it's worth changing how the library is written to work around something
>> because the *compiler* is incorrectly implemented/designed.
>>
>> So why don't we just leave the code size situation as-is? 500kb is not a
>> terribly significant amount, but dlls are on the horizon (Walter has
>> publicly said so). Then size becomes a moot point.
>>
>> If we get reflection, then you will find that having excluded all the
>> runtime information when not used is going to hamper D's reflection
>> capability, and we'll probably have to start putting it back in anyway.
>>
>> In short, dlls will solve the problem, let's work on that instead of
>> shuffling around code.
>
> I think there are more issues with static this() than simply executable size, as discussed. Also, adding dynamic linking capability does not mean we give up on static linking. A lot of programs use static linking by choice, and for good reasons.

Even statically linked programs might use runtime reflection.

I agree the issue is not static linking vs. dynamic linking, but dynamic linking would hide the problem quite well.

Note that on Linux today, the executable is not truly static -- OS libs are dynamically linked.

Another option is to disable runtime reflection via a compiler switch (which would sever the ties between moduleinfo and classinfo).  Then we simply must make sure we don't use factory in the library anywhere.

-Steve
December 16, 2011
On Fri, 16 Dec 2011 16:48:18 -0500, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> On 12/16/11 3:43 PM, Steven Schveighoffer wrote:
>> On Fri, 16 Dec 2011 15:58:28 -0500, Jonathan M Davis
>> <jmdavisProg@gmx.com> wrote:
>>
>>> On Friday, December 16, 2011 14:44:48 Andrei Alexandrescu wrote:
>>
>>>> As another matter, there is value in minimizing compulsive work during
>>>> library startup. Consider for example this code in std.datetime:
>>>>
>>>> shared static this()
>>>> {
>>>>     tzset();
>>>>     _localTime = new immutable(LocalTime)();
>>>> }
>>>>
>>>> This summons the garbage collector right off the bat, thus wiping off
>>>> anyone's chance of compiling and linking without a GC - as many people
>>>> seem to want to do. And that happens not to programs that import and use
>>>> std.datetime, but to programs using any part of the standard library that
>>>> transitively imports std.datetime, even for the most innocuous uses, and
>>>> even if they never, ever use _localTime! That one line essentially locks
>>>> out 75% of the standard library to anyone wishing to ever avoid using
>>>> the GC.
>>>
>>> This, on the other hand, is of much greater concern, and is a much better
>>> argument for using the ugly casting necessary to get rid of the static
>>> constructors, even if the compiler did a fantastic job at cutting out
>>> the extra cruft in the binary - though as far as the GC goes, it might
>>> not be an issue once CTFE is good enough to create classes at compile
>>> time that still exist at runtime. Unfortunately, the necessity of tzset
>>> would remain however.
>>
>> This can be solved with malloc and emplace
>
> Sure you meant static ubyte[__traits(classInstanceSize, T)]
> and emplace :o).

That works too!
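
For reference, a minimal sketch of the static-buffer-plus-emplace approach. `Zone`, `zoneStorage` and `theZone` are hypothetical names, not the std.datetime code. The `align(16)` is an assumption that is generous enough for this small class, and the lazy initialization is not thread-safe as written.

```d
import std.conv : emplace;

// Hypothetical stand-in for a class such as LocalTime.
final class Zone
{
    int offsetMinutes;
    this(int offsetMinutes) { this.offsetMinutes = offsetMinutes; }
}

// Static storage big enough for one Zone instance; no GC allocation.
align(16) __gshared ubyte[__traits(classInstanceSize, Zone)] zoneStorage;
__gshared Zone zoneInstance;

// Constructs the object in place on first use (not thread-safe here).
Zone theZone()
{
    if (zoneInstance is null)
        zoneInstance = emplace!Zone(zoneStorage[], 60);
    return zoneInstance;
}

void main()
{
    auto z = theZone();
    assert(z.offsetMinutes == 60);
    assert(z is theZone());                   // same object every time
    assert(cast(void*) z is zoneStorage.ptr); // lives in the static buffer
}
```

Constructing on first use rather than in a static constructor also means a program that never calls `theZone()` pays nothing at startup.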

-Steve
December 16, 2011
On Fri, 16 Dec 2011 16:48:47 -0500, Adam D. Ruppe <destructionator@gmail.com> wrote:

> On Friday, 16 December 2011 at 21:28:03 UTC, Steven Schveighoffer wrote:
>> In short, dlls will solve the problem, let's work on that instead of shuffling around code.
>
> I wouldn't want to cripple either - put all the reflection
> info in the dll, but keep it sufficiently decoupled so the
> linker can strip it out when statically linking.
>
> The effort in decoupling most of the code isn't great.

The only way I can think of to decouple it is to disable it with a compiler switch, since the compiler is the one including the info.

I envision a nasty world where libraries are built 4 ways, with two orthogonal factors -- dynamic vs. static, and reflection vs. no reflection.  Oh, hello Visual C++, what are you doing here?

-Steve
December 16, 2011
On 16.12.2011 22:28, Steven Schveighoffer wrote:
> In short, dlls will solve the problem, let's work on that instead of
> shuffling around code.

How exactly do they solve the problem?  An exe plus a DLL version of the library will usually be larger than just a statically linked exe.