December 16, 2011
On 2011-12-16 20:23, Steven Schveighoffer wrote:
>
> I disagree with this assessment. It's good to know the cause of the
> problem, but let's look at the root issue -- reflection. The only reason
> to include class information for classes not being referenced is to be
> able to construct/use classes at runtime instead of at compile time. But
> if you look at D's runtime reflection capabilities, they are quite poor.
> You can only construct a class at runtime if it has a zero-arg constructor.

It's not very useful as is, but you can create your own version that doesn't call the constructor and that can be more useful sometimes. I'm using that technique in my serialization library and providing a special method that can act as a constructor.
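
A minimal sketch of that idea, assuming a current druntime where `typeid(T).initializer` is available (older releases named it differently). The class `Point` and the helper `createWithoutCtor` are hypothetical names for illustration, not code from the serialization library:

```d
import core.memory : GC;
import core.stdc.string : memcpy;

// Hypothetical example class: it has no zero-arg constructor.
class Point
{
    int x = 1;
    int y;
    this(int y) { this.y = y; }
}

// Allocate a T and copy in its static initializer image (vtable pointer
// and field defaults) without running any constructor.
T createWithoutCtor(T)() if (is(T == class))
{
    const(void)[] image = typeid(T).initializer;
    void* mem = GC.malloc(image.length);
    memcpy(mem, image.ptr, image.length);
    return cast(T) mem;
}

void main()
{
    auto p = createWithoutCtor!Point();
    assert(p.x == 1); // field initializer was applied
    assert(p.y == 0); // the constructor never ran
}
```

A deserializer could then fill in the fields itself, or call a designated method in place of the constructor.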


-- 
/Jacob Carlborg
December 16, 2011
On Friday, December 16, 2011 14:44:48 Andrei Alexandrescu wrote:
> On 12/16/11 1:41 PM, Jonathan M Davis wrote:
> I understand and empathize with the sentiment, and I agree with most of
> the technical points at face value, save for a few details. But there
> are other things at stake.
> 
> Consider scope. Many arguments applicable to application code are not quite fit for the standard library. The stdlib is where the compiler innards, the runtime innards, and the OS innards all meet, and the role of the stdlib is to provide nice abstractions to client code. Inside the stdlib it's entirely expected to find things like __traits that almost nobody has heard of, casts, and other things that would normally be shunned in application code. I'd be more worried if there was no possibility to do what we need to do. The standard library is not a place to play it nice. We can't afford to say "well yeah everyone's binary is bloated and slower to start but we didn't like the cast that would have taken care of that".

I'm not completely against this precisely because of this, but at the same time, it strikes me as completely ridiculous to have to resort to some nasty casting simply to reduce the binary size of the base executable. I'd much rather see the compiler improved such that this isn't necessary.

> As another matter, there is value in minimizing compulsive work during library startup. Consider for example this code in std.datetime:
> 
> shared static this()
> {
>     tzset();
>     _localTime = new immutable(LocalTime)();
> }
> 
> This summons the garbage collector right off the bat, thus wiping off anyone's chance of compiling and linking without a GC - as many people seem to want to do. And that happens not to programs that import and use std.datetime, but to programs using any part of the standard library that transitively imports std.datetime, even for the most innocuous uses, and even if they never, ever use _localTime! That one line essentially locks out 75% of the standard library to anyone wishing to ever avoid using the GC.

This, on the other hand, is of much greater concern, and is a much better argument for using the ugly casting necessary to get rid of the static constructors, even if the compiler did a fantastic job at cutting out the extra cruft in the binary - though as far as the GC goes, it might not be an issue once CTFE is good enough to create classes at compile time that still exist at runtime. Unfortunately, the necessity of tzset would remain, however.
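
For reference, the lazy-initialization alternative to the static constructor could look roughly like the sketch below. `LocalTimeLike` and `localTime` are hypothetical stand-ins, not the actual std.datetime code, and the sketch deliberately sidesteps the `immutable` and `pure` requirements, which are exactly where the casting would come in:

```d
// Hypothetical stand-in for std.datetime's LocalTime; not the real class.
final class LocalTimeLike
{
    int offsetMinutes;
}

// Create the instance on first use instead of in `shared static this()`.
// The static variable is thread-local, so each thread builds its own
// copy and no lock is needed.
LocalTimeLike localTime()
{
    static LocalTimeLike instance; // null until the first call
    if (instance is null)
        instance = new LocalTimeLike();
    return instance;
}

void main()
{
    auto a = localTime();
    auto b = localTime();
    assert(a is b); // later calls reuse the same object
}
```

The first call still allocates with the GC, but a program that never calls `localTime()` never pays for it.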

> > And honestly, I think that a far worse problem with static constructors is circular dependencies. _That_ is something that needs to be addressed with regards to static constructors. In general at this point, it's looking like static constructors are turning out to be a bit of a failure on some level, given the issues that we're having because of them, and I think that we should fix the language and/or compiler so that they _aren't_ a failure.
> 
> Here I totally disagree. The design is sound. The issues discussed here are entirely detail implementation artifacts.

As far as the binary size goes, I completely agree that it's an implementation issue, but I definitely think that the issue with circular dependencies is a design issue which needs to be addressed. The basics of static constructors wouldn't have to change drastically, but there should at least be a way to indicate to the compiler that there is not actually a circular dependency. I don't think that I have ever seen druntime blow up on a circular dependency where there was actually a circular dependency. It's just that the compiler (or druntime or both) isn't smart enough to determine whether the static constructors _actually_ create a circular dependency. It has no way of determining which module's static constructors should be called first and gives up. We need a way to give it that information so that it can order them when they aren't actually interdependent. _That_ is the design flaw that I see in static constructors, and it's one of the most annoying issues in the language IMHO (which arguably just goes to show how good D is in general, I suppose).

- Jonathan M Davis
December 16, 2011
On 2011-12-16 21:49, Andrei Alexandrescu wrote:
> On 12/16/11 2:47 PM, Jacob Carlborg wrote:
>> I don't think it's completely separate. Can the compiler know if runtime
>> reflection is used or not?
>
> Yes. Reflection is used if reflection primitive functions are called.
>
> Andrei

Yeah, but how does the compiler know which functions are the primitive ones? Are they hard-coded in the compiler? Or perhaps the compiler already needs to know this.

-- 
/Jacob Carlborg
December 16, 2011
On Fri, 16 Dec 2011 14:48:33 -0500, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> On 12/16/11 1:23 PM, Steven Schveighoffer wrote:
>> I disagree with this assessment. It's good to know the cause of the
>> problem, but let's look at the root issue -- reflection. The only reason
>> to include class information for classes not being referenced is to be
>> able to construct/use classes at runtime instead of at compile time. But
>> if you look at D's runtime reflection capabilities, they are quite poor.
>> You can only construct a class at runtime if it has a zero-arg constructor.
>>
>> So essentially, we are paying the penalty of having runtime reflection
>> in terms of bloat, but get very very little benefit.
>
> I'd almost agree, but the code shown doesn't use Object.factory(). So that shouldn't be linked in, and shouldn't pull vtables.

You cannot know until link time whether factory is used when compiling individual files.  By then it's probably too late to exclude them.  The point is that you can instantiate unreferenced classes simply by calling them out by name.
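
A small sketch of what "calling them out by name" means in practice. The module name `demo` and the classes are made up for illustration. As far as I know, `Object.factory` returns null both for unknown names and for classes without a default constructor:

```d
module demo;

// Never referenced by type below, only by its name as a string.
class Greeter
{
    string greet() { return "hello"; }
}

// Has no zero-arg constructor, so the factory cannot build it.
class NeedsArg
{
    this(int) {}
}

void main()
{
    // The fully qualified name is resolved at run time.
    Object o = Object.factory("demo.Greeter");
    auto g = cast(Greeter) o;
    assert(g !is null && g.greet() == "hello");

    assert(Object.factory("demo.NeedsArg") is null);
    assert(Object.factory("demo.Missing") is null);
}
```

Since the string could just as well come from user input or a file, the compiler cannot tell from any one module which classes are safe to drop.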

>
>> I think there are two things that need to be considered:
>>
>> 1. We eventually should have some reasonably complete runtime reflection
>> capability
>> 2. Runtime reflection and shared libraries go hand-in-hand. With shared
>> library support, the bloat penalty isn't nearly as significant.
>>
>> I don't think the right answer is to avoid using features of the
>> language because the compiler/runtime has some design deficiencies. At
>> some point these deficiencies will be fixed, and then we are left with a
>> library that has seemingly odd design choices that we can't change.
>
> Runtime reflection is great, but I think it's a separate issue from what's discussed here.

I'm not pushing for runtime reflection, all I'm saying is, I don't think it's worth changing how the library is written to work around something because the *compiler* is incorrectly implemented/designed.

So why don't we just leave the code size situation as-is?  500kb is not a terribly significant amount, but dlls are on the horizon (Walter has publicly said so).  Then size becomes a moot point.

If we get reflection, then you will find that having excluded all the runtime information when not used is going to hamper D's reflection capability, and we'll probably have to start putting it back in anyway.

In short, dlls will solve the problem, let's work on that instead of shuffling around code.

-Steve
December 16, 2011
A related issue is phobos being an intermodule dependency monster.
A simple hello world pulls in almost 30 modules!
And std.stdio is supposed to be just a simple wrapper around C FILE.
December 16, 2011
On Fri, 16 Dec 2011 16:28:03 -0500, Steven Schveighoffer <schveiguy@yahoo.com> wrote:

> So why don't we just leave the code size situation as-is?  500kb is not a terribly significant amount, but dlls are on the horizon (Walter has publicly said so).  Then size becomes a moot point.
>
> If we get reflection, then you will find that having excluded all the runtime information when not used is going to hamper D's reflection capability, and we'll probably have to start putting it back in anyway.
>
> In short, dlls will solve the problem, let's work on that instead of shuffling around code.

The other valid option I see is removing the link to the virtual tables, thereby disabling reflection via factory until we can implement full reflection.

-Steve
December 16, 2011
On 12/16/2011 09:31 PM, Jonathan M Davis wrote:
> On Friday, December 16, 2011 21:06:49 Timon Gehr wrote:
>> On 12/16/2011 08:41 PM, Jonathan M Davis wrote:
>>> On Friday, December 16, 2011 12:29:18 Andrei Alexandrescu wrote:
>>>> Jonathan, could I impose on you to replace all static cdtors in
>>>> std.datetime with lazy initialization? I looked through it and it
>>>> strikes me as a reasonably simple job, but I think you'd know better
>>>> what to do than me.
>>>>
>>>> A similar effort could be conducted to reduce or eliminate static
>>>> cdtors
>>>> from druntime. I made the experiment of commenting them all, and that
>>>> reduced the size of the baseline from 218KB to 200KB. This is a good
>>>> amount, but not as dramatic as what we can get by working on
>>>> std.datetime.
>>>
>>> Hmm. I had a reply for this already, but it seems to have disappeared, so
>>> I'll try again.
>>>
>>> You could make core.time use property functions instead of the static
>>> immutable variables that it's using now for ticksPerSec and appOrigin,
>>> but in order to do that right would require introducing a mutex or
>>> synchronized block (which is really just a mutex under the hood
>>> anyway), and I'm loath to do that in time-related code. ticksPerSec
>>> gets used all over the place in TickDuration, and that could have a
>>> negative impact on performance for something that needs to be really
>>> fast (since it's used in stuff like StopWatch and benchmarking). On top
>>> of that, in order to maintain the current semantics, the property
>>> functions would have to be pure, which they can't be without doing some
>>> nasty casting to convince the compiler that stuff which isn't pure is
>>> actually pure.
>>
>> lazy variables would resolve this.
>
> True, but we don't have them.
>
>> Circular dependencies are not to be blamed on the design of static
>> constructors.
>
> Yes they are.

No. They arise from the design of the module hierarchy.

> static constructors completely chicken out on them. Not only is
> there no real attempt to determine whether the static constructors are
> actually dependent (which granted, isn't an easy problem),

I don't think that is an option.

> but there is _zero_ support in the language for resolving such circular dependencies. There's no
> way to say that they _aren't_ dependent even if you can clearly see that they
> aren't.

Yes there is. The compiler and runtime understand that they are not mutually dependent if their modules are not mutually dependent. Package level is the right level for dealing with such issues because the circular dependencies are a modularity problem.

> The solution used in Phobos (which won't work in std.datetime due to
> the use of immutable and pure) is to create a C module which has the code from
> the static constructor and then have a separate module which calls it in its
> static constructor.

You don't need a C function if you just factor out every variable it initializes to the separate D module. __fileinit.d works that way. I don't see why stdiobase.d could not do the same.

> It works, but it's not pretty (and it doesn't always work
> - e.g. std.datetime), and it would be _far_ better if you could just mark a
> static constructor as not depending on anything or mark it as not depending on
> a specific module or something similar.

How would that be checked?

> And given how disgusting it generally
> is to even figure out what's causing a circular dependency when the runtime
> won't start your program because of it, I really think that this is a problem
> which should be resolved. static constructors need to be improved.
>

Nobody has figured out how to solve the problem of modular global data initialization. That is because there probably is no solution.

>>> In general at this point, it's looking like
>>> static constructors are turning out to be a bit of a failure on some
>>> level, given the issues that we're having because of them, and I think
>>> that we should fix the language and/or compiler so that they _aren't_ a
>>> failure.
>>>
>>> - Jonathan M Davis
>>
>> We are having (minor!!) problems because the task of initializing global
>> data in a modular way is inherently hard.
>>
>> Just have a look how other languages handle initialization of global
>> data and you'll notice that the D solution is actually very sensible.
>
> Yes. The situation with D is better than that of many other languages, but
> what problems we do have can be _really_ annoying to deal with. Having to deal
> with circular dependencies due to static module constructors which aren't
> actually interdependent is one of the most annoying issues in D IMHO.
>

Adding a language construct that turns off the checking entirely (as you seem to suggest) is not at all better than having to create a few additional source files.



December 16, 2011
On Fri, 16 Dec 2011 15:58:28 -0500, Jonathan M Davis <jmdavisProg@gmx.com> wrote:

> On Friday, December 16, 2011 14:44:48 Andrei Alexandrescu wrote:

>> As another matter, there is value in minimizing compulsive work during
>> library startup. Consider for example this code in std.datetime:
>>
>> shared static this()
>> {
>>     tzset();
>>     _localTime = new immutable(LocalTime)();
>> }
>>
>> This summons the garbage collector right off the bat, thus wiping off
>> anyone's chance of compiling and linking without a GC - as many people
>> seem to want to do. And that happens not to programs that import and use
>> std.datetime, but to programs using any part of the standard library that
>> transitively imports std.datetime, even for the most innocuous uses, and
>> even if they never, ever use _localTime! That one line essentially locks
>> out 75% of the standard library to anyone wishing to ever avoid using
>> the GC.
>
> This, on the other hand, is of much greater concern, and is a much better
> argument for using the ugly casting necessary to get rid of the static
> constructors, even if the compiler did a fantastic job at cutting out the extra
> cruft in the binary - though as far as the GC goes, it might not be an issue
> once CTFE is good enough to create classes at compile time that still exist at
> runtime. Unfortunately, the necessity of tzset would remain however.

This can be solved with malloc and emplace
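
Roughly like this, as a sketch only. `LocalTimeLike` and `makeOnCHeap` are hypothetical stand-ins for the real std.datetime class, and the `immutable` part is left out:

```d
import core.stdc.stdlib : free, malloc;
import std.conv : emplace;

// Hypothetical stand-in for std.datetime's LocalTime.
final class LocalTimeLike
{
    int offsetMinutes = 60;
}

// Construct the object in C-heap memory, so the GC is never involved.
LocalTimeLike makeOnCHeap()
{
    enum size = __traits(classInstanceSize, LocalTimeLike);
    void* raw = malloc(size);
    assert(raw !is null, "malloc failed");
    return emplace!LocalTimeLike(raw[0 .. size]);
}

void main()
{
    auto lt = makeOnCHeap();
    assert(lt.offsetMinutes == 60);

    // Manual cleanup: run the destructor, then release the memory.
    destroy(lt);
    free(cast(void*) lt);
}
```

For a singleton that lives until the program exits, the cleanup step could simply be skipped.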

-Steve
December 16, 2011
On 12/16/11 3:38 PM, Trass3r wrote:
> A related issue is phobos being an intermodule dependency monster.
> A simple hello world pulls in almost 30 modules!
> And std.stdio is supposed to be just a simple wrapper around C FILE.

In fact it doesn't (after yesterday's commit). The std code in hello, world is a minuscule 3KB. The rest of 218KB is druntime.

Once we solve the static constructor issue, function-level linking should take care of pulling only the minimum needed.

One interesting fact is that a lot of issues that I tended to take non-critically ("templates cause bloat", "intermodule dependencies cause bloat", "static linking creates large programs") looked a whole lot different when I looked closer at causes and effects.


Andrei
December 16, 2011
On 12/16/11 3:43 PM, Steven Schveighoffer wrote:
> On Fri, 16 Dec 2011 15:58:28 -0500, Jonathan M Davis
> <jmdavisProg@gmx.com> wrote:
>
>> On Friday, December 16, 2011 14:44:48 Andrei Alexandrescu wrote:
>
>>> As another matter, there is value in minimizing compulsive work during
>>> library startup. Consider for example this code in std.datetime:
>>>
>>> shared static this()
>>> {
>>>     tzset();
>>>     _localTime = new immutable(LocalTime)();
>>> }
>>>
>>> This summons the garbage collector right off the bat, thus wiping off
>>> anyone's chance of compiling and linking without a GC - as many people
>>> seem to want to do. And that happens not to programs that import and use
>>> std.datetime, but to programs using any part of the standard library that
>>> transitively imports std.datetime, even for the most innocuous uses, and
>>> even if they never, ever use _localTime! That one line essentially locks
>>> out 75% of the standard library to anyone wishing to ever avoid using
>>> the GC.
>>
>> This, on the other hand, is of much greater concern, and is a much better
>> argument for using the ugly casting necessary to get rid of the static
>> constructors, even if the compiler did a fantastic job at cutting out
>> the extra
>> cruft in the binary - though as far as the GC goes, it might not be an
>> issue
>> once CTFE is good enough to create classes at compile time that still
>> exist at
>> runtime. Unfortunately, the necessity of tzset would remain however.
>
> This can be solved with malloc and emplace

Sure you meant static ubyte[__traits(classInstanceSize, T)]
and emplace :o).
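
A sketch of the static-buffer variant, under a few assumptions: `LocalTimeLike` and `staticInstance` are hypothetical names, the storage is declared as `void*[]` rather than `ubyte[]` purely so that it is guaranteed to be pointer-aligned, and the `immutable` qualifier is again left out:

```d
import std.conv : emplace;

// Hypothetical stand-in for std.datetime's LocalTime.
final class LocalTimeLike
{
    int offsetMinutes = 60;
}

// Build the object inside a static buffer on first use: no GC, no malloc.
LocalTimeLike staticInstance()
{
    enum size = __traits(classInstanceSize, LocalTimeLike);
    // Round up to whole pointers so the buffer is suitably aligned.
    enum words = (size + (void*).sizeof - 1) / (void*).sizeof;

    static void*[words] storage;   // thread-local backing memory
    static LocalTimeLike instance; // null until the first call
    if (instance is null)
        instance = emplace!LocalTimeLike(cast(void[]) storage[]);
    return instance;
}

void main()
{
    auto a = staticInstance();
    assert(a.offsetMinutes == 60);
    assert(a is staticInstance()); // constructed only once per thread
}
```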

Andrei