December 20, 2011
On 12/20/11 9:00 AM, Denis Shelomovskij wrote:
> 16.12.2011 21:29, Andrei Alexandrescu wrote:
[snip]
> Really sorry, but it sounds silly to me. It's a minor problem. Does
> anyone really care about a 600 KiB (3.5x) size change in an empty
> program? Yes, they do, but only if there are no other size increases in
> real programs.

In my experience, in a systems programming language people do care about baseline size for one reason or another. I'd agree the reason is often overstated. But I did notice that people take a look at D and use "hello, world" size as a proxy for the language's overall overhead: runtime, handling of linking, etc. You may or may not care about the conclusions of our investigation, but we and a category of people do care, for a variety of project sizes and approaches to building them.

> Now dmd has at least a _two orders of magnitude_ file size increase. I
> posted that problem four months ago at "Building GtkD app on Win32
> results in 111 MiB file mostly from zeroes".
[snip]
> ---
> char arr[1024 * 1024 * 10];
> void main() { }
> ---
[snip]
> If the described issues aren't much more significant than "static this()",
> show me where I am wrong, please.

Using BSS is a nice optimization, but not all compilers do it and I know for a fact MSVC didn't have it for a long time. That's probably why I got used to thinking "poor style" when seeing a large statically-sized buffer with static duration.

I'd say both issues deserve to be looked at, and saying one is more significant than the other would be difficult.


Andrei
December 20, 2011
On 19.12.2011 at 20:43, Jacob Carlborg <doob@me.com> wrote:

> It could be useful for a package manager. Theoretically all installed packages could share the same dynamic library. But I would guess that the packages would depend on different versions of the library and the package manager would end up installing a whole bunch of different versions of Phobos and druntime.

No! Let's please try to get closer to something that works with package managers than the situation on Windows.

On Windows I see few applications that install libraries separately, unless they started on Linux or the libraries are established like DirectX. In the past DLLs from newly installed programs used to overwrite existing DLLs. IIRC the DLLs were then checked for their versions by installers, so they are never downgraded, but that still broke some applications with library updates that changed the API. Starting with Vista, there is the winsxs directory that - as I understand it - keeps a copy of every version of every DLL associated with the programs that installed/use them.

Package managers are close to my ideal world:
- different API versions (major revisions) can be installed in parallel
- applications link to the API version they were designed for
- bug fixes replace the old DLL for the whole system, all applications benefit
- RAM is shared between applications that use the same DLL

I'd think it would be bad to make cuts here. If you cannot even imagine an operating system with 1000 little apps like type/cat, cp/copy, sed etc. written in D, because they would all link statically against the runtime and cause major bloat, then that is turning off another few % of C users and purists. You don't drive an off-road car because you go off-road so often, but because you could imagine it. (Please buy small cars for city use.)

Linking against different library versions goes in practice like this:
There is at least one version installed, maybe libphobos2.so.1.057. The 1 would be a major revision (one where hard deprecations occur), then there is a link named libphobos2.so.1 to that file, that all applications using API version 1 link against. So the actual file can be updated to libphobos2.so.1.058 without recompiles or breakage.
December 20, 2011
On 19.12.2011 at 19:08, Walter Bright <newshound2@digitalmars.com> wrote:

> On 12/16/2011 2:55 PM, Walter Bright wrote:
>> For example, in std.datetime there's "final class Clock". It inherits nothing,
>> and nothing can be derived from it. The comments for it say it is merely a
>> namespace. It should be a struct.
>
> Or perhaps it should be in its own module.

When I first saw it I thought "That's how _Java_ goes about free functions: Make it a class." :)
December 20, 2011
On Tuesday, 20 December 2011 at 20:51:38 UTC, Marco Leise wrote:
> On 19.12.2011 at 20:43, Jacob Carlborg <doob@me.com> wrote:
> On Windows I see few applications that install libraries separately, unless they started on Linux or the libraries are established like DirectX. In the past DLLs from newly installed programs used to overwrite existing DLLs. IIRC the DLLs were then checked for their versions by installers, so they are never downgraded, but that still broke some applications with library updates that changed the API. Starting with Vista, there is the winsxs directory that - as I understand it - keeps a copy of every version of every DLL associated with the programs that installed/use them.

Minor nitpick: winsxs has been around since XP.
December 20, 2011
On 20.12.2011 at 16:00, Denis Shelomovskij <verylonglogin.reg@gmail.com> wrote:

> The second dmd issue (which was discovered because of the 99.00% of zeros) is that _it doesn't use the bss section_.
> Let's look at a C++ program built using Microsoft's cl:
> ---
> char arr[1024 * 1024 * 10];
> void main() { }
> ---
> It results in a ~10 KiB executable, because `arr` is initialized with zero bytes and put in the bss section. If one of its elements is set to non-zero:
> ---
> char arr[1024 * 1024 * 10] = { 1 };
> void main() { }
> ---
> The array can't be in .bss any more and the resulting executable grows by ~10 MiB. The following D program results in a ~10 MiB executable:
> ---
> ubyte[1024 * 1024 * 10] arr;
> void main() { }
> ---
> So, if there really is a reason not to use .bss, it should be clearly explained.
>
>
>
> If the described issues aren't much more significant than "static this()", show me where I am wrong, please.

+1. I didn't know about .bss, but static arrays of zeroes (global, struct, class) increasing the executable size looked like a problem wanting a solution. I hope it is easy to solve for dmd and is just an unimportant issue, so was never implemented.
December 20, 2011
On 12/20/2011 6:23 AM, Andrei Alexandrescu wrote:
> On 12/20/11 9:00 AM, Denis Shelomovskij wrote:
>> Now dmd has at least a _two orders of magnitude_ file size increase. I
>> posted that problem four months ago at "Building GtkD app on Win32
>> results in 111 MiB file mostly from zeroes".
> [snip]
>> ---
>> char arr[1024 * 1024 * 10];
>> void main() { }
>> ---
> [snip]
>> If the described issues aren't much more significant than "static this()",
>> show me where I am wrong, please.
>
> Using BSS is a nice optimization, but not all compilers do it and I know for a
> fact MSVC didn't have it for a long time. That's probably why I got used to
> thinking "poor style" when seeing a large statically-sized buffer with static
> duration.
>
> I'd say both issues deserve to be looked at, and saying one is more significant
> than the other would be difficult.

First off, dmd most definitely puts 0 initialized static data into the BSS segment. So what's going on here?

1. char data is not initialized to 0, it is initialized to 0xFF. Non-zero data cannot be put in BSS.

2. Static data goes, by default, into thread local storage. BSS data is not thread local. To put it in global data, it has to be declared with __gshared.

So,

__gshared byte[1024 * 1024 * 10] arr;

will go into BSS.

There is pretty much no reason to have such huge arrays in static data. Instead, dynamically allocate them.
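
A minimal sketch of the points above, assuming a reasonably recent dmd. The names `zeroedGlobal` and `makeBuffer` are made up for illustration. The static asserts check the default initializers at compile time. Whether a particular array really ends up in BSS depends on the compiler and target, so it is worth checking the size of the resulting executable rather than trusting the comments.

```d
import std.stdio : writeln;

// Default initializers: char starts as 0xFF, the integer byte types as 0.
static assert(char.init == 0xFF);
static assert(byte.init == 0);
static assert(ubyte.init == 0);

// Zero-initialized and not thread-local, so a candidate for BSS.
__gshared byte[1024 * 1024 * 10] zeroedGlobal;

// The alternative suggested above: allocate a large buffer at run time.
ubyte[] makeBuffer(size_t size)
{
    return new ubyte[](size); // elements start as ubyte.init, i.e. 0
}

void main()
{
    auto buf = makeBuffer(1024 * 1024 * 10);
    writeln(zeroedGlobal.length, " ", buf.length);
}
```

Changing the element type of `zeroedGlobal` to `char`, or dropping `__gshared`, gives the two cases described in points 1 and 2.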

December 20, 2011
On 12/20/2011 1:07 PM, Marco Leise wrote:
> +1. I didn't know about .bss, but static arrays of zeroes (global, struct,
> class) increasing the executable size looked like a problem wanting a solution.
> I hope it is easy to solve for dmd and is just an unimportant issue, so was
> never implemented.

I added a faq entry for this.
December 20, 2011
On Tuesday, 20 December 2011 at 14:01:04 UTC, Denis Shelomovskij wrote:
> Detailed description:
> GtkD is built using a single object file (gtk-one-obj.lib) or separate object files, one per source file (gtk-sep-obj.lib).
>
> Then main.d, which imports gtk.Main, is built using those libraries.
>
> Then the zeroCount utility is built and run over the resulting files:
> --------------------------------------------------
> Now let's calculate zero bytes counts:
> --------------------------------------------------
>  Zero bytes|     %|    Non-zero| Total bytes|        File
>     3628311| 21.56|    13202153|    16830464|gtk-one-obj.lib
>     1953124| 15.98|    10272924|    12226048|gtk-sep-obj.lib
>   127968798| 99.00|     1298430|   129267228|main-one-obj.exe
>      743821| 37.51|     1239183|     1983004|main-sep-obj.exe
> Done.
>
> So we have to use a very slow per-file build to produce a good (not 100 MiB) executable.
> Whichever *.exe is launched, its process allocates ~20 MiB of RAM (loaded Gtk DLLs).

I believe this is bug 2254:

http://d.puremagic.com/issues/show_bug.cgi?id=2254

The cause is the way DMD builds libraries. The old way of building libraries (using a librarian) does not create libraries that exhibit this problem when linked with an executable.
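
The source of the zeroCount utility is not shown in the thread. A rough sketch of a tool that would print rows in the same shape as the quoted table, assuming it simply counts zero bytes per file, could look like this (the function names `countZeros` and `zeroPercent` are hypothetical):

```d
import std.file : read;
import std.stdio : writefln;

// Count the bytes in `data` that are equal to zero.
size_t countZeros(const(ubyte)[] data)
{
    size_t zeros = 0;
    foreach (b; data)
        if (b == 0)
            ++zeros;
    return zeros;
}

// Percentage of zero bytes; an empty input counts as 0%.
double zeroPercent(size_t zeros, size_t total)
{
    return total == 0 ? 0.0 : 100.0 * zeros / total;
}

void main(string[] args)
{
    // One output row per file named on the command line.
    foreach (path; args[1 .. $])
    {
        auto data = cast(const(ubyte)[]) read(path);
        immutable zeros = countZeros(data);
        writefln("%12d|%6.2f|%12d|%12d|%s", zeros,
                 zeroPercent(zeros, data.length),
                 data.length - zeros, data.length, path);
    }
}
```

Running it over an executable built both ways is a quick check of whether a build is affected by the bug.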
December 20, 2011
On 12/20/11 2:58 PM, Marco Leise wrote:
> On 19.12.2011 at 19:08, Walter Bright
> <newshound2@digitalmars.com> wrote:
>
>> On 12/16/2011 2:55 PM, Walter Bright wrote:
>>> For example, in std.datetime there's "final class Clock". It inherits
>>> nothing,
>>> and nothing can be derived from it. The comments for it say it is
>>> merely a
>>> namespace. It should be a struct.
>>
>> Or perhaps it should be in its own module.
>
> When I first saw it I thought "That's how _Java_ goes about free
> functions: Make it a class." :)

Same here. If I had my way I'd rethink the name of those functions. Having a cutesy prefix "Clock." is hardly justifiable.

Andrei
December 21, 2011
On 20.12.2011 at 22:39, Walter Bright <newshound2@digitalmars.com> wrote:

> On 12/20/2011 1:07 PM, Marco Leise wrote:
>> +1. I didn't know about .bss, but static arrays of zeroes (global, struct,
>> class) increasing the executable size looked like a problem wanting a solution.
>> I hope it is easy to solve for dmd and is just an unimportant issue, so was
>> never implemented.
>
> I added a faq entry for this.

OK, I jumped on the bandwagon too early. Personally I only had this problem with classes and structs.

struct Test {
	byte[1024 * 1024 * 10] arr;
}

and

class Test {
	byte[1024 * 1024 * 10] arr;
}

both create a 10 MB executable. While for the class, init may contain more data than just that one field, I don't see the struct adding anything or going into TLS. Can these initializers also go into .bss?