December 31, 2012
> Of course your smart-array-pointer type needs to implement the ~ operator and create a copy of the array in that case. I guess the full implementation would include something that would resemble that of std::vector in C++.

std.container.Array already does that.
December 31, 2012
I think you're overthinking this way too much.

I'm writing a game engine, where latency is much more important
than in your case. Even 10ms would translate to visible jitter.
GC's not an issue, and disabling it permanently is very
counterproductive _unless_ you've exhausted all other options
(which you're unlikely to, as you can do everything you can do in
C++).  All you have to do is care about how you allocate and, if
GC seems to be an issue, profile to see _where_ the GC is being
called most and optimize those allocations.

Basic rules:

For classes with one or few instances (singletons, etc.), GC is
not an issue. For classes with hundreds to thousands of
instances, it might be an issue. Profile. For classes with more
instances, it probably is an issue. Profile, reuse instances, use
structs, manually allocate.

Arrays: To avoid reallocation on append, call
array.assumeSafeAppend() before the append (this assumes that the
array is not a slice of a bigger array that would get its data
overwritten).
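
A minimal sketch of that append pattern, assuming a buffer that is refilled repeatedly and that no other slice still needs the old contents. The helper name refill is hypothetical; assumeSafeAppend and reserve are the built-in array functions.

```d
import std.stdio : writeln;

/// Refills `buf` with 0 .. n, reusing the block it already owns.
/// Assumes no other slice still refers to the old elements.
void refill(ref int[] buf, int n)
{
    buf = buf[0 .. 0];        // keep the pointer, drop the elements
    buf.assumeSafeAppend();   // tell the runtime the tail may be overwritten
    foreach (i; 0 .. n)
        buf ~= i;             // appends in place while capacity allows
}

void main()
{
    int[] buf;
    buf.reserve(64);          // one up-front allocation
    refill(buf, 10);
    auto before = buf.ptr;
    refill(buf, 10);          // second fill reuses the same block
    writeln(buf.ptr is before);
}
```

Without the assumeSafeAppend() call, appending to the shortened slice would normally reallocate, because the runtime has to assume the old elements are still referenced elsewhere.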

If appending a lot, use std.array.Appender. If you need manually
allocated storage, use std.container.Array, or create a wrapper
around malloc/free. I'm using a templated wrapper that lets me
record how much memory and how many instances were allocated for
each type.

Destructors: Structs are RAII like in C++. And you can
easily create a malloc wrapper that will act exactly like
new/delete in C++.
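
For illustration, a rough sketch of what such a malloc wrapper with per-type bookkeeping could look like. This is not the wrapper described above; mallocNew, mallocDelete, liveCount and Particle are hypothetical names. It also assumes the struct holds no pointers into the GC heap (otherwise the block would have to be registered with GC.addRange).

```d
import core.stdc.stdlib : malloc, free;
import std.conv : emplace;

/// Number of live instances per type (hypothetical bookkeeping).
template liveCount(T)
{
    size_t liveCount;
}

/// Allocates a T with malloc and constructs it in place, like C++ `new`.
T* mallocNew(T, Args...)(auto ref Args args) if (is(T == struct))
{
    void* raw = malloc(T.sizeof);
    if (raw is null)
        assert(0, "out of memory");
    ++liveCount!T;
    return emplace(cast(T*) raw, args);
}

/// Runs the destructor and frees the memory, like C++ `delete`.
void mallocDelete(T)(T* p) if (is(T == struct))
{
    if (p is null)
        return;
    destroy(*p);   // calls ~this()
    free(p);
    --liveCount!T;
}

struct Particle
{
    float x, y;
    static int destroyed;   // counts destructor runs, for the demo only
    this(float x, float y) { this.x = x; this.y = y; }
    ~this() { ++destroyed; }
}

void main()
{
    auto p = mallocNew!Particle(1.0f, 2.0f);
    assert(liveCount!Particle == 1);
    mallocDelete(p);
    assert(liveCount!Particle == 0 && Particle.destroyed == 1);
}
```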

Classes: call destroy(instance). This calls the dtor, but doesn't
deallocate the class (GC will do that later). Not much different
from deleting a class allocated through new in C++ (which is what
you do most of the time in C++). If you absolutely need to free
the memory, create malloc/free wrappers for classes.

In my case, classes are used for polymorphic stuff, which are
usually single or few-instance objects. Structs are used for most
of the many-instance objects (e.g. components of game entities),
and are usually stored in manually allocated arrays.
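
A small sketch of the destroy(instance) rule above. Texture and its released counter are made up for the demonstration.

```d
class Texture
{
    static int released;      // counts destructor runs, for the demo only
    ~this() { ++released; }
}

void main()
{
    auto t = new Texture;
    destroy(t);               // runs ~this() now; the memory itself is
                              // reclaimed by a later GC cycle
    assert(Texture.released == 1);
}
```

After destroy the object must not be used any more, even though the reference still points at allocated memory.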

GC is _very_ useful for stuff like closures, which can greatly
simplify some code. Also, while I maintain that disabling GC
completely is a bad idea, it might be useful to disable it in a
small area of code _after_ profiling shows that GC is being
called too much there.

In my YAML parser library, I've found after profiling that 18% of
time was spent in GC. Most of that was in a small piece of code
that repeatedly allocated memory. Disabling GC before this code
and reenabling it after (scope(exit) to avoid leaking disabled
GC) resulted in GC taking ~2% of time, because many unnecessary
collects were consolidated into one after the GC was reenabled.
Memory usage didn't take a hit, because this was actually a
fairly small piece of code, not doing too many allocations per
call, just called very often.
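
The disable/re-enable pattern described above looks roughly like this. It is a sketch only: splitWords is a hypothetical stand-in for the allocation-heavy code, not the actual parser code. GC.disable and GC.enable are from core.memory; the runtime may still collect if it runs out of memory.

```d
import core.memory : GC;

/// Hypothetical hot path: many small allocations per call.
string[] splitWords(string text)
{
    GC.disable();                 // no collection cycles while we allocate
    scope (exit) GC.enable();     // re-enabled on every exit path, including throws

    string[] words;
    size_t start = 0;
    foreach (i, c; text)
    {
        if (c == ' ')
        {
            if (i > start)
                words ~= text[start .. i];
            start = i + 1;
        }
    }
    if (start < text.length)
        words ~= text[start .. $];
    return words;
}

void main()
{
    auto words = splitWords("GC is not the enemy");
    assert(words.length == 5);
}
```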
December 31, 2012
On 31.12.2012 13:14, Sven Over wrote:
>
> A smart-pointer type for arrays can easily provide slices. It keeps a
> reference to the full array (which gets destructed when the last
> reference is dropped), but addresses a subrange.

I implemented exactly this in https://github.com/Ingrater/druntime/blob/master/src/core/refcounted.d
Basically, I implemented a struct that mimics a regular D array as closely as possible, but uses reference counting under the hood.
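
To give an idea of the technique, here is a heavily simplified sketch of a reference-counted array whose slices share one counter. It is not the code from the linked file; RcSlice, create and refCount are hypothetical names, and it assumes a simple element type without destructors or GC pointers.

```d
import core.stdc.stdlib : malloc, free;

struct RcSlice(T)
{
    private size_t* refs;   // shared counter, stored in front of the payload
    private T[] view;       // subrange that this handle addresses

    /// Allocates counter and n elements in one malloc block.
    static RcSlice create(size_t n)
    {
        auto raw = cast(ubyte*) malloc(size_t.sizeof + n * T.sizeof);
        if (raw is null)
            assert(0, "out of memory");
        RcSlice r;
        r.refs = cast(size_t*) raw;
        *r.refs = 1;
        r.view = (cast(T*)(raw + size_t.sizeof))[0 .. n];
        r.view[] = T.init;
        return r;
    }

    this(this) { if (refs) ++*refs; }   // a copy adds a reference

    ~this()
    {
        // The last handle to go away frees the whole block.
        if (refs && --*refs == 0)
            free(refs);
    }

    /// A slice addresses a subrange but keeps the full array alive.
    RcSlice opSlice(size_t lo, size_t hi)
    {
        RcSlice r;
        r.refs = refs;
        r.view = view[lo .. hi];
        if (refs) ++*refs;
        return r;
    }

    ref T opIndex(size_t i) { return view[i]; }
    size_t length() const { return view.length; }
    size_t refCount() const { return refs ? *refs : 0; }
}

void main()
{
    auto a = RcSlice!int.create(8);
    a[3] = 42;
    {
        auto s = a[2 .. 5];             // shares storage with a
        assert(s[1] == 42 && a.refCount == 2);
    }
    assert(a.refCount == 1);            // s released its reference
}
```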

Kind Regards
Benjamin Thaut
December 31, 2012
On 31.12.2012 13:36, Sven Over wrote:
>
> Hi Benjamin! I've seen your druntime and thBase repos on GitHub. Very
> interesting stuff. As I have little (or to be honest: no) experience in
> D it's difficult for me to judge how much work needs to be done. Is the
> idea to replace Phobos completely?

The idea is not to replace Phobos completely. Currently it only contains the functionality that I needed for my projects and that could not be used from Phobos. std.traits, std.typetuple, and others are still used from Phobos because they don't leak (or could be made non-leaking with very small modifications).
thBase also contains a lot of functionality Phobos does not have, e.g. containers, a better XML parser/writer, templates for automatically serializing and deserializing objects to/from XML files, and much more.
Everything that I think could be useful for other projects in the future gets added to thBase.

Kind Regards
Benjamin Thaut
December 31, 2012
On Monday, 31 December 2012 at 14:05:01 UTC, Kiith-Sa wrote:
> I think you're overthinking this way too much.
> [snip]

I think it will still take a few generations of programmers until everyone is comfortable with using GC-enabled languages for systems programming scenarios.

Most people nowadays are not aware of what has already been done in academia, or have been burned by bad experiences, whatever the cause.

This situation will only improve when developers start approaching the same class of problems with open minds about how to solve them in new ways.

--
Paulo
January 01, 2013
On 25/12/2012 14:13, Sven Over wrote:
<snip>
> Also, garbage collection often tends to increase the memory footprint of
> your software.

ISTM you can't really make a like-for-like comparison, since non-GC programs vary a lot in how much they leak memory, which becomes almost a non-issue when you have GC.

I say "almost" because GC systems often aren't perfect.  For instance, a GC might mistake some sequence of four or eight bytes for a pointer and keep hold of the allocated memory it is pointing to.  Java has been known to leak memory in certain circumstances, such as not disposing a window when you've finished with it, or creating a thread but never starting it.

But there is something called packratting, which is a mistake at the code level of keeping a pointer hanging around for longer than necessary and therefore preventing whatever it's pointing to from being GC'd.

> You may be able to avoid this with some expert knowledge
> about the garbage collector, but this of course invalidates one of the
> main arguments in favour of GC, which is less complexity. I'd say that
> if you want to write memory- and CPU-efficient code, then you'll need a
> certain amount of understanding of your memory management. Be it your
> manual memory management, or the inner workings of your garbage collector.
>
> To cut things short, even after giving it a lot of thought I still feel
> uncomfortable about garbage collection.

That means you're uncomfortable about reference-counted smart pointers, because these are a form of garbage collection. :)

> I'm wondering whether it is
> viable to use the same smart pointer techniques as in C++: is there an
> implementation already of such smart pointers? Can I switch off GC in my
> D programs altogether? How much does the standard library rely on GC?
> I.e. would I have to write an alternative standard library to do this?

Good luck at getting your alternative library accepted as part of the D standard. :)

Seriously though, it's as much built-in language functionality as the standard library that relies on GC.  Any of the following will allocate memory from the GC heap:

- create anything with the 'new' keyword
- increase the length of a dynamic array
- concatenate arrays
- duplicate an array with .dup or .idup
- put data into an associative array

Which means sooner or later the garbage will need to be collected, unless the program is such that all memory allocation will be done at startup and then released only on exit.  Of course, whether the program can reasonably be written in this way depends on the nature of the program.
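
To make the list concrete, here is a small sketch that touches each of those operations once. Node and allocatingOps are hypothetical names used only for this illustration.

```d
class Node { int value; }

int[] allocatingOps()
{
    auto n = new Node;             // 'new'
    n.value = 7;

    int[] a;
    a.length = 4;                  // increasing the length of a dynamic array
    int[] b = a ~ [n.value];       // concatenation builds a new array
    int[] c = b.dup;               // .dup copies into a fresh GC block
    immutable(int)[] d = b.idup;   // so does .idup

    int[string] counts;
    counts["nodes"] = cast(int) d.length;   // inserting into an associative array
    c ~= counts["nodes"];
    return c;
}

void main()
{
    assert(allocatingOps() == [0, 0, 0, 0, 7, 5]);
}
```

Compiler releases newer than this thread also accept a @nogc function attribute, which makes the compiler reject such operations inside the marked function.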

> There is one blog post I found interesting to read in this context,
> which also underlines my worries: http://3d.benjamin-thaut.de/?p=20

An interesting set of observations.  Looking at this one:
"Calls to the druntime invariant handler are emitted in release build also and there is no way to turn them off. Even if the class does not have any invariants the invariant handler will always be called, walk the class hirarchy and generate multiple cache misses without actually doing anything."

There seem to be two bugs here: that these calls are generated even in release mode, and that they are generated even if the class has no invariants.  One of us needs to experiment with this a bit more and get these issues filed in Bugzilla.

Stewart.
January 01, 2013
On Monday, December 31, 2012 15:05:00 Kiith-Sa wrote:
> I think you're overthinking this way too much.
> 
> I'm writing a game engine, where latency is much more important than in your case. Even 10ms would translate to visible jitter. GC's not an issue, and disabling it permanently is very counterproductive _unless_ you've exhausted all other options (which you're unlikely to, as you can do everything you can do in C++).  All you have to do is care about how you allocate and, if GC seems to be an issue, profile to see _where_ the GC is being called most and optimize those allocations.
> 
> Basic rules:
[snip]

If you've got this much figured out with reasonably clear guidelines, you should consider writing a blog post or article about it.

- Jonathan M Davis
January 01, 2013
On Monday, 31 December 2012 at 12:14:22 UTC, Sven Over wrote:
> On Tuesday, 25 December 2012 at 19:23:59 UTC, Jonathan M Davis wrote:
>> There's also often no reason not to have the GC on and use it for certain stuff
>
> One thing that really freaks me out is the fact that the garbage collector pauses the whole process, i.e. all threads.
>
> In my job I'm writing backend services that power a big web site. Performance is key, as the response time of the data service in most cases directly adds to the page load time. The bare possibility that the whole service pauses for, say, 100ms is making me feel very uncomfortable.
>

I understand that. However, refcounted stuff tends to die in clusters as well, which also creates pauses.

The main issue here is clearly the GC's implementation rather than the concept of GC itself (which can be quite good at avoiding pauses if you are ready to make some other tradeoffs).

> We easily achieve the performance and reliability we need in C++, but I would love to give D a chance, as it solves many inconveniences of C++ in an elegant way. Metaprogramming and the threading model, just to name two.
>

Here is something I tried in the past with some success: use RefCounted and GC.free. Doing so, you still allocate in the GC heap, but you greatly limit the amount of garbage that the GC has to collect by itself.

Note that in some cases GC means better performance (usually when combined with immutability), so disabling it entirely doesn't seem like a good idea to me.
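
A minimal sketch of the GC.free half of that idea, assuming a scratch buffer whose lifetime is known and that nothing else references. sumOfSquares is a hypothetical example function; a reference-counted wrapper would make the same GC.free call once its last reference is dropped.

```d
import core.memory : GC;

long sumOfSquares(int n)
{
    auto scratch = new long[](n);          // allocated on the GC heap
    scope (exit) GC.free(scratch.ptr);     // handed back right away instead of
                                           // waiting for a collection cycle
    foreach (i; 0 .. n)
        scratch[i] = cast(long) i * i;

    long total = 0;
    foreach (v; scratch)
        total += v;
    return total;
}

void main()
{
    assert(sumOfSquares(4) == 14);   // 0 + 1 + 4 + 9
}
```

This is only safe when no other reference to the block survives the call; freeing memory that is still referenced leaves dangling pointers, just as in C++.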
January 01, 2013
On Monday, 31 December 2012 at 14:43:27 UTC, pjmlp wrote:
> I think it will still take a few generations of programmers until everyone is comfortable with using GC enabled languages for systems programming scenarios.
>
> Most people nowadays are not aware of what has already been done in academia, or are burned by bad experiences caused by whatever reasons.
>
> This situation will only improve when developers start approaching the same class of problems with open minds on how to solve them in new ways.
>
> --
> Paulo

D's GC has some implementation issues as well.
January 02, 2013
I'm interested in how the new LuaJIT GC ends up performing. But overall I can't say I have much hope for GC right now.

GC/D = Generally faster allocation. Has a cost associated with every living object.

C++ = Generally slower allocation, but while an object is alive there is no cost.

So as the heap grows, the GC language falls behind.

This seems to be the case in every language I've looked at that uses a GC.