January 07, 2013
Re: manual memory management
On Monday, 7 January 2013 at 17:19:25 UTC, Jonathan M Davis wrote:
> I don't think that any of the documentation or D's developers 
> have ever
> claimed that you could use the full language without the GC. 
> Quite the
> opposite in fact. There are a number of language features that 
> require the GC
> - including AAs, array concatenation, and closures.

True, there is some documentation describing that certain 
features require the use of the GC. Although I would say that the 
documentation needs to be made a lot more clear on this point. 
For example in the AA section there's no mention that the GC is 
required.

What you are saying is that while the GC is considered optional, 
it is not really optional given the language as a whole, only a 
(I assume large) subset of the language will work without the GC. 
In other words, the GC is partly optional.

I think we can do a lot better to make it more clear that the GC 
is not 100% optional, and also indicate clearly what features 
will not work without one.

> You _can_ program in D
> without the GC, but you lose features, and there's no way 
> around that. It may
> be the case that some features currently require the GC when 
> they shouldn't,
> but there are definitely features that _must_ have the GC and 
> _cannot_ be
> implemented otherwise (e.g. array concatenation and closures).

Is this a hard fact, or can there be a way to make it work? For 
example what about the custom allocator idea?

From a marketing POV, if the language can be made 100% free of 
the GC it would at least not be a deterrent to those who cannot 
accept having to use one. From a technical POV, there are 
definitely many situations where not using a GC is desirable.

--rt
January 07, 2013
Re: manual memory management
Rob T:

> What you are saying is that while the GC is considered 
> optional, it is not really optional given the language as a 
> whole, only a (I assume large) subset of the language will work 
> without the GC. In other words, the GC is partly optional.

Technical users get angry when they uncover marketing lies 
in technical documentation. It's much better to tell them the 
truth from the beginning.

Bye,
bearophile
January 07, 2013
Re: manual memory management
On Mon, Jan 07, 2013 at 11:26:02PM +0100, Rob T wrote:
> On Monday, 7 January 2013 at 17:19:25 UTC, Jonathan M Davis wrote:
[...]
> >You _can_ program in D without the GC, but you lose features, and
> >there's no way around that. It may be the case that some features
> >currently require the GC when they shouldn't, but there are
> >definitely features that _must_ have the GC and _cannot_ be
> >implemented otherwise (e.g. array concatenation and closures).
> 
> Is this a hard fact, or can there be a way to make it work? For
> example what about the custom allocator idea?

Some features of D were *designed* with a GC in mind. As Jonathan has
already said, array slicing, concatenation, etc., pretty much *require*
a GC. I don't see how else you could implement code like this:

	int[] f(int[] arr) {
		assert(arr.length >= 4);
		return arr[2..4];
	}

	int[] g(int[] arr) {
		assert(arr.length >= 2);
		return arr[0..2];
	}

	int[] h(int[] arr) {
		assert(arr.length >= 3);
		if (arr[0] > 5)
			return arr[1..3];
		else
			return arr[2..3] ~ 6;
	}

	void main() {
		int[] arr = [1,2,3,4,5,6,7,8];
		auto a1 = f(arr[1..5]);
		auto a2 = g(arr[3..$]);
		auto a3 = h(arr[0..6]);
		a2 ~= 123;

		// Exercise for the reader: write manual deallocation
		// for this code.
	}

Yes, this code *can* be rewritten to use manual allocation, but it will
be a major pain in the neck (not to mention likely to be inefficient,
due to the required overhead of tracking where each array slice went and
whether a reallocation was needed and what must be freed at the end).

Not to mention that h() makes it impossible to do static analysis in the
compiler to keep track of what's going on (it will reallocate the array
or not depending on runtime data, for example). So you're pretty much
screwed if you don't have a GC.
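
To make the h() problem concrete, here is a minimal sketch that repeats h() from the example above and checks where the returned array points. The inputs `big` and `small` are made up for illustration; the pointer checks only show that one branch returns a view into the caller's memory while the other returns fresh memory, which is exactly what a caller doing manual deallocation would have to find out at runtime.

```d
import std.stdio;

int[] h(int[] arr) {
    assert(arr.length >= 3);
    if (arr[0] > 5)
        return arr[1..3];      // a view into the caller's memory
    else
        return arr[2..3] ~ 6;  // a freshly allocated array
}

void main() {
    int[] big = [9, 2, 3, 4];    // hypothetical input taking the slice branch
    int[] small = [1, 2, 3, 4];  // hypothetical input taking the concat branch

    auto viewResult = h(big);
    auto copyResult = h(small);

    // The slice branch points into the original array...
    assert(viewResult.ptr == big.ptr + 1);
    // ...while the concatenation branch points outside of it.
    assert(copyResult.ptr < small.ptr ||
           copyResult.ptr >= small.ptr + small.length);

    writeln(viewResult, " ", copyResult);
}
```

With the GC neither caller has to care which branch ran; without it, only the second result may be freed, and only once nothing else refers to it.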

To make it possible to do without the GC at the language level, you'd
have to basically cripple most of the main selling points of D arrays,
so that they become nothing more than C arrays with fancy syntax. Along
with all the nasty caveats that made C arrays (esp. strings) so painful
to work with. In particular, h() would require manual re-implementation
and major API change (it needs to somehow return a flag of some sort to
indicate whether or not the input array was reallocated), along with all
code that calls it (check for the flag, then decide based on where a
whole bunch of other pointers are pointing whether the input array needs
to be deallocated, etc., all the usual daily routine of a C programmer's
painful life). This cannot be feasibly automated, which means it can't
be done by the compiler, which means using D doesn't really give you any
advantage here, and therefore you might as well just write it in
straight C to begin with.


> From a marketing POV, if the language can be made 100% free of the GC
> it would at least not be a deterrent to those who cannot accept having
> to use one. From a technical POV, there are definitely many situations
> where not using a GC is desirable.
[...]

I think much of the aversion to GCs is misplaced.  I used to be very
averse to GCs as well, so I totally understand where you're coming
from. I used to believe that GCs are for lazy programmers who can't be
bothered to think through their code and how to manage memory properly,
and that therefore GCs encourage sloppy coding. But then, after having
used D extensively for my personal projects, I discovered to my surprise
that having a GC actually *improved* the quality of my code -- it's much
more readable because I don't have to keep fiddling with pointers and
ownership (or worse, reference counts), and I can actually focus on how
to make the algorithms better. Not to mention the countless frustrating
hours spent chasing pointer bugs and memory leaks are all gone -- 'cos I
don't have to use pointers directly anymore.

As for performance, I have not noticed any significant performance
problems with using a GC in my D code. Now I know that there are cases
when the intermittent pause of the GC's mark-n-sweep cycle may not be
acceptable, but I suspect that 90% of applications don't even need to
care about this. Most applications won't even have any noticeable
pauses.

The most prominent case where this *does* matter is in game engines,
which must squeeze out every last drop of performance from the hardware,
no matter what. But then, when you're coding a game engine, you aren't
writing general application code per se; you're engineering a
highly-polished and meticulously-tuned codebase where all data
structures are already carefully controlled and mapped out -- IOW, you
wouldn't be using GC-dependent features of D in this code anyway. So it
shouldn't even be a problem.

The problem case comes when you have to interface this highly-optimized
core with application-level code, like in-game scripting or what-not. I
see a lot of advantages in separating out the scripting engine into a
separate process from the high-performance video/whatever-handling code,
so you can have the GC merrily doing its thing in the scripting engine
(targeted for script writers, level designers, who aren't into doing
pointer arithmetic in order to get the highest polygon rates from the
video hardware), without affecting the GC-independent core at all. So
you get the best of both worlds.

Crippling the language to cater to the 10% crowd who want to squeeze
every last drop of performance from the hardware is the wrong approach
IMO.


T

-- 
"Life is all a great joke, but only the brave ever get the point." -- Kenneth Rexroth
January 08, 2013
Re: manual memory management
Yes I can see in your example why removing the GC fully will be 
difficult to deal with.

I am not actually against the use of the GC, I was only wondering 
if it could be fully removed. I too did not at first agree with 
the GC concept, thinking the same things you mention. I still 
have to consider performance issues caused by the GC, but the 
advantage is that I can do things that before I would not even 
bother attempting because the cost was too high. The way I 
program has changed for the better, there's no doubt about it.

So if the GC cannot be removed fully, then there's no point 
trying to fully remove it, and performance issues have to be 
solved through improving the GC implementation, and also with 
better selective manual control methods.

As for the claims made that D's GC is "optional", that message is 
coming from various sources one encounters when reading about D 
for the first time.

For example:
http://www.drdobbs.com/tools/new-native-languages/232901643
"D has grown to embrace a wide range of features — optional 
memory management (garbage collection), ..."

Sure you can "optionally" disable the GC, but it means certain 
fundamental parts of the language will no longer be usable, 
leading to misconceptions that the GC is fully optional and 
everything can be made to work as before.

I know D's documentation is *not* claiming that the GC is 
optional, you get that impression from reading external sources 
instead, however it may be a good idea to counter the possible 
misconception in the FAQ.

Improved documentation will also help those who want to do 
selective manual memory management. As it is, I cannot say for 
certain what parts of the language require the use of the GC 
because the specification either leaves this information out or 
does not state it clearly enough.

--rt
January 08, 2013
Re: manual memory management
On Tue, 8 Jan 2013, Rob T wrote:

> I am not actually against the use of the GC, I was only wondering if it could
> be fully removed. I too did not at first agree with the GC concept, thinking
> the same things you mention. I still have to consider performance issues
> caused by the GC, but the advantage is that I can do things that before I
> would not even bother attempting because the cost was too high. The way I
> program has changed for the better, there's no doubt about it.

There are some issues that can rightfully be termed "caused by the GC", but 
most of the performance issues are probably better labeled "egregious use 
of short-lived allocations", which cost performance regardless of how 
memory is managed.  The key difference being that in manual management the 
impact is spread out and in periodic garbage collection it's batched up.
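
A rough sketch of the difference in application style, assuming two hypothetical helpers (`squaresFresh` and `squaresInto` are invented for this example): the first allocates a new array on every call, the second fills a buffer the caller owns, so a hot loop generates no garbage at all.

```d
// Allocates a new GC array on every call: convenient, but each call
// in a hot loop leaves garbage behind.
int[] squaresFresh(size_t n) {
    auto result = new int[n];
    foreach (i, ref x; result)
        x = cast(int)(i * i);
    return result;
}

// Fills a caller-supplied buffer instead: no allocation per call.
int[] squaresInto(int[] buf, size_t n) {
    assert(buf.length >= n);
    foreach (i, ref x; buf[0 .. n])
        x = cast(int)(i * i);
    return buf[0 .. n];
}

void main() {
    int[64] storage;  // static array, lives on the stack

    foreach (round; 0 .. 1000) {
        auto s = squaresInto(storage[], 8);  // reuses the same memory
        assert(s[7] == 49);
    }

    // Both versions compute the same values.
    assert(squaresFresh(8) == squaresInto(storage[], 8));
}
```

The second style costs less under either memory management scheme, which is the point: the allocation pattern, not the collector, is what is being paid for.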

My primary point being, blaming the GC when it's the application style 
that generates enough garbage to result in wanting to blame the GC for the 
performance cost is misplaced blame.

My 2 cents,
Brad
January 08, 2013
Re: manual memory management
On Tuesday, 8 January 2013 at 02:06:02 UTC, Brad Roberts wrote:
> On Tue, 8 Jan 2013, Rob T wrote:
>
>> I am not actually against the use of the GC, I was only 
>> wondering if it could
>> be fully removed. I too did not at first agree with the GC 
>> concept, thinking
>> the same things you mention. I still have to consider 
>> performance issues
>> caused by the GC, but the advantage is that I can do things 
>> that before I
>> would not even bother attempting because the cost was too 
>> high. The way I
>> program has changed for the better, there's no doubt about it.
>
> There are some issues that can rightfully be termed "caused by the GC", but
> most of the performance issues are probably better labeled "egregious use
> of short-lived allocations", which cost performance regardless of how
> memory is managed.  The key difference being that in manual 
> management the
> impact is spread out and in periodic garbage collection it's 
> batched up.
>
> My primary point being, blaming the GC when it's the 
> application style
> that generates enough garbage to result in wanting to blame the 
> GC for the
> performance cost is misplaced blame.
>
> My 2 cents,
> Brad

You'll also find out that D's GC is kind of slow, but this is an 
implementation issue more than a conceptual problem with the GC.
January 08, 2013
Re: manual memory management
On Tue, Jan 08, 2013 at 02:57:31AM +0100, Rob T wrote:
[...]
> So if the GC cannot be removed fully, then there's no point trying
> to fully remove it, and performance issues have to be solved through
> improving the GC implementation, and also with better selective
> manual control methods.

I know people *have* tried to use D without GC-dependent features; it
would be great if this information can be collected in one place and put
into the official docs. That way, people who are writing game engines or
real-time code know what to do, and the other 90% of coders can just
continue using D as before.


> As for the claims made that D's GC is "optional", that message is
> coming from various sources one encounters when reading about D for
> the first time.
> 
> For example:
> http://www.drdobbs.com/tools/new-native-languages/232901643
> "D has grown to embrace a wide range of features — optional memory
> management (garbage collection), ..."
> 
> Sure you can "optionally" disable the GC, but it means certain
> fundamental parts of the language will no longer be usable, leading
> to misconceptions that the GC is fully optional and everything can
> be made to work as before.

Does Dr. Dobbs allow revisions to previously published articles? If not,
the best we can do is to update our own docs to address this issue.


> I know D's documentation is *not* claiming that the GC is optional,
> you get that impression from reading external sources instead, however
> it may be a good idea to counter the possible misconception in the
> FAQ.

Yeah that's a good idea.


> Improved documentation will also help those who want to do selective
> manual memory management. As it is, I cannot say for certain what
> parts of the language require the use of the GC because the
> specification either leaves this information out or does not state it
> clearly enough.
[...]

I don't know if I know them all, but certainly the following are
GC-dependent:

- Slicing/appending arrays (which includes a number of string
 operations), .dup, .idup;
- Delegates & anything requiring access to local variables after the
 containing scope has exited;
- Built-in AA's;
- Classes (though I believe it's possible to manually manage memory for
 classes via Phobos' emplace), including exceptions (IIRC);
- std.container (IIRC Andrei was supposed to work on an allocator model
 for it so that it's usable without a GC)
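
For the class case, a minimal sketch of what manual management via emplace can look like. `Monster`, `create` and `dispose` are hypothetical names invented for this example, not Phobos functions; only `emplace`, `destroy`, `malloc` and `free` are real library calls.

```d
import core.stdc.stdlib : malloc, free;
import std.conv : emplace;

class Monster {
    int hp;
    this(int hp) { this.hp = hp; }
}

// Hypothetical helper: construct a class instance in malloc'd memory.
T create(T, Args...)(Args args) if (is(T == class)) {
    enum size = __traits(classInstanceSize, T);
    void* raw = malloc(size);
    assert(raw !is null, "out of memory");
    return emplace!T(raw[0 .. size], args);  // runs the constructor in place
}

// Hypothetical helper: run the destructor, then release the memory.
void dispose(T)(T obj) if (is(T == class)) {
    destroy(obj);
    free(cast(void*) obj);
}

void main() {
    auto m = create!Monster(42);
    assert(m.hp == 42);
    dispose(m);  // m must not be used after this point
}
```

One caveat with this sketch: if such an object holds references to GC-allocated memory, the GC does not scan the malloc'd block unless it is registered (for example with GC.addRange), so this is only safe as written for objects that own no GC data.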

AFAIK, the range-related code in Phobos has been under scrutiny to
contain no hidden allocations (hence the use of structs instead of
classes for various range constructs). So unless there are bugs,
std.range and std.algorithm should be safe to use without involving the
GC.

Static arrays are GC-free, and so are array literals (I *think*) as long
as you don't do any memory-related operation on them like appending or
.dup'ing. So strings should be still somewhat usable, though quite
limited. I don't know if std.format (including writefln & friends)
invokes the GC -- I think it does, under the hood. So writefln may not be
usable, or maybe it's just certain format strings that can't be used,
and if you're careful you may be able to pull it off without touching
the GC.
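
One way to check claims like these mechanically, assuming a compiler newer than this thread: the @nogc function attribute (added to the language later, in DMD 2.066) makes the compiler reject any GC allocation inside the function. The function name `sumFixed` below is made up for illustration.

```d
// Compiles only because nothing in the body allocates from the GC.
@nogc int sumFixed() {
    int[4] fixed = [1, 2, 3, 4];  // static array: lives on the stack
    int[] view = fixed[1 .. 3];   // slicing a static array does not allocate
    int total = 0;
    foreach (x; view)
        total += x;
    return total;
}

void main() {
    assert(sumFixed() == 5);

    // Adding either of these lines to sumFixed() is a compile error:
    //   int[] grown = view ~ 5;   // concatenation allocates
    //   int[int] aa = [1: 2];     // AA literal allocates
}
```

Note that the slice here is only valid while `fixed` is in scope; slices of heap arrays are where the GC earns its keep, because nothing else tracks how long the underlying memory must live.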

AA literals are NOT safe, though -- anything to do with built-in AA's
will involve the GC. (I have an idea that may make AA literals usable
without runtime allocation -- but CTFE is still somewhat limited right
now so my implementation doesn't quite work yet.)

But yeah, it would be nice if the official docs can indicate which
features are GC-dependent.


T

-- 
Latin's a dead language, as dead as can be; it killed off all the
Romans, and now it's killing me! -- Schoolboy
January 08, 2013
Re: manual memory management
On Tuesday, 8 January 2013 at 02:06:02 UTC, Brad Roberts wrote:
> There are some issues that can rightfully be termed "caused by the GC", but
> most of the performance issues are probably better labeled "egregious use
> of short-lived allocations", which cost performance regardless of how
> memory is managed.  The key difference being that in manual 
> management the
> impact is spread out and in periodic garbage collection it's 
> batched up.
>
> My primary point being, blaming the GC when it's the 
> application style
> that generates enough garbage to result in wanting to blame the 
> GC for the
> performance cost is misplaced blame.
>
> My 2 cents,
> Brad

There's more to it than just jerkiness caused by batching. The GC 
will do collection runs at inappropriate times, and that can 
cause slow downs well in excess of an otherwise identical 
application with manual memory management. For example, I've seen 
a 3x performance penalty caused by the GC doing collection runs at 
the wrong times. The fix required manually disabling the GC 
during certain points and re-enabling afterwards.

Inserting the 2 or 3 lines of extra code that fixed the 3x 
performance penalty was a lot easier than performing full manual 
management, but it means that you cannot sit back and expect the 
GC to always do the right thing.
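
For anyone wondering what those lines look like, here is a minimal sketch of the pattern using core.memory. The function `buildTable` is a made-up stand-in for an allocation-heavy section and is not the code from my application.

```d
import core.memory : GC;

// Hypothetical allocation-heavy section.
int[] buildTable(size_t n) {
    GC.disable();             // suppress automatic collection runs
    scope(exit) GC.enable();  // re-enable even if an exception is thrown

    int[] table;
    foreach (i; 0 .. n)
        table ~= cast(int) i;  // still allocates from the GC heap
    return table;
}

void main() {
    auto t = buildTable(1000);
    assert(t.length == 1000);

    GC.collect();  // optionally collect at a moment of our own choosing
}
```

GC.disable() does not turn off allocation, and per the druntime documentation the runtime may still collect when it has to (for example when out of memory), so this only shifts when collections happen.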

--rt
January 08, 2013
Re: manual memory management
On Monday, 7 January 2013 at 23:13:13 UTC, H. S. Teoh wrote:
> ...
>
> Crippling the language to cater to the 10% crowd who want to 
> squeeze
> every last drop of performance from the hardware is the wrong 
> approach
> IMO.
>
>
> T

Agreed.

Having used GC languages for the last decade, I think the cases 
where manual memory management is really required are very few.

Even if one is forced to do manual memory management over GC, it 
is still better to have the GC around than do everything manually.

But this is based on my experience doing business applications, 
desktop and server side or services/daemons.

Others' experience may vary.

--
Paulo
January 08, 2013
Re: manual memory management
On Monday, 7 January 2013 at 17:19:25 UTC, Jonathan M Davis wrote:
> On Monday, January 07, 2013 17:55:35 Rob T wrote:
>> On Monday, 7 January 2013 at 16:12:22 UTC, mist wrote:
>> > How is D manual memory management any worse than plain C one?
>> > Plenty of language features depend on GC but stuff that is 
>> > left
>> > can hardly be named "a lousy excuse". It lacks some 
>> > convenience
>> > and guidelines based on practical experience but it is 
>> > already
>> > as capable as some of wide-spread solutions for systems
>> > programming (C). In fact I'd be much more afraid of runtime
>> > issues when doing system stuff than GC ones.
>> 
>> I think the point being made was that built in language 
>> features
>> should not be dependent on the need for a GC because it means
>> that you cannot fully use the language without a GC present and
>> active. We can perhaps excuse the std library, but certainly 
>> not
>> the language itself, because the claim is made that D's GC is
>> fully optional.
>
> I don't think that any of the documentation or D's developers 
> have ever
> claimed that you could use the full language without the GC. 
> Quite the
> opposite in fact. There are a number of language features that 
> require the GC
> - including AAs, array concatenation, and closures. You _can_ 
> program in D
> without the GC, but you lose features, and there's no way 
> around that. It may
> be the case that some features currently require the GC when 
> they shouldn't,
> but there are definitely features that _must_ have the GC and 
> _cannot_ be
> implemented otherwise (e.g. array concatenation and closures). 
> So, if you want
> to ditch the GC completely, it comes at a cost, and AFAIK no 
> one around here
> is saying otherwise. You _can_ do it though if you really want 
> to.
>
> In general however, the best approach if you want to minimize 
> GC involvement
> is to generally use manual memory management and minimize your 
> usage of
> features that require the GC rather than try and get rid of it 
> entirely,
> because going the extra mile to remove its use completely 
> generally just isn't
> worth it. Kith-Sa posted some good advice on this just the 
> other day, and he's
> written a game engine in D:
>
> http://forum.dlang.org/post/vbsajlgotanuhmmpnspf@forum.dlang.org
>
> - Jonathan M Davis

Just speaking as a bystander, but I believe it is becoming 
apparent that a good guide to using D without the GC is required. 
We have a growing number of users who could be useful converts, 
doing things like writing game engines in it. Giving some general 
help with approaches, and warnings about what does and doesn't 
require the GC, would greatly smooth the process. Sadly I lack the 
talent to write such a guide.