June 23, 2003
I think we all agree that an expert assembler programmer can write code that is faster than anything written in any other language, and that interpreted code is mostly slower than, e.g., C code. Still, a lot of code is written in interpreted languages, and more code is written in C than in assembler.

Of course the answer is programmer productivity. Writing in C gives you almost the same power as writing in assembler, but you get a lot more done in a given time. The same holds true when comparing interpreted languages with C.

Some applications need to get more bang from the iron, and those are usually written in C or assembler in spite of the lower programmer productivity.

Since in real-world programming the speed demands don't cover the entire program, only the most essential parts are usually written in assembler.

Additionally, one could discuss endlessly the practical results of individual programmers writing assembler, C or an interpreted language. It should not be too hard to show that differences in programmers' abilities have a higher impact on program speed than the chosen language -- even if the choice is between assembler and an interpreted language. (That is, a badly written bad algorithm in assembler can't catch up with a well-written excellent algorithm in, say, Perl, Lua, or even GW-Basic.)

In light of all this, it seems marginal to discuss the merits of garbage collecting vs. explicit allocation.

What I am saying is, a good programmer with solid experience in, say, D should be able to write quite fast code in spite of GC. Also, the speed demands do not fall evenly on the program, as anyone who's done profiling knows. This lets the programmer make GC happen at suitable times and not happen when there's need for speed.

I think we all agree that it would be very nice if both D and (Pascal, C, C++, ...) were to have garbage collectors that you can use in some parts of your program, and not use in other parts of the same program. And once the science of GC advances a little, we could use several different GCs in different parts of our program.

----

The above [ :-( ] became a (too) lengthy way of saying:

If the impact of GC were 10% in speed, and if the choice of programmer meant a 20% difference in speed, shouldn't we choose and educate our staff better, instead of splitting hairs over GC/no GC?

----

Still, I wouldn't feel comfortable writing the Linux kernel in D. Or a real-time system. But I sure wish I'll see the day when something convinces me I can do either or both in D with peace of mind.


June 23, 2003
Helmut Leitner wrote:

> GC gives up knowledge about the memory blocks needed or unneeded. To regain this knowledge has a cost. That's my theoretical basis why I think that GC will always (statistically) be slower than good traditional memory management.

That depends on whether tracking the knowledge of memory blocks gains you anything useful in the given context.  With GC you don't need to explicitly track and manage memory through the life of your algorithm. Not doing unnecessary work is a fairly traditional optimisation <g>

So if you have a problem that is particularly heavy on your memory manager, it may be that GC is a valid optimisation.  Of course, you could always write a custom allocator for that specific problem, but that holds for any problem.  I don't know about you, but I don't write custom allocators that often.  Switching to an alternate GC allocator (if provided) and measuring again would be a nice option to have available.
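
As a rough illustration of what such a problem-specific allocator can look like, here is a minimal C++ sketch of a bump ("arena") allocator. The names `Arena`, `Node` and `build_and_sum` are made up for this example, and it assumes the objects placed in the arena are trivially destructible and need no more than `std::max_align_t` alignment. Allocation is a pointer increment, and everything is released at once by `reset()`.

```cpp
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical bump allocator: hands out memory from one big buffer and
// releases all of it in a single step. No per-object free, no destructors.
class Arena {
public:
    explicit Arena(std::size_t capacity)
        : buffer_(new unsigned char[capacity]), capacity_(capacity), used_(0) {}

    // Returns `size` bytes aligned to `align` (a power of two no larger than
    // alignof(std::max_align_t)), or nullptr when the arena is exhausted.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        std::size_t start = (used_ + align - 1) & ~(align - 1);
        if (start > capacity_ || size > capacity_ - start) return nullptr;
        used_ = start + size;
        return buffer_.get() + start;
    }

    // Frees every allocation at once by rewinding the bump pointer.
    void reset() { used_ = 0; }

    std::size_t used() const { return used_; }

private:
    std::unique_ptr<unsigned char[]> buffer_;
    std::size_t capacity_;
    std::size_t used_;
};

struct Node {
    int value;
    Node* next;
};

// Builds a linked list of the values 1..n inside the arena and sums it.
// The nodes are never freed one by one; the caller resets the arena instead.
long build_and_sum(Arena& arena, int n) {
    Node* head = nullptr;
    for (int i = 1; i <= n; ++i) {
        void* mem = arena.allocate(sizeof(Node), alignof(Node));
        if (mem == nullptr) break;  // arena full: stop early
        head = new (mem) Node{i, head};
    }
    long sum = 0;
    for (Node* p = head; p != nullptr; p = p->next) sum += p->value;
    return sum;
}
```

Whether something like this beats a collector or plain malloc/free for a given workload is exactly the kind of thing you would have to measure.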

How does GC work in 'D'?

[I lurk and jump on interesting-sounding threads, but don't currently have the time available to play with the language itself.  That's why it's nice when threads that wander off come back full circle to 'D' <g>]

-- 
AlisdairM
June 23, 2003
Georg Wrede wrote:

> Of course the answer is programmer productivity. Writing in C gives you almost the same power as writing in assembler, but you get a lot more done in a given time. The same holds true when comparing interpreted languages with C.
> 
> Some applications need to get more bang from the iron, and those are usually written in C or assembler in spite of the lower programmer productivity.

Productivity also plays a big part in outright speed of course, as it is easier to experiment and find the optimal algorithm in the more productive tool.  You can generally optimise a more complex form of algorithm tailored to the problem in said productive tool.

Once you have the final solution, you can probably realise an even more efficient implementation in the lower level tool.  Starting from cold, unless the high level language has a huge abstraction penalty or the problem is trivial/already well understood, you may well get a more efficient solution using the higher level language.

-- 
AlisdairM
June 23, 2003
> GC gives up knowledge about the memory blocks needed or unneeded. To regain this knowledge has a cost. That's my theoretical basis why I think that GC will always (statistically) be slower than good traditional memory management.
>
> But: I would never trust this theory.

That knowledge also has a cost - it doesn't come by itself. A traditional C memory manager doesn't just know where there are free memory blocks, it has to constantly update data structures to do so (and mostly data structures scattered over the whole of memory, which isn't exactly great in terms of cache usage etc.).
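
To make that bookkeeping concrete, here is a minimal C++ sketch of a fixed-size block pool with an intrusive free list. It is not how any particular malloc is implemented; the `FixedPool` class and its 32-byte block size are assumptions made for the example. The point is only that every allocate and every release has to touch the allocator's own data structure.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical pool of equally sized blocks. Free blocks are chained through
// a pointer stored inside the blocks themselves (an "intrusive" free list).
class FixedPool {
    // A slot is either a link in the free list or 32 bytes of user payload.
    union Slot {
        Slot* next;
        alignas(std::max_align_t) unsigned char payload[32];
    };

public:
    explicit FixedPool(std::size_t block_count) : slots_(block_count) {
        for (Slot& s : slots_) push_free(&s);
    }

    // Pops a block off the free list, or returns nullptr if none is left.
    void* allocate() {
        if (free_head_ == nullptr) return nullptr;
        Slot* s = free_head_;
        free_head_ = s->next;  // bookkeeping on every allocation
        --free_count_;
        return static_cast<void*>(s);
    }

    // Returns a block obtained from allocate() to the free list.
    void release(void* p) { push_free(static_cast<Slot*>(p)); }

    std::size_t free_count() const { return free_count_; }

private:
    void push_free(Slot* s) {
        s->next = free_head_;  // bookkeeping on every release
        free_head_ = s;
        ++free_count_;
    }

    std::vector<Slot> slots_;
    Slot* free_head_ = nullptr;
    std::size_t free_count_ = 0;
};
```

A general-purpose allocator has to do the same kind of work for blocks of arbitrary size, which is where the scattered data structures come from.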

A program using manual memory management *does* obviously perform more free operations, so its memory footprint is relatively small. A GCed program, on the other hand, tends to free memory in larger blocks, which reduces memory fragmentation and thus, in the long term, memory usage. GCed programs also usually have to worry less about memory allocation issues, which makes the code simpler and shorter overall, saving both execution time and code space.

GCs have been shown to have better locality of reference in allocated memory blocks, which again improves cache usage. And both collections and malloc/free are definitely *not* free in terms of execution time. GC tends to clump memory management costs, while malloc/free tends to spread them fairly evenly, but that by itself doesn't tell us anything about the overall performance.

So the issue is hardly clear-cut :)

-fg


June 23, 2003
> Still, I wouldn't feel comfortable writing the Linux kernel in D. Or a real-time system. But I sure wish I'll see the day when something convinces me I can do either or both in D with peace of mind.

"Non-Compacting Memory Allocation and Real-Time Garbage Collection" Mark S. Johnstone, University of Texas, 1996 http://citeseer.nj.nec.com/255424.html

It may at least convince you of the possibility (and the gritty details :) of real-time garbage collection.

-fg


June 23, 2003
Alisdair Meredith wrote:

> How does GC work in 'D'?
> 
> [I lurk and jump on interesting-sounding threads, but don't currently
> have the time available to play with the language itself.  That's why
> it's nice when threads that wander off come back full circle to 'D' <g>]

As far as I can remember from the rumors and some glancing at the source, the current GC works slowly. It's clearly a prototype and should be replaced by one improved by orders of magnitude - Walter has his plans and tons of ideas on it.

As of now, it appears to be a rather simple conservative mark-sweep collector, similar to the Boehm collector yet in some cases slower. It *does* allocate small structures of the same size together in larger blocks, though.
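
For anyone who hasn't seen one, the mark-sweep idea itself fits in a few lines. The following C++ toy is not the D runtime's collector: it is precise rather than conservative, roots are registered by hand instead of being found by scanning stacks and registers, and `Object`, `ToyHeap` and their members are names invented for this sketch.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical heap object: a mark bit plus outgoing references.
struct Object {
    bool marked = false;
    std::vector<Object*> refs;
};

class ToyHeap {
public:
    Object* allocate() {
        objects_.push_back(std::make_unique<Object>());
        return objects_.back().get();
    }

    void add_root(Object* o) { roots_.push_back(o); }
    void clear_roots() { roots_.clear(); }

    // Runs one full collection and returns how many objects were freed.
    std::size_t collect() {
        // Mark phase: flag everything reachable from the roots.
        std::vector<Object*> pending(roots_.begin(), roots_.end());
        while (!pending.empty()) {
            Object* o = pending.back();
            pending.pop_back();
            if (o == nullptr || o->marked) continue;
            o->marked = true;
            for (Object* r : o->refs) pending.push_back(r);
        }

        // Sweep phase: keep marked objects (clearing the mark for the next
        // cycle) and let everything else be destroyed.
        std::size_t before = objects_.size();
        std::vector<std::unique_ptr<Object>> live;
        for (auto& o : objects_) {
            if (o->marked) {
                o->marked = false;
                live.push_back(std::move(o));
            }
        }
        objects_ = std::move(live);
        return before - objects_.size();
    }

    std::size_t live_count() const { return objects_.size(); }

private:
    std::vector<std::unique_ptr<Object>> objects_;
    std::vector<Object*> roots_;
};
```

Note that two objects referring to each other are still collected once no root reaches them, which is something plain reference counting does not give you.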

-i.

June 23, 2003
Ilya Minkov wrote:

> As far as I can remember from the rumors and some glancing at the source, the current GC works slowly. It's clearly a prototype and should be replaced by one improved by orders of magnitude - Walter has his plans and tons of ideas on it.

I was asking more about how it is used.  Is it opt-in, opt-out or mandatory?  How does it interact with the RAII idiom, and things like std::auto_ptr?

-- 
AlisdairM
June 23, 2003
"Helmut Leitner" <helmut.leitner@chello.at> wrote in message news:3EF69401.F1EC2F81@chello.at...
> GC gives up knowledge about the memory blocks needed or unneeded. To regain this knowledge has a cost. That's my theoretical basis why I think that GC will always (statistically) be slower than good traditional memory management.

Many programs rarely ever need to free memory; GC can be a big win for them. Also, there is overhead involved in keeping track of who owns each chunk of memory.


June 23, 2003
Alisdair Meredith wrote:
> I was asking more about how it is used.  Is it opt-in, opt-out or
> mandatory?  How does it interact with the RAII idiom, and things like
> std::auto_ptr?

std::auto_ptr??? Bwaaahahahaa! You're in the wrong newsgroup!

I suggest you read:
http://www.digitalmars.com/d/garbage.html
http://www.digitalmars.com/d/memory.html

The garbage collection engine is started automatically with the application. You then just allocate objects as usual and need not take care of their deallocation, which may happen whenever the runtime decides. Destructors are called when the actual deallocation takes place. You can also deallocate objects by hand if you like, and you can temporarily disable garbage collection.

It has nothing to do with RAII at all, since RAII is concerned with allocation, not deallocation.
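
For readers who don't know the C++ idiom the question refers to: there, a resource is acquired in a constructor and released in the destructor, which runs at a known point when the owning object goes out of scope. A minimal sketch follows; the `Resource` class and the log it writes to are hypothetical, and `std::unique_ptr` stands in for the older `std::auto_ptr`.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical resource that records when it is acquired and released.
class Resource {
public:
    Resource(std::string name, std::vector<std::string>& log)
        : name_(std::move(name)), log_(log) {
        log_.push_back("acquire " + name_);
    }
    ~Resource() { log_.push_back("release " + name_); }

private:
    std::string name_;
    std::vector<std::string>& log_;
};

// Returns the order of events around a scope that owns one Resource.
std::vector<std::string> raii_demo() {
    std::vector<std::string> log;
    {
        auto r = std::make_unique<Resource>("file", log);
        log.push_back("use file");
    }  // r goes out of scope here, so ~Resource() runs at this point
    log.push_back("after scope");
    return log;
}
```

Calling `raii_demo()` yields the events in the order acquire, use, release, after scope. With a collector that runs destructors only when it actually reclaims the object, the release step has no such fixed position.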

-i.


June 23, 2003
In article <bd692p$4f2$1@digitaldaemon.com>, Georg Wrede says...
>
>Still, I wouldn't feel comfortable writing the Linux kernel in D. Or a real-time system. But I sure wish I'll see the day when something convinces me I can do either or both in D with peace of mind.
>
>

Actually, the Linux kernel does use manual garbage collection, in the sense of manual memory management.  It uses dynamic allocation (its own kmalloc rather than the C library's malloc) to build system tables and perform kernel operations.  If everything were statically allocated, it would be much more limited.  Garbage collection and memory management are important to an operating system.

Many of the tree and list functions depend on cleaning up memory, regardless of whether it is garbage collected or not.  You could argue, as with C, that a C compiler can generate better assembly than a programmer, and likewise that D could provide a better garbage collector or memory manager than a programmer would write.

For extremely complex programs, having the compiler manage the memory is much better.  The more complex architectures, such as Itanium, are much more difficult to write assembly for.  An assembly programmer would be hard pressed to write a parallel assembly program, let alone a parallel garbage collector, on the NEC Earth Simulator or an 8000-processor computer.