An interesting read: Scalable Computer Programming Languages (page 3) - D Programming Language Discussion Forum

July 28, 2003

Re: An interesting read: Scalable Computer Programming Languages

Posted by Bill Cox
in reply to Ilya Minkov

Hi, Ilya.

Ilya Minkov wrote:
...

> In GC-enabled languages you definitely cannot tune it by hand - that's
> what a language compiler is supposed to do since it's a trivial thing.

Always nice to read your comments, which are always thoughtful.  However, I have to disagree with this statement.

In theory, a good GC system automates what I would build by hand otherwise.  I find that in practice, it doesn't quite get there.  GC systems of today typically don't even allocate objects of a given class in contiguous memory, which dramatically impacts cache and paging performance.

As an example of code that I wouldn't leave up to today's GCs, the inner loops of some printed circuit board and IC routers create "wave" objects to find minimum paths through mazes.  Once a minimum path has been found, all the waves get deleted at once.  Creation and deletion of waves can easily dominate a router's run-time if left up to the GC.  In practice, waves can be allocated in an array up front, and a wave creation can simply be an auto-increment on a global variable.  Deleting all the waves is done simply by assigning 0 to the global variable.
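The wave pattern described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: the `Wave` fields and the `WavePool` name are hypothetical, and the counter is kept as a class member rather than the global variable mentioned in the post.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical wave record; the fields are illustrative, not from a real router.
struct Wave {
    int x;
    int y;
    int cost;
};

// Fixed-capacity pool: creation bumps an index, bulk deletion resets it to 0.
class WavePool {
public:
    explicit WavePool(std::size_t capacity) : waves_(capacity), used_(0) {}

    // Returns a pointer to the next free slot, or nullptr when the pool is full.
    Wave* create(int x, int y, int cost) {
        if (used_ == waves_.size()) return nullptr;
        Wave* w = &waves_[used_++];
        w->x = x;
        w->y = y;
        w->cost = cost;
        return w;
    }

    // "Deletes" every wave at once; no per-object work is done.
    void reset() { used_ = 0; }

    // Number of waves handed out since the last reset.
    std::size_t live() const { return used_; }

private:
    std::vector<Wave> waves_;  // allocated once, up front
    std::size_t used_;         // the auto-incremented counter
};
```

A router would call `create` inside its search loop and `reset` once a minimum path has been found; pointers obtained before a `reset` must not be used afterwards.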

Of course, this can be done in D, and GC doesn't get in the way.

I also don't think that it's trivial to do memory management well.  I look forward to the day that it's done so well that I can stop overriding GC.  At a minimum, the compiler needs to do a global analysis of my code, and preferably take some run-time statistics, before making choices for memory layout.  Then, it could also do cool automatic optimizations like inserting unions into my classes for fields that have non-overlapping lifetimes, which would be a maintenance nightmare to do by hand.  It could factor out the most frequently used fields of objects accessed in inner loops, and put those fields densely together in memory to enhance cache performance.  It might even insert the 'delete' statements for some classes automatically to eliminate GC overhead.  For infrequently accessed classes, it could ignore byte alignment of objects and pack them more densely.
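One of the optimizations listed above, factoring out frequently used fields and packing them densely, is sketched by hand below in C++. It is a minimal illustration, assuming a hypothetical `Nets` collection in which `cost` is the hot field and `name` is the cold one; none of these names come from the thread.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical collection of "net" objects, split by access frequency.
struct Nets {
    std::vector<int> cost;          // hot field: read in the inner loop, densely packed
    std::vector<std::string> name;  // cold field: rarely touched, stored separately

    // Adds one object and returns its index, which acts as the object handle.
    std::size_t add(int c, const std::string& n) {
        cost.push_back(c);
        name.push_back(n);
        return cost.size() - 1;
    }
};

// The inner loop walks only the dense `cost` array and never touches `name`.
inline long total_cost(const Nets& nets) {
    long sum = 0;
    for (int c : nets.cost) sum += c;
    return sum;
}
```

Doing this by hand changes every access site from `obj.cost` to `nets.cost[i]`, which is the maintenance burden the post would like the compiler to take over.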

Languages like C++ are already broken in this regard, since the exact layout of memory for each class is visible to the coder, and specified in the standard, and not the same on all machines.  D does a much better job of memory abstraction, which makes advanced optimizations possible.

Bill

July 28, 2003

Re: An interesting read: Scalable Computer Programming Languages

Posted by Ilya Minkov
in reply to Bill Cox

Hello.

Bill Cox wrote:
> Always nice to read your comments, which are always thoughtful. 

Thanks, but how'd I deserve that? :)

>  However, I have to disagree with this statement.

What I meant there was selecting one of the 3 allocation options in a standard Boehm GC, which is BTW used not only for its original purpose, but also by many runtime environments for GC-centric languages. I believe the MONO and DOTGNU .NET-compatible runtimes do, Sather does, and I bet lots of others do too.

Now, selecting one of its 3 most usual allocation options by object type is trivial, and would already do performance a lot of good. That's also what is going to be implemented in D, sort of, just better.

> In theory, a good GC system automates what I would build by hand otherwise.  I find that in practice, it doesn't quite get there.  GC systems of today typically don't even allocate objects of a given class in contiguous memory, which dramatically impacts cache and paging performance.

They allocate same-sized objects in contiguous memory. That's fairly close, but could be better.

> As an example of code that I wouldn't leave up to today's GCs, the inner loop of some printed circuit board and IC routers create "wave" objects to find minimum paths through mazes.  Once a minimum path has been found, all the waves get deleted at once.  Creation and deletion of waves can easily dominate a router's run-time if left up to the GC.  In practice, waves can be allocated in an array up front, and a wave creation can simply be an auto-increment on a global variable.  Deleting all the waves is done simply by assigning 0 to the global variable.

That's a pattern which definitely doesn't require GC. C++ is very good at such things.

> Of course, this can be done in D, and GC doesn't get in the way.

In a way it does. It still needs to scan this area, which would reduce total application performance. Or you shut it off and you're back where you began.

> I also don't think that it's trivial to do memory management well.  I look forward to the day that it's done so well that I can stop overriding GC.  
[...]

Good memory management is in no way trivial. It's just that if completely automated memory management is the goal, GC scores well compared to other currently known solutions. By contrast, a refcounter is a perfectly good aid when memory management is largely manual.

I have stumbled over an experimental GC-enabled smart pointer implementation for C++, and it turns out to be significantly slower than a refcounter. Its major purpose is to be able to collect circular data. Its implementation keeps track not only of memory blocks (like the usual GC), but also of the smart pointers themselves.  Where a refcounter would simply do an increment or a decrement and a test, it has to actually go and insert this pointer into, or delete it from, a set by traversing a binary tree! This means that the per-operation cost of such a system grows logarithmically with the number of pointers to be tracked, while that of refcounting is constant. This GC is useless beyond a very small scale. This shows that a GC is currently a very inefficient supplement to a manual memory management system, while reference counting is efficient enough. The advantage of this implementation is that it doesn't need to scan the stack, which pays off when it is used very sparingly.
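The cost difference described above can be made concrete with a small C++ sketch. It is not the implementation referred to in the post, which is not named there; `RefCounted`, `retain`, `release` and `PointerRegistry` are hypothetical names, and `std::set` merely stands in for "a set kept in a binary tree".

```cpp
#include <cstddef>
#include <set>

// Reference counting: each copy or destruction of a pointer is O(1).
struct RefCounted {
    std::size_t refs = 0;
};

inline void retain(RefCounted& o) { ++o.refs; }

// Returns true when the last reference is gone and the object may be freed.
inline bool release(RefCounted& o) { return --o.refs == 0; }

// Pointer-tracking collector: every smart pointer registers its own address
// in an ordered set, so each attach/detach costs O(log n) in the number of
// tracked pointers.
class PointerRegistry {
public:
    void attach(const void* handle) { handles_.insert(handle); }
    void detach(const void* handle) { handles_.erase(handle); }
    std::size_t tracked() const { return handles_.size(); }

private:
    std::set<const void*> handles_;  // balanced binary tree
};
```

In a design like this, the registry is what lets a collector find all live smart pointers without scanning the stack, and it is also where the extra per-pointer cost comes from.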

GC has to be used for the major part of the application to be able to match even mediocre manual memory management. It also appears that even if you generally use a conservative GC in C++, you can still exclude all the classes you manage manually from scanning. That in turn means that a half-used GC would have less of a performance impact in C++ than in D.

What you are speaking of is basically that a compiler should be able to pick out and implement the best of the memory management strategy patterns, which currently belong to manual memory management (among other things). However, we're simply not there yet with our current knowledge. :(

> Languages like C++ are already broken in this regard, since the exact layout of memory for each class is visible to the coder, and specified in the standard, and not the same on all machines.  D does a much better job of memory abstraction, which makes advanced optimizations possible.

Wait... In C++, you have to distinguish, just like in D, 2 kinds of classes: those which contain a VTable pointer and those which don't. Now, the standard doesn't specify whether the VTable pointer has to be at the beginning (which is common, though), at the end, or at any other position in a class's storage. That means that the layout of any class which has any virtual member function is implicitly not fixed and can be tuned by the compiler at will. Re-order, optimise, ... If I understood it correctly. But then again, for classes which have any predecessors, storage members have to retain the same layout among themselves as in the predecessor, as they compose a part of the class's interface. And that is for an obvious performance reason.

Currently in D we have a similar situation. By contrast, in Sather you cannot access data members directly - only through functions, which behave like D's getters/setters. These are generated automatically if one doesn't write specialised ones. Classes are completely independent of the previous generation - and in fact this means that when using a class deep in a hierarchy, the compiler doesn't have to put all of its parents in the executable. Nor is the class itself even generated completely, but only the part which is actually inferred to be invokable. Funny thing: the invokability scanner in Sather is conceptually and algorithmically very close to the reachability scanner in garbage collection. Both must be very essential things, then. :)

Another interesting feature in Sather is a separation between interfaces and concrete classes. Its translation to the D world could be a final class. A final class cannot be subclassed, but that's its strength - since calls to it can usually be made without the VTable and possibly inlined. It's just that I'm afraid that in D this would look like a kludge, while in Sather it's not even a constraint, since you can still reuse both its code and its interface. This also elegantly *implicitly* separates public from private details and leads to better designs. I believe that when doing full-program compilation, D should be able to identify final classes by itself.
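The interface/final-class split discussed above can be sketched in C++ (using the C++11 `final` specifier, which postdates this thread). `Shape` and `Square` are hypothetical names chosen for the illustration; whether a given compiler actually drops the VTable lookup or inlines the call is an optimization it is permitted, not required, to make.

```cpp
// Plays the role of the interface: only virtual signatures, no data.
struct Shape {
    virtual ~Shape() = default;
    virtual int sides() const = 0;
};

// Concrete leaf class: `final` forbids further subclassing.
struct Square final : Shape {
    int sides() const override { return 4; }
};

// The static type is a final class, so no override can exist; the compiler
// may call Square::sides directly and inline it.
inline int sides_direct(const Square& s) { return s.sides(); }

// Through the interface the dynamic type is unknown, so in general the call
// goes through the VTable.
inline int sides_virtual(const Shape& s) { return s.sides(); }
```

Both functions return the same result for a `Square`; the difference is only in how much the compiler knows at the call site.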

-i.

July 29, 2003

Re: An interesting read: Scalable Computer Programming Languages

Posted by Bill Cox
in reply to Ilya Minkov

Hi, Ilya.

...

> Now, selecting one of its 3 most usual allocation options is trivial by object type, and would already do performance a lot of good. That's what is going to be also implemented in D, sort-of, just better.

Manual specification of just a few options would be great.

> Wait... In C++, you have to distinguish, just like in D, 2 kinds of classes: those which contain a VTable pointer and those which don't. Now, the standard doesn't specify, whether the VTable pointer has to be in the beginning (which is common though), at the end, or at any other position in a class storage. That means, that a layout of any class which has any virtual member function is implicitly broken and can be tuned by the compiler at will. Re-order, optimise, ... If i understood it correctly. But then again, for classes which have any predecessors, storage members have to retain the same layout among themselves as in the predecessor, as they compose a part of the class's interface. And that for an obvious performance reason.

I think C++ inherits the field layout of structures from C.  They are defined to be located sequentially in memory in the order declared in the structure or class.  The VTable pointer is less of a problem, I think.  D defines structures to have C compatible layout, but field sequence is undefined in a class.  In fact, I think in D you could spread fields out over memory, and not have them all sequentially, which can be handy for optimizing cache performance.
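The C-style ordering mentioned above can be checked directly in C++. This is a small sketch with a hypothetical `Point3` struct; it only covers the simple case of a plain struct whose members all have the same access, which is the case where the declaration-order guarantee is least in doubt. Padding between members is still up to the implementation, so only the ordering is asserted.

```cpp
#include <cstddef>

// Hypothetical plain struct with C-compatible layout.
struct Point3 {
    int x;
    int y;
    int z;
};

// The first member sits at offset 0 and later members at increasing offsets.
static_assert(offsetof(Point3, x) == 0, "first member starts the object");
static_assert(offsetof(Point3, x) < offsetof(Point3, y), "x precedes y");
static_assert(offsetof(Point3, y) < offsetof(Point3, z), "y precedes z");

// The same check at run time, comparing member addresses within one object.
inline bool declaration_order_holds() {
    Point3 p{1, 2, 3};
    return &p.x < &p.y && &p.y < &p.z;
}
```

Because the program can observe these offsets, a C++ compiler cannot silently reorder or scatter such fields, which is the constraint the post contrasts with D classes.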

> Currently in D we have a similar situation. To the contrary, in Sather you cannot access data members directly - only through functions, which behave like D's getters/setters. These are generated automatically if one doesn't write specialised ones. Classes are completely independent of the previous generation - and in fact this means that when using a class deep in hierarchy, it doesn't have to put all of its parents in the executable. Nor even it is generated completely, but only to a part which is actually inferred to be invokable. Funny thing: invokability scanner in Sather is conceptually and algorithmically very close to reachability scanner in garbage collection. It must be both very essential things then. :)
> 
> Another interesting feature in Sather is a separation between interfaces and concrete classes. Its transition to a D world could be a final class. A final class cannot be subclassed, but that's its strength - since usually calls to it can be made without VTable and possibly inlined. Just that i'm afraid that in D this would look like a kludge, while in Sather it's not even a constraint, since you can still reuse both its code and its interface. This also elegantly *implicitly* separates public from private details and leads to better designs. I believe that when doing full-program compilation, D shall be able to identify final classes by itself.
> 
> -i.

Sather does seem to get a lot right.  It's too bad Sather didn't use C-like syntax.  I think that hurt the language's popularity.

Bill

July 29, 2003

Re: An interesting read: Scalable Computer Programming Languages

Posted by Sean L. Palmer
in reply to Bill Cox

"Bill Cox" <bill@viasic.com> wrote in message news:3F2685D2.2040704@viasic.com...
> Sather does seem to get a lot right.  It's too bad Sather didn't use C-like syntax.  I think that hurt the language's popularity.

You're right.

I followed the link to Sather someone (probably Mark Evans) gave me with great interest.  I was astounded at each feature I read off the feature list, and each answer in the FAQ made me shiver with anticipation of using this terrific language.  I was practically drooling by the time I got to the "sample code" section of the website, thinking surely I had finally found the language for me.

And the syntax made me immediately want to puke.

I haven't even bothered to try it.  ;)

It's not so bad that they used wordy syntax like Pascal, because I used to program Pascal.  I can deal with it, but I wouldn't like it.  But they *require* capitalization on some keywords, and force the way you capitalize idents.  That's just wrong.  That's such a hideous basic flaw, running so contrary to my way of programming, I could never convert.

Sean

July 29, 2003

Re: An interesting read: Scalable Computer Programming Languages

Posted by Ilya Minkov
in reply to Sean L. Palmer

Sean L. Palmer wrote:
> It's not so bad that they used wordy syntax like Pascal, because I used to
> program Pascal.  I can deal with it, but I wouldn't like it.  But they
> *require* capitalization on some keywords, and force the way you capitalize
> idents.  That's just wrong.  That's such a hideous basic flaw, running so
> contrary to my way of programming, I could never convert.

I got used to the aesthetics very fast.
 * Almost all identifiers should be in a legible all_lowercase_letters style. Look, it reads almost like normal text! ThisIsMuchLessReadable. HowDidIWriteThatAbbreviation is a common problem with mixed-case and case-sensitive syntax - if you have an abbreviation like DOS, do you write it capitalised or in all-capitals? Try it with a few abbreviations, and you may find some you want to write in one style, and some in the other.
 * Class names are the only ones required to be UPPERCASE. So, types are visually easy to track in the code. Surprisingly even to me, this doesn't look ugly, unlike uppercase C macros, which do. You don't type them in each line anyway. You can even use type inference within functions if you're really sick of typing. :)
 * All keywords are lowercase.
 * Operators are somewhat funny:
	~ logical not
	/= inequality
 * The single frankly not very beautiful thing is that iterators are named with an exclamation mark, and possibly that you need to specify out parameters again when calling functions. Well, it goes in line with Sather's explicit style.

That's in no way a language flaw. Personally, I find mixed-case identifiers which begin with a lowercase letter, like they are used in Java and here, awfully disgusting. This is so much unlike a natural language...

I'm by far not the only person who strongly dislikes C syntax.

http://www.csse.monash.edu.au/~damian/papers/PDF/ModestProposal.pdf

-i.

August 07, 2003

Re: An interesting read: Scalable Computer Programming Languages

Posted by Ilya Minkov
in reply to Bill Cox

Hello.

It took me a long time to remember that I still wanted to answer this one. :)

Bill Cox wrote:

> Manual specification of just a few options would be great.

That's what I also thought of... But it's a complex problem. Where do
you want to specify, say, "nogc" -- for memory areas which are neither
scanned nor allocated by the GC? Either at the instantiation site, or in
the class itself... The problem is, you have to consider that if a nogc
object is a container -- whatever it holds should also be allocated in
nogc mode, in case it is immediately placed in this container. And
the contents may in turn be containers.

Can the decision be made at compile-time? Definitely not in all cases
when nogc at instantiation is used. If specified in the class, it would
mean recursive creation of clones or subclasses for the contained types,
which would also be nogc. Or is there another, multi-context solution?

> I think C++ inherits the field layout of structures from C.  They are
> defined to be located sequentially in memory in the order declared
> in the structure or class.  The VTable pointer is less of a problem,
> I think.  D defines structures to have C compatible layout, but field
> sequence is undefined in a class.  In fact, I think in D you could
> spread fields out over memory, and not have them all sequentially,
> which can be handy for optimizing cache performance.

You could also interleave C++ classes, with some minor risk, if all
their fields are private.

> Sather does seem to get a lot right.  It's too bad Sather didn't use
> C-like syntax.  I think that hurt the language's popularity.

I couldn't care less about popularity. :) Currently even its library is
more advanced than that of D. Besides, C-like syntax is evil. :) Not that
I would ever propose to change D's syntax to Pascal all of a sudden... D
goes in a general direction which is inevitable.

What I do care about is whether a project is alive or not. And in the
case of Sather, it has been dead since mid-2002.

-i.

August 17, 2003

Re: An interesting read: Scalable Computer Programming Languages

Posted by Walter
in reply to Achilleas Margaritis

"Achilleas Margaritis" <axilmar@b-online.gr> wrote in message news:bg0g9d$286f$1@digitaldaemon.com...
> GC is a mistake, in my opinion. I've never had memory leaks with C++, since
> I always 'delete' what I 'new'.

The trouble I've had came when interfacing to a DLL that had memory leaks, and I had no way to change that DLL.

August 17, 2003

Re: An interesting read: Scalable Computer Programming Languages

Posted by Walter
in reply to Carlos Santander B.

"Carlos Santander B." <carlos8294@msn.com> wrote in message news:bg0ili$2ahl$1@digitaldaemon.com...
> I had to build a program that solved the 8-puzzle in either C, C++ or Java.
> I started in D and had it done quite quickly, but the teacher wouldn't accept it (lisp freak, he said: if you want to do it in another language, use lisp... I don't know that much lisp!), and since I didn't want Java, I re-did it in C++. While in D I could do it in a weekend because of its simplicity and power, in C++ it took me a whole week because of it being way
> too complex.

Can you email me both versions? This sounds like a great example!

August 17, 2003

Re: An interesting read: Scalable Computer Programming Languages

Posted by Walter
in reply to Sean L. Palmer

"Sean L. Palmer" <palmer.sean@verizon.net> wrote in message news:bg17pj$30u1$1@digitaldaemon.com...
> Manual memory management is what makes programming in C++ such a PITA.

I use a garbage collector with C++ now, too <g>.

August 17, 2003

Re: An interesting read: Scalable Computer Programming Languages

Posted by Walter
in reply to Achilleas Margaritis

Check out the book on garbage collection in www.digitalmars.com/bibliography.html


Copyright © 1999-2021 by the D Language Foundation