July 27, 2003
And since we are talking about libraries, D will make it only if it has standard libraries for gui, networking, database etc. Otherwise, it will be like C++: a technically superior language, but not the language of choice.



July 27, 2003
"Achilleas Margaritis" <axilmar@b-online.gr> wrote in message
news:bg0g9d$286f$1@digitaldaemon.com...
|
| All the other languages are well on a theoretical basis. The only problem I
| see with C++ is the lack of standard (and free!!!) libraries across
| different operating systems, especially for the UI.
|

I'm sorry if I stick my nose where I shouldn't, but...

I had to build a program that solved the 8-puzzle in either C, C++ or Java. I started in D and had it done quite quickly, but the teacher wouldn't accept it (a Lisp freak, he said: "if you want to do it in another language, use Lisp"... I don't know that much Lisp!), and since I didn't want Java, I redid it in C++. While in D I could do it in a weekend because of its simplicity and power, in C++ it took me a whole week because it is way too complex.

I must say I learnt many things about C++ that I was never taught, and the program turned out really good (in DMC it works like a charm; in BCC and VC6, not so much), but sometimes I was in hell. The last problem I had was allocating memory for a char *. That can't be good.

What I'm trying to say is C++ gives you a lot of power, yes, but sometimes its complexity can be... overwhelming, let's say.

(Take all of this as coming from someone who realized this week that knowing *some* Turbo C++ 3.0 isn't knowing C++.)

————————————————————————— Carlos Santander



July 27, 2003
Good for you!

You can be the most conscientious programmer in the world, but it is still rather easy to get into a situation where it is not clear where or when the object should be deleted.

With C++/manual memory management, you get either dangling pointers (which lead to double deletions or access to invalid objects) or memory leaks. It takes either very simple program logic or a superhuman effort by the programmers to avoid all the memory problems.

Just because you haven't personally run into them doesn't make the problems go away for everybody else.  It is a real issue.

The sucky thing is that the job of figuring out when it's safe to delete the objects, of keeping the "reference counts" or running a GC, or what-have-you, is a rather straightforward chore, something the computer could do for you.  This is why we all want memory management built into the language, so we don't have to worry about forgetting to delete pointers, or deleting them too early, or forgetting to clear all pointers to the deleted memory.

Manual memory management is what makes programming in C++ such a PITA.

Sean

"Achilleas Margaritis" <axilmar@b-online.gr> wrote in message news:bg0g9d$286f$1@digitaldaemon.com...
> GC is a mistake, in my opinion. I've never had memory leaks with C++, since
> I always 'delete' what I 'new'.


July 27, 2003
"Achilleas Margaritis" <axilmar@b-online.gr> a écrit dans le message news: bg0g9d$286f$1@digitaldaemon.com...

> GC is a mistake, in my opinion. I've never had memory leaks with C++, since
> I always 'delete' what I 'new'.

Is it really that simple? I don't think so! =)

> And how does the GC mark an object as unreachable ? it has to count how many pointers track it. Otherwise, it does not know how many references are
> there to it. So, it means reference counting, in reality.

*Root references* are found at compile time. At run time, the GC starts from the roots and follows references from chunk to chunk, marking each chunk it visits as reachable. Finally, the unreachable chunks are deleted. This is the "mark & sweep" algorithm.
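
A minimal sketch of the mark & sweep idea described above, in C++. The Node and Heap types, and the explicit list of roots passed to collect(), are assumptions made for illustration only; a real collector discovers its roots itself and is far more elaborate.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy heap object: outgoing references plus a mark bit.
struct Node {
    std::vector<Node*> refs;
    bool reachable = false;
};

// Toy heap that owns every node it hands out.
struct Heap {
    std::vector<Node*> objects;

    Node* allocate() {
        Node* n = new Node();
        objects.push_back(n);
        return n;
    }

    // Mark phase: follow references from one root, marking what we visit.
    static void mark(Node* root) {
        std::vector<Node*> pending{root};
        while (!pending.empty()) {
            Node* cur = pending.back();
            pending.pop_back();
            if (cur == nullptr || cur->reachable) continue;
            cur->reachable = true;
            for (Node* next : cur->refs) pending.push_back(next);
        }
    }

    // Full collection: clear marks, mark from all roots, then free every
    // node that was not reached. Returns the number of nodes freed.
    std::size_t collect(const std::vector<Node*>& roots) {
        for (Node* n : objects) n->reachable = false;
        for (Node* r : roots) mark(r);
        std::size_t freed = 0;
        std::vector<Node*> survivors;
        for (Node* n : objects) {
            if (n->reachable) {
                survivors.push_back(n);
            } else {
                delete n;
                ++freed;
            }
        }
        objects.swap(survivors);
        return freed;
    }

    ~Heap() {
        for (Node* n : objects) delete n;
    }
};
```

Note that no per-object counter is involved: an object survives only because some chain of references leads to it from a root, so two objects that reference only each other are freed together.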

> If it does not use any way of reference counting as you imply, it has first
> to reset the 'reachable' flag for every object, then scan pointers and set the 'reachable' flag for those objects that have pointers that point to
> them. And I am asking you, how is that more efficient than simple reference
> counting (which is local, i.e. only when a new pointer is created/destroyed,
> the actual reference counter integer is affected).

Some points:
- Reference counting can't handle cyclic references, and that's a *big* problem.
- A GC collection only occurs when the program runs out of memory, whereas reference counting wastes cycles on each assignment.
- Reference counting doesn't compact the heap.
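
The first point can be shown with a small C++ sketch. It uses std::shared_ptr as a stand-in for "total" reference counting; the Peer type and the live_peers counter are made up for the demonstration.

```cpp
#include <cassert>
#include <memory>

// Counts live Peer objects so that a leak becomes observable.
static int live_peers = 0;

struct Peer {
    std::shared_ptr<Peer> other;  // owning (counted) reference
    Peer() { ++live_peers; }
    ~Peer() { --live_peers; }
};

// Two peers in a chain: dropping the outside references lets both counts
// reach zero. Returns how many peers leaked (expected: none).
int make_chain_and_drop() {
    int before = live_peers;
    {
        auto a = std::make_shared<Peer>();
        auto b = std::make_shared<Peer>();
        a->other = b;
    }
    return live_peers - before;
}

// Two peers that point at each other: after the outside references are
// dropped, each still holds a count on the other, so neither is destroyed.
// Returns how many peers leaked.
int make_cycle_and_drop() {
    int before = live_peers;
    {
        auto a = std::make_shared<Peer>();
        auto b = std::make_shared<Peer>();
        a->other = b;
        b->other = a;
    }
    return live_peers - before;
}
```

In reference-counted C++ code the usual workaround is to make one direction of the cycle non-owning (for example std::weak_ptr), which is exactly the kind of manual decision a tracing collector does not need.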

[snip]
> Of course, you may say now that some call may destroy the object and leave the stack pointers dangling. And I will say to you, that it's your algorithm's fault, not the library's: since the inner call destroyed the
> object, it was not supposed to be accessed afterwards.

Well, if you fail at coding a big application in asm, that's your algorithm's fault too =). But the language and the compiler are supposed to help us write programs. C/C++ don't really help with memory management.

> So, as you can see, automated refcounting works like a breeze. And you also
> get the benefit of determinism: you know when destructors are called; and then, you can have stack objects that, when destroyed, do away with all the
> side effects (for example, a File object closes the file automatically when
> destroyed).
>
> >
> > Thus, it turns out that "total" GC is significantly less overhead than "total" reference counting.
>
> Nope, it does not, as I have demonstrated above.

I'm not convinced =)

-- Nicolas Repiquet


July 27, 2003
This is a good article by an open-minded person.  The response on D news has been entirely predictable, though - our way or the highway, one might say.

Of similar interest is the recent "hackers and painters" thread at LL-discuss where Michael Vanier and several luminaries weigh in -- Paul Graham, Todd Proebsting of MS Research, Neel Krishnaswami designer of Needle, and many others.

D news
http://www.digitalmars.com/drn-bin/wwwnews?D
LL
http://www.ai.mit.edu/~gregs/ll1-discuss-archive-html/threads.html#03074


July 27, 2003
"Mark Evans" <Mark_member@pathlink.com> wrote in message news:bg19ge$187$1@digitaldaemon.com...
>
> This is a good article by an open-minded person.  The response on D news has
> been entirely predictable, though - our way or the highway, one might say.
>

The debate has certainly been interesting though. I have learned a lot from reading the comments posted thus far.

Thanks for the links!

Andrew


July 27, 2003
Achilleas Margaritis wrote:

> But if it is a thread, it means that for every pointer that can be
>  accessed by the GC, it has to provide synchronization, which in turn means providing mutex locking for each pointer.

Nope. If the GC scans a thread, it freezes it *completely* - mutexing would
actually be intolerable performance-wise.

And please: the GC which you get with a Java VM is crap. A lot of research
has been done to improve it, but Sun doesn't seem to be interested.

Most obviously, a threaded GC is bad for single-threaded or
mostly-single-threaded environments like D or C++. However, a free
routine should be a low-priority thread.

> GC is a mistake, in my opinion. I've never had memory leaks with C++,
>  since I always 'delete' what I 'new'.

That's what good code structuring in C++ allows for. And I believe the
author of the original article (actually a rant) just understood that
recently. C++ takes a few thousand pages of reading to understand.

> But if you have to hand-tune the allocation type, it breaks the promise of "just allocate the objects you want, and forget about everything else". And this "hand-tuning" that you are talking about is
>  a tough nut to crack. For example, a lot of code goes into our Java applications for reusing the objects. Well, if I have to make such a big effort to "hand-tune", I'd better take over memory allocation and delete the objects myself.

In GC-enabled languages you definitely cannot tune it by hand - that's
what a language compiler is supposed to do, since it's a trivial thing.

In C (and maybe C++) you can consider 3 cases:
 - you allocate very ordinary storage - like a class - which contains
pointers and can contain them anywhere.
 - you allocate an array of data, which definitely doesn't have to be
scanned. It can be textures or something along these lines.
 - you allocate storage which begins with (maybe) pointers, followed by
an array with no pointers. An example of that is a struct with an
open-ended array at its end - strings and other stuff are often
implemented this way in C.

> It can't be using a stack, since a stack is a LIFO thing. Pointers can be nullified in any order. Are you saying that each 'pointer' is allocated from a special area in memory ? if it is so, what happens with member pointers ? what is their implementation in reality ? Is a
>  member pointer a pointer to a pointer in reality ? if it is so, it's
>  bad. Really bad.

ARGH!!!! I'M NOT GOING TO EXPLAIN TO YOU EACH AND EVERY BIT THAT YOU DON'T
UNDERSTAND!!! TAKE A SANE BOOK! It explains better than I do and
doesn't get impatient.

 * the usual allocation procedure is replaced by the GC's. The GC
allocates memory on the ordinary heap -- or sometimes in preallocated
buffers, but that's for performance only.
 * when allocating memory, it stores the beginning and end of each
allocated block in an efficiently searchable structure, so that additional
information can be associated with it (destructor, reachability, and so on).

So, you are dealing with completely usual pointers, just that GC needs
to be informed of every allocation.

And FYI, the program execution stack is just an array of values, which
on a 32-bit CPU are 4 bytes each. At the beginning of a program, you
can make a local integer variable and take its address. This would be
almost the beginning of the stack. Then, at some later point, preferably
deep within nested function calls, make another local variable, and take
its address as well. This range denotes an array which would be a vital
part of your application's stack. It's that simple.
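
The experiment just described can be sketched in C++ roughly as follows. The function names are made up, and the addresses are only handled as integers; how large the resulting range is depends entirely on the compiler and platform, so no particular value should be expected.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns the address (as an integer) of a local variable that lives
// 'depth' calls below the caller. The volatile local and the write after
// the recursive call are there to keep each frame real.
std::uintptr_t deep_address(int depth) {
    volatile int marker = depth;
    if (depth > 0) {
        std::uintptr_t inner = deep_address(depth - 1);
        marker = marker + 1;
        return inner;
    }
    return reinterpret_cast<std::uintptr_t>(&marker);
}

// Distance in bytes between a local near the caller and a local 'depth'
// calls deeper. Written so it works whichever way the stack grows.
std::size_t stack_span(int depth) {
    volatile int top = 0;
    std::uintptr_t hi = reinterpret_cast<std::uintptr_t>(&top);
    std::uintptr_t lo = deep_address(depth);
    return hi > lo ? hi - lo : lo - hi;
}
```

A conservative collector would treat the memory between those two addresses as an array of machine words to scan.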

Every value in this array is treated as a potential pointer. That's how
Boehm GC works; language-specific GCs can use more efficient strategies.
D's GC is currently like Boehm GC except that it's slower. Both incur
*no* runtime cost outside allocation and collection: a tiny allocation
overhead which disappears due to optimisations, and a relatively long
collection phase, which stops the application.

There also exist so-called 3 color collectors, which incur some constant
run-time slowdown, but don't have to stop an application at all. That's
a kind of GC used in OCaml.

> And how does the GC mark an object as unreachable ? it has to count how many pointers track it. Otherwise, it does not know how many references are there to it. So, it means reference counting, in reality.

IT DOESN'T DO REFERENCE COUNTING!!! YOU SEEM NOT TO READ WHAT I WRITE!
Please re-read everything, and pinpoint what you don't understand. Or
CONSULT LITERATURE.

You don't even google before you rant off!
http://www.iecc.com/gclist/GC-faq.html

GC DOESN'T MARK ANYTHING AS *UN*REACHABLE!
GC DOESN'T MARK ANY POINTERS AT ALL!!!

The GC traverses every pointer in every memory region.
Memory regions which it stumbles over are marked as *reachable*,
and the rest will be freed.

To have a starting point, stack and registers are scanned first.

There are a number of optimisations which reduce scanning overhead:
 - The following information is used:
   - A number which isn't a multiple of 4 cannot be a pointer to the
beginning of a memory range. This is enforced by the allocator.
   - There are address ranges which cannot contain anything
useful. These are sorted out very fast.
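
As a rough C++ sketch of a conservative scan with these two filters: every word is treated as a potential pointer, and anything misaligned or outside the heap's address range is rejected cheaply. The Block record, the heap bounds and the linear search are assumptions for illustration; this is not how Boehm GC or D's GC is actually implemented, and it marks only blocks hit directly by the scanned words, without following pointers stored inside them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One allocated block as a hypothetical allocator might record it.
struct Block {
    std::uintptr_t begin;  // address of the first byte
    std::uintptr_t end;    // one past the last byte
    bool reachable;
};

// Scans 'words' (for example the stack contents) and marks each block
// whose start address appears among them.
// Returns the number of blocks newly marked.
std::size_t conservative_mark(const std::vector<std::uintptr_t>& words,
                              std::vector<Block>& blocks,
                              std::uintptr_t heap_lo,
                              std::uintptr_t heap_hi) {
    std::size_t marked = 0;
    for (std::uintptr_t w : words) {
        if (w % 4 != 0) continue;                   // misaligned: not a block start
        if (w < heap_lo || w >= heap_hi) continue;  // outside the heap range
        for (Block& b : blocks) {
            if (b.begin == w && !b.reachable) {
                b.reachable = true;
                ++marked;
            }
        }
    }
    return marked;
}
```

An integer that merely happens to look like a block address keeps that block alive; that is the price of treating every value as a potential pointer.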

Besides, languages built around a GC can reduce the cost of the scan phase
by collecting additional information during the program run.
Applications become slower in total, but you get rid of these nasty pauses.

> If it does not use any way of reference counting as you imply, it has
>  first to reset the 'reachable' flag for every object, then scan pointers and set the 'reachable' flag for those objects that have pointers that point to them.

Substantially correct. It's just that these scans happen once in a while,
thus amortising the cost. It is obvious that a program which doesn't do
any memory management at all is faster than one which does - given that
it gets all the memory it needs. In the theoretical case of
unlimited memory, it's the same with a GC. The GC just waits till memory is
full, and then it kicks in to clean some up.

So, during the time in which the GC doesn't kick in, a total refcounter
would collect and discard, a great number of times over, all the
information that the GC collects just once.

> And I am asking you, how is that more efficient than simple reference
> counting (which is local, i.e. only when a new pointer is
> created/destroyed, the actual reference counter integer is affected).
> 
If refcounting is embedded into your object, it's rather efficient; if
it's done through a smart pointer, it's not - and it's very bad in terms of space.
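
For illustration, here is a sketch of what "embedded" (intrusive) refcounting might look like in C++. RefCounted, Ref and Widget are hypothetical names, and the counter is not thread-safe. The point is that the count lives inside the object, so the handle is just one pointer and no separate bookkeeping block is allocated.

```cpp
#include <cassert>

// Hypothetical base class: the counter is a member of the object itself.
class RefCounted {
public:
    void retain() { ++count_; }
    // Drops one reference; destroys the object when the last one goes.
    void release() {
        if (--count_ == 0) delete this;
    }
    int use_count() const { return count_; }

protected:
    RefCounted() : count_(0) {}
    virtual ~RefCounted() {}

private:
    int count_;
};

// Minimal handle: retains on construction/copy, releases on destruction.
template <typename T>
class Ref {
public:
    explicit Ref(T* p) : p_(p) { if (p_) p_->retain(); }
    Ref(const Ref& o) : p_(o.p_) { if (p_) p_->retain(); }
    Ref& operator=(const Ref& o) {
        if (o.p_) o.p_->retain();  // retain first: safe for self-assignment
        if (p_) p_->release();
        p_ = o.p_;
        return *this;
    }
    ~Ref() { if (p_) p_->release(); }
    T* operator->() const { return p_; }

private:
    T* p_;
};

// Example payload; the static counter only makes destruction observable.
static int widgets_alive = 0;
struct Widget : RefCounted {
    Widget() { ++widgets_alive; }
    ~Widget() { --widgets_alive; }
};
```

A non-intrusive smart pointer has to keep the count somewhere outside the object, which is where the extra space mentioned above tends to come from.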

Usually, not using reference counting is faster, and is the common and
safe practice in C++ programs. BTW, that's also what you are saying.

>> Thus, it turns out that "total" GC is significantly less overhead than "total" reference counting.
> 
> Nope, it does not, as I have demonstrated above.

Your (and common) C++ practice is fairly efficient, and is basically
manual memory management with some automated help sprinkled in. However,
D has completely automatic memory management by design, and refcounting
everything would be much slower than GCing everything. I can clearly see
that manual + refcounted memory management is the right thing for C++.

> So, as you can see, automated refcounting works like a breeze. And you also get the benefit of determinism: you know when destructors are called; and then, you can have stack objects that, when destroyed, do away with all the side effects (for example, a File object closes the file automatically when destroyed).

This is definitely a good thing.
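
The File example from the quote could look something like this in C++. It is only a sketch: the File class and write_greeting are made-up names, and the wrapper sits on top of the standard C stdio functions.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical RAII wrapper: the destructor closes the file, so the handle
// is released at a known point (scope exit), including on early return.
class File {
public:
    File(const char* path, const char* mode) : f_(std::fopen(path, mode)) {}
    ~File() {
        if (f_) std::fclose(f_);
    }
    File(const File&) = delete;  // exactly one owner per handle
    File& operator=(const File&) = delete;

    bool is_open() const { return f_ != nullptr; }
    bool write(const char* text) { return f_ && std::fputs(text, f_) >= 0; }

private:
    std::FILE* f_;
};

// Writes one line. The file is closed when 'out' goes out of scope,
// whichever return statement is taken.
bool write_greeting(const char* path) {
    File out(path, "w");
    if (!out.is_open()) return false;
    return out.write("hello\n");
}
```

With a tracing GC alone there is no such guarantee about *when* a finalizer runs, which is why deterministic destruction is worth having for resources other than memory.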

> If the working set is not in the cache, it means a lot of cache misses, thus a slow program. Refcounting only gives 4 bytes extra to each object. If you really want to know when to delete an object, I'll tell you the right moment: when it is no more referenced. And how do you achieve that ? with refcounting.

> As I told earlier, the trick is to use refcounting where it must be used. In other words, not for pointers allocated on the stack.

You cannot tell that to an automated memory management system. :)

> Real-life programming languages only, please. You still don't give me
>  an example of how initialization fails with aliasing.

What are you doing here? Is D a real-life language? Definitely not yet.
And Sather is also quite a good candidate. There's nothing unnatural
or complicated in it. Just amazingly simple and good
solutions. Something that may be called elegance, if you like, and which
is foreign to any C descendant. :)

I don't think I can follow you. What does initialisation have to do with
aliasing? Efficiency (combined with simplicity) is the reason why
unaliased objects exist.

> At first I thought too that ADA was similar to PASCAL. Well, it is syntactically similar, but that's about it. Its pointer usage is constrained. For example, you can do pointer arithmetic, but it is bounds-checked. You can't have pointer casting, unless it is explicitly specified as an alias on the stack.

Though these are nice, I *still* don't see how they reduce memory
leaks. There must be more to it.


> A cleverer solution would be to have automatic type insertion from the IDE: when I type
> 
> 'x = 0.0',
> 
> the IDE converts it to:
> 
> 'double x = 0.0'.
> 
> After all, it's a typing problem, right ? we are frustrated to type the things that the computer should understand by itself. But that does not have to do about what the program should be like.

Cool idea.

> Here is a little thought about Java's lack of templates, which is related to the problem of going back to the code and instantly realizing what's happening:

Argh. My Delphi practice says: don't use containers as they are! You can
subclass a container, and hide all the casts there. This is no more than
a screenful of code -- and a significant help onwards. I don't want to
justify Java, but it's a sort-of solution.
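
The same trick, sketched in C++ for illustration. UntypedList stands in for a container that only stores opaque pointers (in the spirit of Delphi's TList or a pre-generics Java list); both class names are hypothetical, and the lists do not own what they store.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for an untyped container: it stores opaque pointers only.
class UntypedList {
public:
    void add(void* item) { items_.push_back(item); }
    void* get(std::size_t i) const { return items_.at(i); }
    std::size_t size() const { return items_.size(); }

private:
    std::vector<void*> items_;
};

// Subclass that hides every cast in one place. Private inheritance keeps
// the untyped add/get out of reach, so callers only ever see std::string.
class StringList : private UntypedList {
public:
    void add(std::string* s) { UntypedList::add(s); }
    std::string* get(std::size_t i) const {
        return static_cast<std::string*>(UntypedList::get(i));
    }
    using UntypedList::size;
};
```

The cast still exists, but it is written once and is known to be correct, because nothing except std::string pointers can get into the list.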

> This is why explicit statement of types is very important. We should not mix the 'fast typing' problem with the actual programming language.

Yup. Why doesn't anyone use Java extensions - Pizza, Nice, Kiev -- they
all have templates, properties, and other things which make Java much
easier and better.

> Nope, but I don't have any memory leaks in my apps, except only when I forget to delete things. But that's my problem. It's an engineering
>  problem, not a language problem.

HAHAHAHAA! Don't you want to be free of this headache altogether? In fact,
it's such a common "engineering problem" that it's only natural to
search for a solution to it in the language. And yet: there are many
people who may be well educated, professional, and so on, but they
just can't keep track of too many things in their head. They deserve
help. And you can free up your brain for better things, which doesn't
happen with Java because it's a flawed thing.

Now scroll up and read your own quote: "I've never had memory leaks with
C++, since I always 'delete' what I 'new'"

Assembly language is very error-prone -- and yet, according to you it
would be "an engineering problem, not a language problem"! The same
applies, to a lesser extent, to C, C++, and so on.

> Nope, it isn't. C++ is the only language that cuts it for me:
> 
> 1) you always know what is happening. It is deterministic.

It's not very verbose as to what's happening. Requires some brain strain.

> 2) it has quite straightforward syntax, unlike ADA.

Straightforward syntax? Don't make me laugh.

> 3) supports generics in the best way I have seen (except D of course
> :-) ). This is very important.

Sather has generics integrated with their typesystem so tightly, that
any implementation class is a template as well.

I must confess, C++ templates really impress me anew every time.
First, sorting with the STL algorithms turns out to be 10x faster than
with C's qsort! Then I find dynamic closures, constructed using
templates, and similar wizardry -- even a complete functional
programming emulation -- something they were never intended for.
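
To show where the difference between the two comes from, here is a small C++ sketch with both calls side by side. It makes no claim about the actual speed-up, which depends on compiler and data; the helper names are made up.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// Comparison callback in the shape qsort requires: untyped arguments,
// invoked through a function pointer for every comparison.
int compare_ints(const void* a, const void* b) {
    int x = *static_cast<const int*>(a);
    int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);
}

// Sorts a copy of the input with C's qsort.
std::vector<int> sort_with_qsort(std::vector<int> v) {
    if (!v.empty()) {
        std::qsort(v.data(), v.size(), sizeof(int), compare_ints);
    }
    return v;
}

// Sorts a copy of the input with std::sort. It is a template, so the
// element type and the comparison are known at compile time and the
// compiler is free to inline the comparison.
std::vector<int> sort_with_std_sort(std::vector<int> v) {
    std::sort(v.begin(), v.end());
    return v;
}
```

Both functions produce the same ordering; any performance difference comes from how the comparison is dispatched, not from the result.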

> 4) lots of things can be automated, including memory management.

Most importantly, it is very flexible as to what you automate and what you
don't: you may use pools, GCs, refcounts, and lots of other stuff.

> 5) supports every programming technique and paradigm God knows.

C++ is very poorly suited to aspect-oriented programming. Well, like
most other languages out there. I believe only Lisp and its descendants,
as well as some very special languages are good at it.

Sather is a pure OO language, and yet it has closures, which allow
functional programming in it.

> ADA is too strict, Java sucks, Basic is good for only small projects and prototyping. I also have knowledge of ML(and Haskell), although I
>  would not say that it is a programming language to build large applications with.

That's all of your language baggage? C'mon! Forget Basic, and take a
deep breath of something smarter. Java was probably a marketing gag to
help Microsoft market VB. :) ML and Haskell are very limited lab toys,
even unlike OCaml.

What do you mean with "ADA is too strict"?

I dislike Eiffel because it becomes a PITA in quite a number of situations
in which even Sather doesn't. "Exception means a broken program", silly
loop conditions, old contracts haunting you in the 10th generation, and so on.
It is so strict about safety that trying to write software which
matches its criteria becomes by itself an unsafe thing to do. :)

> All the other languages are well on a theoretical basis. The only problem I see with C++ is the lack of standard (and free!!!) libraries across different operating systems, especially for the UI.

Name any language besides Java which has them? There are wonderful
cross-platform C++ libraries, it's just that they are not standard.

And yet: not even Java is perfect in this respect. Its standard Swing
and AWT are so sluggish... On the other hand, there is IBM's SWT aka
Eclipse GUI, which is quite fast, but yet again: it's not standard...

-i.

July 28, 2003
> > I believe C++ to be a superior language to Java, though I use it a lot more so am probably biased. But I would not choose to use C++ to implement
> > an e-commerce back-end, when J2EE is so simple, ubiquitous and reliable.
>
> You would not use it because it lacks something like J2EE, not because Java
> is a better language. We have at last to differentiate between 'language', 'libraries' and 'environment'. Although C++ is a better language, it totally
> lacks the Java 'environment' and the Java 'libraries'.

Take your point here.

> > I would not use C, C++, C# or Java to write text file processing code. I use
> > Perl or Python (depending on whether I need more powerful regex or want to
> > do a bit of OO in there).
>
> Again a problem of available libraries.

But not here. Regex is part of Perl's syntax, and I struggle to imagine a way in which it could be more simply & succinctly supported in C++.


July 28, 2003
"Achilleas Margaritis" <axilmar@b-online.gr> wrote in message news:bg0gks$28j8$1@digitaldaemon.com...
> And since we are talking about libraries, D will make it only if it has standard libraries for gui, networking, database etc. Otherwise, it will be
> like C++: a technically superior language, but not the language of choice.

Agree here. I am hoping that as soon as the remaining language debates are done, everyone will refocus on making significant, efficient, flexible and easy-to-use libraries.


July 28, 2003
You will get libraries for any language that has a sufficient number of people interested in writing code in that language.  Obtain rights to the cream of the crop and call them standard.

Sean

"Matthew Wilson" <matthew@stlsoft.org> wrote in message news:bg28pl$1270$2@digitaldaemon.com...
> "Achilleas Margaritis" <axilmar@b-online.gr> wrote in message news:bg0gks$28j8$1@digitaldaemon.com...
> > And since we are talking about libraries, D will make it only if it has standard libraries for gui, networking, database etc. Otherwise, it will be
> > like C++: a technically superior language, but not the language of choice.
>
> Agree here. I am hoping that as soon as the remaining language debates are done that everyone will refocus on making significant, efficient, flexible and easy-to-use libraries