May 11, 2014
On 5/10/2014 8:58 PM, Manu via Digitalmars-d wrote:
> This is truly a niche usage case though,

Come on! Like about 80% of the programs on any linux box? Like the OCR program I run? A payroll processing program? Any scientific numerical analysis program? Engineering programs?


> If you're prepared to label the largest entertainment industry on the
> planet a niche,

I'm not doing that.


> How many libs does DMD link?

We've gone over this before. You were concerned that the libraries you linked with were incompetently written, and implied that if ARC was pervasive, they would be competently written. I can guarantee you, however, that ARC leaves plenty of opportunity for incompetence :-)


> I'm hard pressed to think of any software I see people using every day
> which isn't realtime in some sense.

Programs you "see" by definition have a user interface. But an awful lot of programs are not seen, but that doesn't mean they aren't there and aren't running. See my list above.


>> Compiler improvements to improve on ARC are the same technology as used to
>> improve GC.
>>
>> You're essentially arguing that one is easy pickings and the other is
>> impractically difficult, but they're actually about the same level.
>
> Are they? This is the first I've heard such a claim. If that's the
> case,

Yes.


> then that why is there an argument?

Because if this was an easy problem, it would have been solved. In particular, if the ARC overhead was easily removed by simple compiler enhancements, why hasn't ARC taken the world by storm? It's not like ARC was invented yesterday.


> That work just needs to be done,

That's a massive understatement. This is PhD research topic material, not something I can churn out in a week or two if only I had a more positive attitude :-)

> but by all prior
> reports I've heard, awesome GC is practically incompatible with D for
> various reasons.

There is no such thing as a GC which would satisfy your requirements.

> No matter how awesome it is, it seems conceptually
> incompatible with my environment.

My point!

> I keep saying, I'm more than happy to be proved wrong (really!), but
> until someone can, then it's unfair to dismiss my arguments so
> easily...

I believe I have answered your arguments, not dismissed them.

> and I just don't think it's that unreasonable. ARC is an
> extremely successful technology, particularly in the
> compiled/native/systems language space (OC, C++/CX,

What was dismissed is the reality pointed out many times that those systems resolve the perf problems of ARC by providing numerous means of manually escaping it, with the resulting desecration of soundness guarantees.


> Rust).

Rust is not an extremely successful technology. It's barely even been implemented.


> Is there
> actually any evidence of significant GC success in this space?
> Successes all seem to be controlled VM based languages like Java and
> C#; isolated languages with no intention to interact with existing
> native worlds. There must be good reason for that apparent separation
> in trends?

There are many techniques for mitigating GC problems in D, techniques that are not available in Java or C#. You can even do shared_ptr<> in D. You can use @nogc to guarantee the GC pause troll isn't going to pop up unexpectedly. There are a bunch of other techniques, too.
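
For instance, a minimal sketch (names are just illustrative; RefCounted is the shared_ptr<>-style wrapper from std.typecons, and @nogc is the new attribute):

    import std.typecons : RefCounted;

    struct Payload
    {
        int[] data;
        ~this() { /* deterministic cleanup runs when the count hits zero */ }
    }

    // shared_ptr<>-style handle: reference counted, deterministic destruction.
    alias Handle = RefCounted!Payload;

    // @nogc: the compiler statically rejects anything in here that could
    // allocate from the GC heap, so this code cannot trigger a collection.
    int sum(const(int)[] xs) @nogc nothrow
    {
        int total = 0;
        foreach (x; xs)
            total += x;
        return total;
    }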

ARC simply is not a magic, no problem solution one can use without careful thought in large, complex systems. (Of course, neither is GC nor any other memory management scheme.)

May 11, 2014
On 11 May 2014 14:57, Walter Bright via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> On 5/10/2014 8:58 PM, Manu via Digitalmars-d wrote:
>>
>> This is truly a niche usage case though,
>
>
> Come on! Like about 80% of the programs on any linux box? Like the OCR program I run? A payroll processing program? Any scientific numerical analysis program? Engineering programs?

Linux programs aren't at risk of being ported from C any time soon.
And if the GC is a library, shell apps probably represent the set
where it is the most convenient to make use of a GC lib.
Accounting software is UI-based software; it's just the sort where
people aren't likely to notice or complain if it stutters
occasionally.
Engineering programs are typically realtime in some way. Most
productivity software of that sort (CAD, art/design, etc.) is highly
interactive.

Tools like OCR as a shell app fall under the short-lived shell app point. OCR as a feature of an art package (like Photoshop) is, again, highly interactive productivity software.


>> How many libs does DMD link?
>
>
> We've gone over this before. You were concerned that the libraries you linked with were incompetently written, and implied that if ARC was pervasive, they would be competently written. I can guarantee you, however, that ARC leaves plenty of opportunity for incompetence :-)

It's not about incompetence, it's about incompatibility.
If I can't tolerate a collect, not only do I sacrifice extensive and
productive parts of the language, it also means a library which I have
no control over is banned from using the GC outright. That is entirely
unrealistic.

I can approach a performance hazard which manifests locally, as with ARC. I have no way to address the sort that manifests in random places, at random times, for reasons that are outside of my control, interferes with the entire application, and increases in frequency as the free memory decreases (read: as I get closer to shipping day).


>> I'm hard pressed to think of any software I see people using every day which isn't realtime in some sense.
>
>
> Programs you "see" by definition have a user interface. But an awful lot of programs are not seen, but that doesn't mean they aren't there and aren't running. See my list above.

Sure, and of the subset of those which run in a time-critical
environment, they are the best candidates to make use of GC as a lib.
As I see it, most of those apps these days are more likely to be written
in a language like Python, so it's hard to make any raw performance
argument about that category of software in general. It would be an
extremely small subset where it is a consideration.
Conversely, all realtime/UI based/user-facing software cares about
performance and stuttering.


>> then that why is there an argument?
>
>
> Because if this was an easy problem, it would have been solved. In particular, if the ARC overhead was easily removed by simple compiler enhancements, why hasn't ARC taken the world by storm? It's not like ARC was invented yesterday.

And as far as I can tell, it has, at least in this
(native/compiled/systems) space. O-C, C++/CX, Rust... where are the
counter-examples?
I don't think Java/C# is a very good comparison for D, compared to the
other native languages which truly exist on the same playing field.


>> That work just needs to be done,
>
>
> That's a massive understatement. This is PhD research topic material, not something I can churn out in a week or two if only I had a more positive attitude :-)

That's okay, I am just looking for direction. I want to see a
commitment to a path I can get behind. Not a path that, however many
years from now, I still have good reason to believe won't satisfy
my requirements.
All my energy in the meantime would be a waste in that case.


>> but by all prior
>> reports I've heard, awesome GC is practically incompatible with D for
>> various reasons.
>
>
> There is no such thing as a GC which would satisfy your requirements.

Then... the argument is finished.
You have a choice. You may choose to explore the more inclusive
technology (which also solves some other outstanding language
problems, like destructors), or you confirm D is married to GC, and I
consider my options and/or future involvement in that context.


>> No matter how awesome it is, it seems conceptually incompatible with my environment.
>
>
> My point!

My point too!


>> and I just don't think it's that unreasonable. ARC is an extremely successful technology, particularly in the compiled/native/systems language space (OC, C++/CX,
>
>
> What was dismissed is the reality pointed out many times that those systems resolve the perf problems of ARC by providing numerous means of manually escaping it, with the resulting desecration of soundness guarantees.

You've said that, but I don't think there's any hard evidence of that.
It is an option, of course, which is extremely valuable to have, but I
don't see any evidence that it is a hard requirement.
Andrei's paper which asserted that modern ARC fell within 10% of "the
fastest GC" didn't make that claim with the caveat that "extensive
unsafe escaping was required to produce these results".


>> Rust).
>
>
> Rust is not an extremely successful technology. It's barely even been implemented.

Maybe... but they probably had extensive arguments on the same issues, and their findings should surely be included among the others. It's definitely modern, and I'm sure it was considered from a modern point of view, which I think is meaningful.


>> Is there
>> actually any evidence of significant GC success in this space?
>> Successes all seem to be controlled VM based languages like Java and
>> C#; isolated languages with no intention to interact with existing
>> native worlds. There must be good reason for that apparent separation
>> in trends?
>
>
> There are many techniques for mitigating GC problems in D, techniques that are not available in Java or C#. You can even do shared_ptr<> in D. You can use @nogc to guarantee the GC pause troll isn't going to pop up unexpectedly. There are a bunch of other techniques, too.

RC is no good without compiler support. One moment you're arguing precisely this case, that useful RC requires extensive compiler support to be competitive (I completely agree), and the next you flip around and I hear the "use RefCounted!" argument again (which also has no influence on libraries I depend on).

I've argued before that @nogc has no practical effect unless you tag it on main(), and then D is as good as if it didn't have memory management at all. Non-C libraries are practically eliminated. That sounds realistic in a shell app perhaps, but not in a major software package. D's appeal depends largely on its implicit memory management, and on the convenience/correctness-oriented constructs it enables.

Both these suggestions only have any effect over my local code, which ignores the library problem again.


> ARC simply is not a magic, no problem solution one can use without careful thought in large, complex systems. (Of course, neither is GC nor any other memory management scheme.)

I have never claimed it's magic. It's **workable**. GC is apparently
not, as you admitted a few paragraphs above.
The key difference is that ARC cost is localised, which presents many
options. GC cost is unpredictable, and gets progressively worse as
environments become more and more like mine. Short of banning memory
management program-wide (absurd, it's 2014), or having such an excess
(waste) of available resources that I'm sabotaging competitive
distinction, there are no really workable options.

If it's true that ARC falls within 10% of the best GCs, surely it must be considered a serious option, especially considering we've started talking about ideas like "maybe we should make things with destructors lower to use ARC"?

Performance, it turns out, is apparently much more similar than I had
imagined, which would lead me to factor that out as a significant
consideration. Which is the more _inclusive_ option?
And unless D is capable of the 'world's fastest GC', then ARC would
apparently be a speed improvement over the current offering too.
May 11, 2014
On 5/10/14, 11:27 PM, Manu via Digitalmars-d wrote:
> On 11 May 2014 14:57, Walter Bright via Digitalmars-d
> <digitalmars-d@puremagic.com> wrote:
>> On 5/10/2014 8:58 PM, Manu via Digitalmars-d wrote:
>>>
>>> This is truly a niche usage case though,
>>
>>
>> Come on! Like about 80% of the programs on any linux box? Like the OCR
>> program I run? A payroll processing program? Any scientific numerical
>> analysis program? Engineering programs?
>
> Linux programs aren't at risk of being ported from C any time soon.

Is this seriously being aired as an argument? -- Andrei
May 11, 2014
Am 06.05.2014 05:40, schrieb Manu via Digitalmars-d:
>
> Does ~this() actually work, or just usually work?
> Do you call your destructors manually like C#?

Back when I actually used D's GC, it did usually work. There were a few workarounds because destructors would be called in the wrong thread, but other than that it did work for me.

By now I call my destructors manually like in C++. I do full manual memory management (I have a modified version of druntime that does not even contain the GC anymore). I always liked D because of its metaprogramming capabilities and not because of the automatic memory management. So I use D as a C++ with better metaprogramming. But this also means that I do not use phobos. I actually tried using only parts of phobos, but the dependency hell in phobos results in all modules being pulled in as soon as you use one central one, so I stopped using most of phobos altogether. The annoying part, as you already pointed out, is that I can't use any third-party libraries. So I'm living in my own little D world.
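
To give an idea of what that looks like in practice, a rough sketch (I'm using std.conv.emplace here for brevity even though I otherwise avoid phobos; the helper names are made up):

    import core.stdc.stdlib : free, malloc;
    import std.conv : emplace;

    class Widget
    {
        int[] data;
        this(size_t n) { data = (cast(int*) malloc(n * int.sizeof))[0 .. n]; }
        ~this() { free(data.ptr); }
    }

    // Allocate a class instance without the GC.
    T manualNew(T, Args...)(Args args) if (is(T == class))
    {
        enum size = __traits(classInstanceSize, T);
        return emplace!T(malloc(size)[0 .. size], args);
    }

    // Explicitly run the destructor and free the memory, like a C++ delete.
    void manualDelete(T)(ref T obj) if (is(T == class))
    {
        destroy(obj);
        free(cast(void*) obj);
        obj = null;
    }

    void main()
    {
        auto w = manualNew!Widget(64);
        scope (exit) manualDelete(w);
        // ... use w ...
    }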

So I would be disappointed if classes suddenly didn't have destructors anymore. Most annoying would be the lack of compiler-generated field destructors (e.g. automatically calling the destructor of each struct in the class). I could work around that, but it would be another annoyance over C++, and as they pile up it becomes more and more viable to just go back to C++.
>
> I support the notion that if the GC isn't removed as a foundational
> feature of D, then destructors should probably be removed from D.
> That said, I really want my destructors, and would be very upset to
> see them go. So... ARC?
>

I think ARC could work, but it should be extended with some sort of ownership notation. Often a block of memory (e.g. an array of data) is exclusively owned by a single object, so it would be absolutely unnecessary to reference count that block of memory. Instead I would want something like what Rust has: borrowed pointers. (We actually already have that, "scope", but it's not defined nor implemented for anything but delegates.)
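
Roughly what I have in mind, as a purely illustrative sketch (the 'scope' here is aspirational, since today it is neither defined nor enforced for this):

    // 'pixels' is exclusively owned by Image, so only the Image object
    // itself would need a reference count, never the block it owns.
    struct Image
    {
        ubyte[] pixels;
    }

    // A borrowed view: the callee may read the data but must not keep a
    // reference beyond the call, so no inc/dec would have to be emitted.
    ulong checksum(scope const(ubyte)[] pixels)
    {
        ulong sum = 0;
        foreach (b; pixels)
            sum += b;
        return sum;
    }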

Kind Regards
Benjamin Thaut

May 11, 2014
Am 10.05.2014 19:54, schrieb Andrei Alexandrescu:
>
>> The next sentence goes on to list the advantages of RC (issues we have
>> wrestled with, like destructors), and then goes on to say the recent
>> awesome RC is within 10% of "the fastest tracing collectors".
>> Are you suggesting that D's GC is among 'the fastest tracing
>> collectors'? Is such a GC possible in D?
>
> I believe it is.
>

While it might be possible to implement a good GC in D, it would require major changes in the language and its libraries. In my opinion it would be way more work to implement a proper GC than to implement ARC.

Every state-of-the-art GC requires precise knowledge of _all_ pointers. And that's exactly what we currently don't have in D.

Just to list a few issues with implementing this in D:

1) Calling C functions would require some kind of pinning API for the GC. But functions with the C calling convention are used excessively in druntime, so a lot of unnecessary pinning would happen within druntime just because of all the C-function hacks around the module system.

2) Most modern GC algorithms require write barriers for pointers which are on the heap. This is a major problem for D, because structs can be both on the stack and on the heap. So if you call a method of a struct, the compiler cannot know if the write barriers are actually necessary (for heap-based structs) or unnecessary (for stack-based structs, e.g. inside classes). A rough sketch of what such a barrier amounts to follows this list.

3) Generating precise information about the location of pointers on the stack is going to be a major effort, because of structs, unions, exception handling and D's ability to move struct instances freely around when dealing with exceptions.

4) As soon as you want to be fully precise you cannot halt threads at random points; you will need to generate GC safepoints at which execution can be safely halted. This is going to require major effort again if it is to be done efficiently.

5) As D's strings are immutable(char)[], they cannot be allocated in a thread-local pool, because they must be freely shareable between threads, and there is currently no way to detect via the type system when an immutable(char)[] is shared with another thread. So we would need to design a system where the sharing of memory blocks is detected, so that we could properly implement thread-local GC pools. The same issue applies to __gshared.
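
To illustrate point 2, here is roughly what a (card-marking) write barrier amounts to. Everything in this sketch is made up; none of it is an existing druntime API:

    enum cardSize = 512;
    __gshared ubyte[1 << 20] cardTable;

    // What the compiler would have to emit around every pointer store that
    // might target the GC heap.
    void writeBarrier(T)(T** slot, T* newValue)
    {
        *slot = newValue;
        // Mark the card so an incremental/generational collector only has
        // to rescan dirty regions of the heap.
        cardTable[((cast(size_t) slot) / cardSize) % cardTable.length] = 1;
    }

    struct Node { Node* next; }

    void link(Node* a, Node* b)
    {
        // Whether 'a' lives on the heap or the stack is unknown here, so the
        // barrier would have to be emitted either way -- that's the cost.
        writeBarrier(&a.next, b);
    }

    void main()
    {
        auto heapNode = new Node;   // barrier genuinely needed
        Node stackNode;             // barrier is pure overhead
        link(heapNode, &stackNode);
        link(&stackNode, heapNode);
    }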

Honestly, D doesn't currently have any GC support at all. The only GC support we have is that the GC knows all threads and can halt them, and D knows which function to call to destroy class objects. So basically we have the same level of GC support as C or C++. You could just use the Boehm GC, plug it into C or C++, and have a GC that is as good as the D one.

So having a good GC in 5 years or having ARC in 1 year isn't a decision that is that hard for me.

Kind Regards
Benjamin Thaut
May 11, 2014
On 5/10/2014 11:27 PM, Manu via Digitalmars-d wrote:
>> Because if this was an easy problem, it would have been solved. In
>> particular, if the ARC overhead was easily removed by simple compiler
>> enhancements, why hasn't ARC taken the world by storm? It's not like ARC was
>> invented yesterday.
>
> And as far as I can tell, it has, at least in this
> (native/compiled/systems) space. O-C, C++/CX, Rust... where are the
> counter-examples?

Again, O-C and C++/CX ARC are not memory safe because in order to make it perform they provide unsafe escapes from it. Neither even attempts pervasive ARC.

Rust is simply not an example of proven technology.

We cannot even discuss this if we cannot agree on basic, objective facts.

> It's **workable**.

Nobody has demonstrated that pervasive ARC is both performant and memory safe.

Have you ever written some code using RC in O-C or C++/CX, and disassembled it to see what it looks like? Do you realize that every decrement must happen inside an exception handler? Now imagine that for all code that deals with pointers?
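
To make that concrete, here is a hand-written sketch of roughly what pervasive ARC turns a trivial pointer assignment into (retain/release are stand-ins, not real runtime calls):

    struct RCObject { int refs; /* payload ... */ }

    void retain(RCObject* p)  { if (p) ++p.refs; }
    void release(RCObject* p) { if (p && --p.refs == 0) { /* destroy and free */ } }

    // What the programmer writes:
    //     void assign(ref RCObject* dst, RCObject* src) { dst = src; }
    //
    // What ARC effectively compiles it to -- the decrement of the old value
    // sits in an exception-handling cleanup (scope(exit) lowers to one), and
    // that wrapper ends up around nearly every pointer manipulation:
    void assign(ref RCObject* dst, RCObject* src)
    {
        retain(src);
        auto old = dst;
        scope (exit) release(old);   // lowered to an EH cleanup handler
        dst = src;
    }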

------------- A Comment on Rust ------------------

This is based on my very incomplete knowledge of Rust, i.e. just reading a few online documents on it. If I'm wrong, please correct me.

Rust's designers apparently are well aware of the performance cost of pervasive ARC. Hence, they've added the notion of a "borrowed" pointer, which is an escape from ARC. The borrowed pointer is made memory safe by:

1. Introducing restrictions on what can be done with a borrowed pointer so the compiler can determine its lifetime. I do not know the extent of these restrictions.

2. Introducing an annotation to distinguish a borrowed pointer from an ARC pointer. If you don't use the annotation, you get pervasive ARC with all the poor performance that entails.

Implicit in borrowed pointers is that Rust did not solve the problem of having the compiler eliminate unnecessary inc/dec.


My experience with pointer annotations to improve performance is pretty compelling - almost nobody adds those annotations. They get their code to work with the default, and never get around to annotating it. This, of course, provided me with a large opportunity to kick ass in the performance dept. because I would use them, but that didn't help when I had to use other people's code.

People who have added annotations to Java have seen the same result. They can't get regular programmers to use them.

The annotations have their downsides even if you make the effort to use them. Since they are a different type from ARC pointers, you cannot have a data structure, say a tree, that contains both (without having a tag to say which one it is). They do not mix. A function taking one type of pointer cannot be called with the other type.

Worse, these effects are transitive, making a function hierarchy rather inflexible.

Are these valid concerns with Rust? I haven't written any Rust code, and I haven't heard of a whole lot of Rust code being written. So I don't know. We'll see.
May 11, 2014
On 5/11/2014 1:22 AM, Benjamin Thaut wrote:
> Honestly, D doesn't currently have any GC support at all. The only GC support
> we have is that the GC knows all threads and can halt them. And D knows which
> function to call, to destroy class objects. So basically we have the same level
> of GC support as C or C++. You could just use the Boehm GC, plug it into C or C++
> and have a GC that is as good as the D one.

This is not quite correct. The Boehm GC knows nothing about the interiors of structs.

The D one does (or at least has the capability of doing so using RTinfo). This means that the D collector can be 'mostly' precise; the imprecision would be for stack data.

The Boehm collector cannot move objects around, the D one can.

May 11, 2014
On 5/11/2014 1:22 AM, Benjamin Thaut wrote:
> 2) Most modern GC algorithms require write barriers for pointers which are on
> the heap. This is a major problem for D, because structs can be both on the
> stack and on the heap. So if you call a method of a struct, the compiler cannot
> know if the write barriers are actually necessary (for heap-based structs) or
> unnecessary (for stack-based structs, e.g. inside classes).

I know about write barriers. The thing about D is, far fewer objects are allocated on the GC heap than in languages like Java, which don't have stack allocation at all.

The write barrier would have to be there for every write through a pointer. This is justifiable where every pointer points to the GC heap, but I really do not believe it is justifiable for D, where only a minority does. D could not be performance competitive with C++ if write barriers were added.

D also cannot be performance competitive with C++ if pervasive ARC is used and memory safety is retained. Rust is attempting to solve this problem by using 'borrowed' pointers, but this is unproven technology, see my reply to Manu about it.
May 11, 2014
Am 11.05.2014 10:53, schrieb Walter Bright:
> On 5/11/2014 1:22 AM, Benjamin Thaut wrote:
>
> This is not quite correct. The Boehm GC knows nothing about the
> interiors of structs.
>
> The D one does (or at least has the capability of doing so using
> RTinfo). This means that the D collector can be 'mostly' precise, the
> imprecision would be for stack data.

Mostly precise doesn't help. It's either fully precise or being stuck with an imprecise mark & sweep. Also, D's GC doesn't use RTInfo currently. Additionally, Rainer Schuetze's implementation showed that using RTInfo with the current GC actually makes collection slower, at the price of being 'mostly' precise. So I argue that the level of GC support we have is still at the level of C/C++.

>
> The Boehm collector cannot move objects around, the D one can.
>

Oh it can? Really? I would love to see how well interfacing with any C library works if the D garbage collector actually does that. I bet GTKd would break the first second this is implemented, or any other C library that exchanges data with D for that matter. The D garbage collector cannot simply move objects around because we don't support pinning.

Also, I'm talking about what the D garbage collector currently actually does, and not what it could do. If we actually implemented ARC, it would most likely take less time than a full-blown proper GC, and we would end up with better performance than we currently have. Beating the 300% slowdown which the current GC imposes is not that hard.

If, however, we keep arguing about what a GC could do, we will be stuck with the imprecise mark & sweep forever (or at least for the next 5 years).

May 11, 2014
On 11 May 2014 17:52, Benjamin Thaut via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
> Am 06.05.2014 05:40, schrieb Manu via Digitalmars-d:
>
>> I support the notion that if the GC isn't removed as a foundational feature of D, then destructors should probably be removed from D.
>>
>> That said, I really want my destructors, and would be very upset to see them go. So... ARC?
>>
>
> I think ARC could work, but should be extended with some sort of ownership notation. Often a block of memory (e.g. an array of data) is exclusively owned by a single object. So it would be absolutely unnecessary to reference count that block of memory. Instead I would want something like Rust has, borrowed pointers. (We actually already have that, "scope", but it's not defined nor implemented for anything but delegates.)

Indeed, I also imagine that an implementation of 'scope' would allow for a really decent ARC experience. D already has some advantages over other languages, but that one would be big.