December 24, 2009
Walter Bright wrote:
>> But you need to allocate this data from a shared garbage collection, 
> 
> You do anyway.

Suppose you allocate normal data from a thread local heap, and shared data from a shared GC. We will need this anyway to get decent memory allocation and GC performance, especially because the shared GC will have a single global lock and will have to stop ALL threads in the process to scan for memory.

Now, where do you want to allocate immutable data from?

a) From the local heap: but then you can't just pass immutable data by reference to other threads. It obviously won't work, because the local GC may free the memory, even if other threads hold references to it.

b) From the shared GC: but then even only-locally used data like strings would have to be allocated from the shared GC! This would work, but performance of immutable would be godawful. No way you could do this outside of alpha versions of the language.

You could make a) work by copying the immutable data to the shared heap as soon as immutable data "escapes" a thread and may be accessed by other threads.

What will you do?

>> For large portions of data you could (at least in theory) make it _actually_ read-only by using mprotect (make the memory pages read-only).
> 
> Compile time checking is better than runtime checking.

Not if the language/compiler gets unusable.

>>> Yes, there was a recently discovered bug which enabled modifying an immutable array. This was a bug, and has been fixed. A bug does not mean the concept is broken.
>>
>> Sure, but the question is: will all those bugs ever be fixed?
> 
> Forgive me, but every month 20 to 40 bugs get fixed. You can see it in the change log. I don't understand these complaints.

Frankly, I don't understand how you think that there's no problem. Even beginners can hit dmd bugs. Some basic language features are still buggy as hell (like forward referencing). Of course only if you actually try to use them.

> 
>> Also, how much is this reliability worth if you can just cast away immutable? It's even exactly the same syntax you have to use for relatively harmless things, like casting a float to an integer.
> 
> It's not allowed in @safe functions.

"To hell with un-@safe D"?

The current cast syntax, which allows immutable to be cast away, is just a damn wide-open programmer trap that must be fixed.
December 24, 2009
Thu, 24 Dec 2009 17:41:30 +0100, grauzone wrote:

> Walter Bright wrote:

>>>> Yes, there was a recently discovered bug which enabled modifying an immutable array. This was a bug, and has been fixed. A bug does not mean the concept is broken.
>>>
>>> Sure, but the question is: will all those bugs ever be fixed?
>> 
>> Forgive me, but every month 20 to 40 bugs get fixed. You can see it in the change log. I don't understand these complaints.
> 
> Frankly, I don't understand how you think that there's no problem. Even beginners can hit dmd bugs. Some basic language features are still buggy as hell (like forward referencing). Of course only if you actually try to use them.

10-year-old forward reference errors don't matter at all since 20 to 40 *other* bugs get fixed every month.
December 24, 2009
grauzone wrote:
> What will you do?

Because of casting, there cannot be a thread-local only gc.

This does not make the gc inherently unusable. Java, for example, uses only one shared gc. It must, because Java has no concept of thread local.
December 24, 2009
Thu, 24 Dec 2009 11:59:57 -0800, Walter Bright wrote:

> grauzone wrote:
>> What will you do?
> 
> Because of casting, there cannot be a thread-local only gc.
> 
> This does not make the gc inherently unusable. Java, for example, uses only one shared gc. It must, because Java has no concept of thread local.

TLS is provided via library add-on http://java.sun.com/javase/6/docs/api/java/lang/ThreadLocal.html
December 24, 2009
retard wrote:
> 10-year-old forward reference errors don't matter at all since 20 to 40 *other* bugs get fixed every month.

Half of them show as fixed. http://d.puremagic.com/issues/show_bug.cgi?id=340
December 24, 2009
retard wrote:
> Thu, 24 Dec 2009 11:59:57 -0800, Walter Bright wrote:
> 
>> grauzone wrote:
>>> What will you do?
>> Because of casting, there cannot be a thread-local only gc.
>>
>> This does not make the gc inherently unusable. Java, for example, uses
>> only one shared gc. It must, because Java has no concept of thread
>> local.
> 
> TLS is provided via library add-on http://java.sun.com/javase/6/docs/api/java/lang/ThreadLocal.html

While Java can allocate thread local data, it has no *concept* of thread local data. Nothing at all prevents one from passing a reference to thread local data from one thread to another. Since nothing prevents this, it therefore cannot violate the Java memory model, and therefore must be supported by the gc.
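
A minimal Java sketch of that point, assuming nothing beyond the standard `java.lang.ThreadLocal` class linked above. The class name `ThreadLocalEscape` and the method `leakAndMutate` are hypothetical, chosen only for illustration. A `ThreadLocal` gives each thread its own slot, but the object stored in that slot is an ordinary heap object, and its reference can be handed to any other thread:

```java
// Hypothetical demo; ThreadLocalEscape and leakAndMutate are illustrative names.
final class ThreadLocalEscape {
    // Every thread that calls LOCAL.get() lazily receives its own StringBuilder.
    static final ThreadLocal<StringBuilder> LOCAL =
            ThreadLocal.withInitial(StringBuilder::new);

    // Returns true if a second thread managed to mutate the object stored in
    // the calling thread's slot.
    static boolean leakAndMutate() {
        final StringBuilder mine = LOCAL.get(); // reachable via this thread's slot
        mine.setLength(0);                      // start from a clean buffer
        mine.append("owner");

        Thread other = new Thread(() -> {
            // The other thread's slot holds a different object ...
            StringBuilder theirs = LOCAL.get();
            // ... but nothing stops it from using the captured reference.
            if (theirs != mine) {
                mine.append("+intruder");
            }
        });
        other.start();
        try {
            other.join(); // join() makes the other thread's write visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return mine.toString().equals("owner+intruder");
    }

    public static void main(String[] args) {
        System.out.println(leakAndMutate());
    }
}
```

Neither the compiler nor the runtime rejects the capture of `mine`, which is why a collector for Java has to assume that any object may be reachable from any thread.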
December 25, 2009
Walter Bright wrote:
> grauzone wrote:
>> What will you do?
> 
> Because of casting, there cannot be a thread-local only gc.

I think this is a very bad idea. I thought TLS by default was just the beginning of separating threads better. While it will work in the initial stages of D2, I don't think it should be final.

Also, it's bad how inter-thread communication will trigger GC runs, which will stop all threads in the process for a while. Because you have no choice but to allocate your immutable messages from the shared heap. That can't be... I must be missing some central point.

> This does not make the gc inherently unusable. Java, for example, uses only one shared gc. It must, because Java has no concept of thread local.

What does Java matter here? Java was designed twenty years ago, when multicore wasn't an issue yet. D2 is designed *now* with good multicore support in mind.

Also I'm not sure if you're right here. Java has a generational copying GC, and although I don't know Sun Java's implementation at all, I'm quite sure the younger generation uses an entirely thread local heap. In the normal case, it shouldn't need to get a single lock to allocate memory. Just an unsynchronized pointer incrementation to get the next memory block (as fast as stack allocation). If a memory block "escapes" to the older generation (the GC needs to detect this case anyway), the memory can be copied to a shared heap.
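
The allocation fast path described above can be sketched as a bump-pointer arena. This is only an illustration of the technique under stated assumptions, not Sun's actual implementation (which the post itself does not claim to know). `BumpArena`, `allocate`, `used`, `PER_THREAD`, and the fixed-capacity `byte[]` backing store are hypothetical names and simplifications:

```java
// Hypothetical sketch of bump-pointer allocation; not a JVM API.
final class BumpArena {
    private final byte[] block; // memory owned by exactly one thread
    private int top = 0;        // offset of the next free byte

    BumpArena(int capacity) {
        this.block = new byte[capacity];
    }

    // Returns the offset of the newly reserved chunk, or -1 if it does not fit.
    // No lock and no atomic: only the owning thread ever touches `top`.
    int allocate(int size) {
        if (size < 0 || size > block.length - top) {
            return -1;
        }
        int offset = top;
        top += size; // the unsynchronized pointer increment
        return offset;
    }

    int used() {
        return top;
    }

    // One arena per thread, created lazily on first use (capacity is arbitrary).
    static final ThreadLocal<BumpArena> PER_THREAD =
            ThreadLocal.withInitial(() -> new BumpArena(1024));

    public static void main(String[] args) {
        BumpArena arena = PER_THREAD.get();
        System.out.println(arena.allocate(16));
        System.out.println(arena.allocate(16));
    }
}
```

The sketch covers only the allocation side. Detecting that a block has escaped to another thread and copying it to a shared heap, as the paragraph above proposes, is left out entirely.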

This means old dumb Java will completely smash your super multicore aware D2. At least if someone wants to allocate memory... oh by the way, no way to prevent GC cycles on frequent memory allocations, even if the programmer knows that the memory isn't needed anymore: it seems manual memory management is going to be deemed "evil". Or did I hear wrong that "delete" will be removed from D2?

By the way... this reminds me of Microsoft's Singularity kernel: they achieve full memory isolation between processes running in the same address space, without extending the type system with cruft like immutable. Processes can communicate like Erlang threads using the actor model.
December 25, 2009
grauzone wrote:
> Also, it's bad how inter-thread communication will trigger GC runs, 

No, it won't. Allocation may trigger a GC run.

> which will stop all threads in the process for a while. Because you have no choice but to allocate your immutable messages from the shared heap.

It all depends. Value messages are not allocated. Immutable data structures can be pre-allocated.


>> This does not make the gc inherently unusable. Java, for example, uses only one shared gc. It must, because Java has no concept of thread local.
> 
> What does Java matter here? Java was designed twenty years ago, when multicore wasn't an issue yet. D2 is designed *now* with good multicore support in mind.

It matters because Java is used a lot in multithreaded applications, and it is gc based. The gc is not a disastrous problem with it.


> Also I'm not sure if you're right here. Java has a generational copying GC, and although I don't know Sun Java's implementation at all, I'm quite sure the younger generation uses an entirely thread local heap. In the normal case, it shouldn't need to get a single lock to allocate memory. Just an unsynchronized pointer incrementation to get the next memory block (as fast as stack allocation). If a memory block "escapes" to the older generation (the GC needs to detect this case anyway), the memory can be copied to a shared heap.

Getting a memory block can be done with thread local pools, but the pools are from *shared* memory and when a collection cycle is done, it is done across all threads and shared memory.

> This means old dumb Java will completely smash your super multicore aware D2.

I think you're confusing allocating from a thread local cache with the resulting memory being thread local. The latter doesn't follow from the former.

> At least if someone wants to allocate memory... oh by the way, no way to prevent GC cycles on frequent memory allocations, even if the programmer knows that the memory isn't needed anymore: it seems manual memory management is going to be deemed "evil". Or did I hear wrong that "delete" will be removed from D2?
> 
> By the way... this reminds me of Microsoft's Singularity kernel: they achieve full memory isolation between processes running in the same address space, without extending the type system with cruft like immutable. Processes can communicate like Erlang threads using the actor model.

Erlang is entirely based on immutability of data. The only "cruft" they got rid of was mutability!
December 25, 2009
Walter Bright wrote:
> grauzone wrote:
>> Also, it's bad how inter-thread communication will trigger GC runs, 
> 
> No, it won't. Allocation may trigger a GC run.
> 
>> which will stop all threads in the process for a while. Because you have no choice but to allocate your immutable messages from the shared heap.
> 
> It all depends. Value messages are not allocated. Immutable data structures can be pre-allocated.

As soon as you have slightly more complex data like a simple string, the trouble starts.

> 
>>> This does not make the gc inherently unusable. Java, for example, uses only one shared gc. It must, because Java has no concept of thread local.
>>
>> What does Java matter here? Java was designed twenty years ago, when multicore wasn't an issue yet. D2 is designed *now* with good multicore support in mind.
> 
> It matters because Java is used a lot in multithreaded applications, and it is gc based. The gc is not a disastrous problem with it.

For one, Java has an infinitely better GC implementation than D. Yeah, this isn't a problem with the concept or the language specification, but it matters in reality.

There's no way a shared GC is ever going to be scalable with multicores. If I'm wrong and it can be made scalable, I'd like to see it. Not just in theory, but in D.

> 
>> Also I'm not sure if you're right here. Java has a generational copying GC, and although I don't know Sun Java's implementation at all, I'm quite sure the younger generation uses an entirely thread local heap. In the normal case, it shouldn't need to get a single lock to allocate memory. Just an unsynchronized pointer incrementation to get the next memory block (as fast as stack allocation). If a memory block "escapes" to the older generation (the GC needs to detect this case anyway), the memory can be copied to a shared heap.
> 
> Getting a memory block can be done with thread local pools, but the pools are from *shared* memory and when a collection cycle is done, it is done across all threads and shared memory.
> 
>> This means old dumb Java will completely smash your super multicore aware D2.
> 
> I think you're confusing allocating from a thread local cache with the resulting memory being thread local. The latter doesn't follow from the former.

I didn't say thread local allocation couldn't improve the situation, but the problem is still there: a GC costs too much. You'll be able to escape the situation for a while (e.g. by adding said thread local pools), but I think eventually you'll have to try something different.

I think having thread local heaps could be a viable solution, especially because the D2 type system is *designed* for it. All data is thread local by default, and trying to access it from other threads is forbidden and will break stuff. We have shared/immutable to allow inter-thread accesses. There will *never* be pointers to non-shared, mutable data between different threads. This just cries out for allocating "normal" data on isolated, separate per-thread heaps.

I was just wondering what you'd do about immutable data. But OK, you're not going this way. What a waste. We can end the discussion here, sorry for the trouble.

>> At least if someone wants to allocate memory... oh by the way, no way to prevent GC cycles on frequent memory allocations, even if the programmer knows that the memory isn't needed anymore: it seems manual memory management is going to be deemed "evil". Or did I hear wrong that "delete" will be removed from D2?
>>
>> By the way... this reminds me of Microsoft's Singularity kernel: they achieve full memory isolation between processes running in the same address space, without extending the type system with cruft like immutable. Processes can communicate like Erlang threads using the actor model.
> 
> Erlang is entirely based on immutability of data. The only "cruft" they got rid of was mutability!

One could read your argument as "having both is cruft". Maybe D2 would be better if we removed all mutable types?
December 25, 2009
grauzone wrote:
>> It matters because Java is used a lot in multithreaded applications, and it is gc based. The gc is not a disastrous problem with it.
> 
> For one, Java has an infinitely better GC implementation than D. Yeah, this isn't a problem with the concept or the language specification, but it matters in reality.

I thought we were talking about a fundamental issue of concept and language specification.


> There's no way a shared GC is ever going to be scalable with multicores. If I'm wrong and it can be made scalable, I'd like to see it. Not just in theory, but in D.

I believe there's plenty that can be achieved with it first. D has a fairly simple GC implementation in it right now, probably early 90's technology. It could be pushed an awful lot further.

If you want to help out with it, you're welcome to.


>> Erlang is entirely based on immutability of data. The only "cruft" they got rid of was mutability!
> One could read your argument as "having both is cruft". Maybe D2 would be better if we removed all mutable types?

Pure functional languages (which is what languages without mutable data are) are forced to buy into a whole 'nother set of problems (see "monads"). My impression is that Erlang does one thing very, very well (multithreading) and everything else not so well.