[dmd-concurrency] Smoke test
January 07, 2010
On 2010-01-07 at 15:51, Sean Kelly wrote:

> The smoke test for me is that I think about whether it would work with per-thread GCs.
> [...]
> the heap data could easily be handed off to the shared GC instead, but let's pretend this is impossible

I'd really like this rule to be the official "smoke test" for the concurrency model. It's simple to conceptualize, not too hard to implement, and adaptable to many situations. It works in a model where each processor has a local memory pool not accessible to others, or where each thread is in reality a different process with some shared memory between the processes.

I'm not against promoting memory blocks to the shared GC, but I'd like this to be just a feature of the GC in the runtime, not a requirement for the concurrency model to make sense. In a way, it shouldn't be done implicitly.


-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/



January 07, 2010
On Jan 7, 2010, at 1:18 PM, Michel Fortin wrote:
> 
> I'm not against promoting memory blocks to the shared GC, but I'd like this to be just a feature of the GC in the runtime, not a requirement for the concurrency model to make sense. In a way, it shouldn't be done implicitly.

That's a good point.  And it's certainly what I was going for until this shared reference issue threw a wrench in the works and I panicked :-).  I agree that this should be a part of the rule because I'd consider the model broken otherwise.
January 07, 2010

Sean Kelly wrote:
> On Jan 7, 2010, at 1:18 PM, Michel Fortin wrote:
> 
>> I'm not against promoting memory blocks to the shared GC, but I'd like this to be just a feature of the GC in the runtime, not a requirement for the concurrency model to make sense. In a way, it shouldn't be done implicitly.
>> 
>
> That's a good point.  And it's certainly what I was going for until this shared reference issue threw a wrench in the works and I panicked :-).  I agree that this should be a part of the rule because I'd consider the model broken otherwise.

Having a per-thread gc is an optimization, not a fundamental feature of the concurrency model. For one thing, it precludes casting data to immutable. For another, it may result in excessive memory consumption as one thread may have a lot of unused data in its pool that is not available for allocation by another thread.
January 07, 2010
On Thu, 07 Jan 2010 20:28:23 -0500, Walter Bright <walter at digitalmars.com> wrote:
> Having a per-thread gc is an optimization, not a fundamental feature of
> the concurrency model. For one thing, it precludes casting data to
> immutable. For another, it may result in excessive memory consumption as
> one thread may have a lot of unused data in its pool that is not
> available for allocation by another thread.
> _______________________________________________
> dmd-concurrency mailing list
> dmd-concurrency at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/dmd-concurrency

I would disagree; it's completely possible to allow safe casting to immutable/shared with thread-local GCs. It just requires language support. For example, adding a GC function which is called whenever a shared cast occurs. In the current GC, the function does nothing and everything proceeds as normal. With thread local GC, however, this function would publish the casted object to a list. The local GC could then pin all objects on the list and the shared GC could mark/sweep the list entries instead of the objects themselves.

Also, today a thread-local mark-sweep GC equals a modern concurrent shared GC (according to Apple). So going forward, I think thread local GCs will be a big thing.
January 07, 2010
On 2010-01-07 at 20:28, Walter Bright wrote:

> Having a per-thread gc is an optimization, not a fundamental feature of the concurrency model. For one thing, it precludes casting data to immutable. For another, it may result in excessive memory consumption as one thread may have a lot of unused data in its pool that is not available for allocation by another thread.

Both the "per-thread GC + shared GC" model and "the shared GC for everyone" model can be seen as optimizations. The first optimizes for speed, the second optimizes for memory usage.

Depending on what you do, it might even make sense to have some threads using the shared GC for everything and others having a thread-local GC to improve speed.

If you want the language to be limited to models where the memory can always be shared between all threads, then that's fine. It's your prerogative. I'm not so sure it's wise to limit shared semantics to this scenario just to avoid having the shared-immutable combo, but if you're sure that's what you want then I'll stick to it.

I'm glad you're part of this discussion Walter, because defining the concurrency model starts by delimiting precisely its goals and constraints, and it's important to have those stabilize early if the rest is to make any sense.

Now, perhaps we need to revisit this discussion Andrei, Sean, and I had earlier (in the "What is shared?" thread):

Sean Kelly wrote:
> Michel Fortin wrote:
>> Andrei Alexandrescu wrote:
>> 
>>> Conversely, an object _not_ marked as "shared" is definitely visible only within one thread.
>> 
>> Except for those marked immutable, which also implies shared.
> 
> Not necessarily. Immutable means that synchronization isn't necessary for data accessed by multiple threads, but it says nothing about visibility.


Looks like I was right.

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/



January 07, 2010

Michel Fortin wrote:
> On 2010-01-07 at 20:28, Walter Bright wrote:
>
> 
>> Having a per-thread gc is an optimization, not a fundamental feature of the concurrency model. For one thing, it precludes casting data to immutable. For another, it may result in excessive memory consumption as one thread may have a lot of unused data in its pool that is not available for allocation by another thread.
>> 
>
> Both the "per-thread GC + shared GC" model and "the shared GC for everyone" model can be seen as optimizations. The first optimizes for speed, the second optimizes for memory usage.
>
> Depending on what you do, it might even make sense to have some threads using the shared GC for everything and others having a thread-local GC to improve speed.
>
> If you want the language to be limited to models where the memory can always be shared between all threads, then that's fine. It's your prerogative. I'm not so sure it's wise to limit shared semantics to this scenario just to avoid having the shared-immutable combo, but if you're sure that's what you want then I'll stick to it.
>
> 

There's another aspect here. Consider all the problems we have getting across the idea of an immutable type. What hope is there for shared? I see mass confusion everywhere. Frankly, I see little hope of any but a handful of programmers ever being able to grok shared and use it correctly for concurrent programs. The notion that one can just slap 'shared' on a data type and have it work correctly across threads without further thought is a pipe dream.

So what to do?

I want to pin the mainstream concurrency on message passing. The message passing user never sees shared, never has to deal with locks, never has to deal with memory barriers. It just works. Message passing should be a robust, scalable solution for most users. I believe the Erlang experience validates this. Go and Scala also rely entirely on message passing (but they don't have immutable data, so their models are unsafe and I predict many rude surprises).

So why bother with shared at all?

Because message passing does not cover all the bases, and D is supposed to be a systems programming language. So we need a paradigm for synchronization and shared data structures. What shared provides is:

1. A way to identify shared data. This is incredibly important. A lot of sharing bugs come about because of inadvertent unrecognized sharing of data. This should be pretty much impossible in D. Furthermore, if you do have a sharing bug in your code, you look at the 1% of the data tagged as shared, rather than every freakin' line of code and every piece of data. Half the battle in debugging code is figuring out where to look for the problem. Shared pares that problem down to a reasonable size.

2. Shared comes with a collection of static typing rules and guarantees, such as sequential consistency, that will head off a number of concurrency bugs.

I view shared as sort of like the latest electric arc welders which automatically adjust the current and wire feed for you. They dramatically shorten (but don't eliminate) the learning curve for people trying to master the art of welding. D is the only language to even attempt this. C++ leaves you completely on your own, Java offers no help, Erlang, Scala and Go throw in the towel and won't allow anything but message passing.

As for a shared gc vs thread local gc, I just see an awful lot of strange irreproducible bugs when someone passes data from one to the other. I doubt it's worth it, unless it can be done with compiler guarantees, which seem doubtful.
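
For readers following along, here is a minimal sketch of the message-passing style described above, where user code never mentions shared, locks, or memory barriers. It assumes the std.concurrency API from present-day Phobos (spawn, send, receive, receiveOnly, ownerTid), which may differ from what was being designed at the time of this thread. The names doubler and roundTrip are hypothetical, chosen only for illustration.

```d
import std.concurrency : ownerTid, receive, receiveOnly, send, spawn;

// Runs in its own thread, with its own thread-local state.
void doubler()
{
    // Block until the owner sends an int, then reply with twice its value.
    receive((int n) { ownerTid.send(n * 2); });
}

// Hypothetical helper: one request/reply round trip with a fresh worker thread.
int roundTrip(int value)
{
    auto worker = spawn(&doubler);
    worker.send(value);          // the int is copied; no data is shared
    return receiveOnly!int();    // wait for the worker's reply
}

void main()
{
    assert(roundTrip(21) == 42);
}
```

Only values are exchanged between the two threads, so nothing in this sketch needs to be typed as shared.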
January 07, 2010

Robert Jacques wrote:
> On Thu, 07 Jan 2010 20:28:23 -0500, Walter Bright <walter at digitalmars.com> wrote:
>> Having a per-thread gc is an optimization, not a fundamental feature
>> of the concurrency model. For one thing, it precludes casting data to
>> immutable. For another, it may result in excessive memory consumption
>> as one thread may have a lot of unused data in its pool that is not
>> available for allocation by another thread.
>
> I would disagree; it's completely possible to allow safe casting to immutable/shared with thread-local GCs. It just requires language support. For example, adding a GC function which is called whenever a shared cast occurs. In the current GC, the function does nothing and everything proceeds as normal. With thread local GC, however, this function would publish the casted object to a list. The local GC could then pin all objects on the list and the shared GC could mark/sweep the list entries instead of the objects themselves.

Sounds like the thread local pool will get peppered with shared islands inside it.

>
> Also, today a thread-local mark-sweep GC equals a modern concurrent
> shared GC (according to Apple). So going forward, I think thread local
> GCs will be a big thing.
January 08, 2010
On Fri, Jan 8, 2010 at 1:58 AM, Walter Bright <walter at digitalmars.com> wrote:

>
>
> Michel Fortin wrote:
>
>> On 2010-01-07 at 20:28, Walter Bright wrote:
>>
>>
>>
>>> Having a per-thread gc is an optimization, not a fundamental feature of
>>> the concurrency model. For one thing, it precludes casting data to
>>> immutable. For another, it may result in excessive memory consumption as one
>>> thread may have a lot of unused data in its pool that is not available for
>>> allocation by another thread.
>>>
>>>
>>
>> Both the "per-thread GC + shared GC" model and "the shared GC for
>> everyone" model can be seen as optimizations. The first optimizes for speed,
>> the second optimizes for memory usage.
>>
>> Depending on what you do, it might even make sense to have some threads using the shared GC for everything and others having a thread-local GC to improve speed.
>>
>> If you want the language to be limited to models where the memory can always be shared between all threads, then that's fine. It's your prerogative. I'm not so sure it's wise to limit shared semantics to this scenario just to avoid having the shared-immutable combo, but if you're sure that's what you want then I'll stick to it.
>>
>>
>>
>
> There's another aspect here. Consider all the problems we have getting across the idea of an immutable type. What hope is there for shared? I see mass confusion everywhere. Frankly, I see little hope of any but a handful of programmers ever being able to grok shared and use it correctly for concurrent programs. The notion that one can just slap 'shared' on a data type and have it work correctly across threads without further thought is a pipe dream.
>
> So what to do?
>
> I want to pin the mainstream concurrency on message passing. The message passing user never sees shared, never has to deal with locks, never has to deal with memory barriers. It just works. Message passing should be a robust, scalable solution for most users. I believe the Erlang experience validates this. Go and Scala also rely entirely on message passing (but they don't have immutable data, so their models are unsafe and I predict many rude surprises).
>
> So why bother with shared at all?
>
> Because message passing does not cover all the bases, and D is supposed to be a systems programming language. So we need a paradigm for synchronization and shared data structures. What shared provides is:
>
> 1. A way to identify shared data. This is incredibly important. A lot of sharing bugs come about because of inadvertent unrecognized sharing of data. This should be pretty much impossible in D. Furthermore, if you do have a sharing bug in your code, you look at the 1% of the data tagged as shared, rather than every freakin' line of code and every piece of data. Half the battle in debugging code is figuring out where to look for the problem. Shared pares that problem down to a reasonable size.
>
> 2. Shared comes with a collection of static typing rules and guarantees, such as sequential consistency, that will head off a number of concurrency bugs.
>

For those of us who showed up late to class, is there a page or description that enumerates the typing rules and guarantees, or are those all still up in the air?

Kevin

> I view shared as sort of like the latest electric arc welders which automatically adjust the current and wire feed for you. They dramatically shorten (but don't eliminate) the learning curve for people trying to master the art of welding. D is the only language to even attempt this. C++ leaves you completely on your own, Java offers no help, Erlang, Scala and Go throw in the towel and won't allow anything but message passing.
>
> As for a shared gc vs thread local gc, I just see an awful lot of strange
> irreproducible bugs when someone passes data from one to the other. I doubt
> it's worth it, unless it can be done with compiler guarantees, which seem
> doubtful.
January 08, 2010
On Jan 7, 2010, at 5:28 PM, Walter Bright wrote:
> 
> Having a per-thread gc is an optimization, not a fundamental feature of the concurrency model. For one thing, it precludes casting data to immutable. For another, it may result in excessive memory consumption as one thread may have a lot of unused data in its pool that is not available for allocation by another thread.

I agree completely that having a per-thread GC is simply an optimization.  I just brought it up because it's the simplest way I've come up with to think about what "shared" means.  Put another way, I think of shared as controlling the access control pool the data lives in, and the lifetime of that data.  If I see "T v1" then I should be able to infer that v1 is only visible to the current thread and will go away when the thread terminates.  Similarly, if I see "shared T v2" I should be able to infer that v2 is globally visible and may remain until process termination.

I feel like I'm not explaining myself very well, but that's the best I can do at the moment.  As a related issue, I have a feeling that the following is a bad idea, but I haven't come up with a good explanation for why yet, maybe simply the principle of least surprise?:

class C
{
    shared int x;
}

auto c = new C;
sendRefToAnotherThread( c ); // fails, c is local
sendToAnotherThread( &c.x ); // succeeds, c.x is shared
January 08, 2010

Sean Kelly wrote:
>
> I feel like I'm not explaining myself very well, but that's the best I can do at the moment.  As a related issue, I have a feeling that the following is a bad idea, but I haven't come up with a good explanation for why yet, maybe simply the principle of least surprise?:
>
> class C
> {
>     shared int x;
> }
>
> auto c = new C;
> sendRefToAnotherThread( c ); // fails, c is local
> sendToAnotherThread( &c.x ); // succeeds, c.x is shared
>
> 

The transitivity of shared doesn't work backwards, only forwards. In other words, you can have a local pointer to shared, but no shared pointers to locals.

In yet other words, sharing is transitive, locality is not.
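
A minimal sketch of that rule, assuming present-day D2 semantics (module-level variables are thread-local unless marked shared). The names local, global, and pointAtShared are hypothetical; class C is the one from the example quoted above.

```d
class C
{
    shared int x;   // a shared field inside an otherwise thread-local object
}

int local;          // thread-local by default
shared int global;  // explicitly shared across threads

// Sharing is transitive: a shared pointer can only point at shared data.
static assert(is(shared(int*) == shared(shared(int)*)));

// A thread-local pointer to shared data is allowed.
shared(int)* pointAtShared() { return &global; }

// A shared pointer to thread-local data is rejected by the type system.
static assert(!__traits(compiles, { shared(int*) p = &local; }));

void main()
{
    auto c = new C;
    // The reference c is local, but the address of its field is pointer-to-shared.
    static assert(is(typeof(c) == C));
    static assert(is(typeof(&c.x) == shared(int)*));
    assert(pointAtShared() is &global);
}
```

The static asserts are checked at compile time, so the sketch documents the typing rules without reading or writing any shared data at run time.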