May 02, 2009
On Sat, 02 May 2009 02:34:57 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> Robert Jacques wrote:
>> On Fri, 01 May 2009 19:25:35 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>>
>>> Robert Jacques wrote:
>>>> Do you have a link to your article?
>>>
>>> http://tinyurl.com/dac82a
>>>
>>>
>>> Andrei
>>  Should've seen that one coming. :)
>>  Anyways, I'm not sure how you can call the technique lock-free, since you're doing (possibly several) allocations inside the inner CAS loop.
>
> No, repeated allocations are trivial to eliminate. I didn't even bother to explain that in the article. The loop must only refill the allocated object from the object that needs to be replaced.

Putting the test in the pseudocode for this would've helped my understanding.

if(copy is null)
    copy = s.deepDup;
else
    s.deepDupTo(copy);

>> (I guess a good, parallel allocator might negate this) It's probably not an issue for the article's use case, where reads vastly dominated updates, but for things like stacks or queues, it wouldn't work. And poor performance leads people to create their own fast, possibly buggy versions, thus defeating the point of a standard library.
>
> Incidentally Maged Michael himself wrote the first lock-free malloc().

Cool.

>
>
> Andrei

But Andrei, I really don't care about the existence of a lock-free algorithm with value semantics. I only care about a practical lock-free algorithm with value semantics. This approach, while elegant in its simplicity and good for some situations, performs very poorly for high-update, high-contention objects like stacks and queues. Maybe locked algorithms are the best value semantics can do in some situations, and this might result in separate fast reference containers. That is a drawback to value semantics and should be placed into the big pros/cons pile. Of course, I could be missing something (again).

P.S. A general thank you for all your posts and explanations.
May 02, 2009
Robert Jacques wrote:
>> No, repeated allocations are trivial to eliminate. I didn't even bother to explain that in the article. The loop must only refill the allocated object from the object that needs to be replaced.
> 
> Putting the test in the pseudocode for this would've helped my understanding.
> 
> if(copy is null)
>     copy = s.deepDup;
> else
>     s.deepDupTo(copy);

copy = cast(T*) GC.malloc(T.sizeof);
do {
    auto old = s;          // snapshot the object being replaced
    overwrite(old, copy);  // refill the copy instead of reallocating
    copy.modify;
} while (!cas(&s, old, copy));
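Fleshed out, the refill-and-retry idea might look like this in C++ (the names `Data`, `shared_obj`, and `update` are illustrative, not from the article; the D pseudocode leans on the GC to reclaim the displaced object, which C++ would need hazard pointers or similar to do safely):

```cpp
#include <atomic>

struct Data { int value; };

// The shared location readers see; must point at a valid object
// before update() is called.
std::atomic<Data*> shared_obj{nullptr};

// Lock-free update: allocate the private copy once, then refill it
// from the current snapshot on every CAS retry instead of allocating
// again inside the loop.
void update(int delta) {
    Data* copy = new Data;          // single allocation, reused across retries
    Data* old = shared_obj.load();
    do {
        *copy = *old;               // refill the copy from the object being replaced
        copy->value += delta;       // apply the modification
    } while (!shared_obj.compare_exchange_weak(old, copy));
    // On failure, compare_exchange_weak reloads 'old', so the next
    // iteration refills from the fresh state. The displaced object is
    // deliberately leaked here; real code needs safe reclamation.
}
```

The key point is that the `new` sits outside the loop: contention causes re-copying, never re-allocation.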

> But Andrei, I really don't care about the existence of a lock-free algorithm with value semantics. I only care about a practical lock-free algorithm with value semantics. This approach, while elegant in its simplicity and good for some situations, performs very poorly for high-update, high-contention objects like stacks and queues. Maybe locked algorithms are the best value semantics can do in some situations, and this might result in separate fast reference containers. That is a drawback to value semantics and should be placed into the big pros/cons pile.

I agree.


Andrei
May 02, 2009
Robert Jacques wrote:
> On Fri, 01 May 2009 15:37:56 -0400, Rainer Deyke <rainerd@eldwood.com> wrote:
>> All containers need to support copying.
> 
> No they don't.

There is no conceptual problem with a non-copyable struct. However, in order to be considered a container, the struct must at least support read-only iteration over its contents. If iteration is possible, then so is copying.

You can have non-copyable value types, but they can't be containers.
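Rainer's step from iteration to copying can be sketched generically; `copy_via_iteration` is a hypothetical helper (not from the thread), and the sketch assumes the container exposes a `value_type` and range-based iteration:

```cpp
#include <vector>

// Any container exposing read-only iteration can be copied element by
// element, even if it forbids wholesale copying of the container itself.
template <typename Container>
std::vector<typename Container::value_type>
copy_via_iteration(const Container& c) {
    std::vector<typename Container::value_type> result;
    for (const auto& elem : c)   // read-only traversal is all we need
        result.push_back(elem);
    return result;
}
```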


-- 
Rainer Deyke - rainerd@eldwood.com
May 02, 2009
On Sat, 02 May 2009 03:35:51 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> Robert Jacques wrote:
>>> No, repeated allocations are trivial to eliminate. I didn't even bother to explain that in the article. The loop must only refill the allocated object from the object that needs to be replaced.
>>  Putting the test in the pseudocode for this would've helped my understanding.
>>  if(copy is null)
>>     copy = s.deepDup;
>> else
>>     s.deepDupTo(copy);
>
> copy = cast(T*) GC.malloc(T.sizeof);
> do {
>      auto old = s;          // snapshot the object being replaced
>      overwrite(old, copy);  // refill the copy instead of reallocating
>      copy.modify;
> } while (!cas(&s, old, copy));

I'm confused. The article talks about copying the entire data structure, not just a node. And trees and similar structures tend to have variable sizes.

>> But Andrei, I really don't care about the existence of a lock-free algorithm with value semantics. I only care about a practical lock-free algorithm with value semantics. This approach, while elegant in its simplicity and good for some situations, performs very poorly for high-update, high-contention objects like stacks and queues. Maybe locked algorithms are the best value semantics can do in some situations, and this might result in separate fast reference containers. That is a drawback to value semantics and should be placed into the big pros/cons pile.
>
> I agree.
>
>
> Andrei

May 02, 2009
On Sat, 02 May 2009 03:39:27 -0400, Rainer Deyke <rainerd@eldwood.com> wrote:

> Robert Jacques wrote:
>> On Fri, 01 May 2009 15:37:56 -0400, Rainer Deyke <rainerd@eldwood.com>
>> wrote:
>>> All containers need to support copying.
>>
>> No they don't.
>
> There is no conceptual problem with a non-copyable struct.  However, in
> order to be considered a container, the struct must at least support
> read-only iteration over its contents.  If iteration is possible, then
> so is copying.
>
> You can have non-copyable value types, but they can't be containers.

No, they don't. Iteration can often be destructive. If I iterate over a stack or a queue, I'm left with an empty stack/queue. For value semantics to work, non-destructive copying is required.
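A concrete instance of this in C++: `std::stack` deliberately exposes no iterators, so the only way to visit every element is to consume the container (`drain_sum` is an illustrative helper, not from the thread):

```cpp
#include <stack>

// Destructive iteration: visiting every element of a std::stack means
// popping it off, leaving the stack empty when the traversal is done.
int drain_sum(std::stack<int>& s) {
    int sum = 0;
    while (!s.empty()) {
        sum += s.top();
        s.pop();         // visiting an element consumes it
    }
    return sum;
}
```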

May 02, 2009
On 2009-05-01 13:44:58 -0400, "Robert Jacques" <sandford@jhu.edu> said:

> I think you might have misunderstood the concept of a unique/mobile type. If you take a slice or a reference of a unique, you null the source, which makes using unique T logically invalid as a container.

Take note: I'm not expecting it to copy automatically; "unique T[]" isn't a value-type container. I'm solving the problem of slices detaching from one another when you grow them, which I don't think is a problem addressed by value-type containers either. If you grow the slice or container, previously taken slices of it may or may not be left dangling. Allowing growth only on "unique T[]" makes sure there are no slices to be invalidated when you grow.

Also, type systems featuring unique generally have a "lent" flag allowing you to pass the unique reference to functions that are required not to keep it anywhere after the function returns. So you could pass lent slices of your unique T[] to functions without breaking the uniqueness of the reference.


> Also, shared unique T can't be converted to shared T. This incompatibility is the whole point of having a unique type in the first place, and negates your concept of using unique T as containers.

Why couldn't it? You just lose (forever) the uniqueness of the reference when you copy it to a non-unique one.


> And generally, unique T implies shared storage, and cannot be used for thread-local arrays.

Generally, but not necessarily. Is there any problem that would keep unique from applying to thread-local arrays?


-- 
Michel Fortin
michel.fortin@michelf.com
http://michelf.com/

May 02, 2009
On Fri, May 1, 2009 at 6:25 PM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> Bill Baxter wrote:
>>
>> On Fri, May 1, 2009 at 4:23 PM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>>>
>>> I guess at that point the would-be D user would be entitled to make me a lavaliere out of my Matrix library and move on.
>>
>> Python, and therefore NumPy, are reference based.
>
> In a language that's reference based, things can't be helped. They can be helped in D though. I wouldn't want to later cringe that I missed the opportunity.
>
>> So far I haven't seen any scientists posting on the list about how having their arrays and matrices be references is driving them crazy. It may be surprising to some of them at first, but even non-hard-core coders seem to be able to handle it.
>
> To me this sounds more like an opportunity to do the right thing.

Could be.  I'm not arguing that refs are best, just giving some evidence from the NumPy community to counter your assertion that scientists would be confused by reference-based array and matrix types.


>> It helps that dummy assignments
>> like a = b are rare.   More often you have things like a = b + c, and
>> that creates a new matrix.  Or  a += b, which pretty clearly mutates
>> a.
>
> Oh rly. How about this then?
>
> Matrix a, b, c;
> ...
> c = a;
> a += b;
>
> Does the last operation mutate c as well?

I said "assignments like a = b are rare" and you put one of those in your example.  Yes, when you have an a=b anywhere you've got to pay attention and make sure you didn't mean a=b.dup.

And the other thing is that when you pass a matrix to a function, you need to be aware of whether the function mutates that matrix or not. That's really the biggest pain, and the source of the hardest-to-find bugs.

Matlab goes to the other extreme and makes everything value semantics. It does a lot of COW optimizations under the hood to make it tolerably efficient even when tossing around big matrices. But it's annoying if you want to write a function to make a small update to an existing matrix. You have no choice but to return a copy and hope that Matlab's optimizer is smart enough to eliminate the unnecessary copy.
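The COW scheme described above might be sketched as follows; this is a toy, single-threaded illustration of the idea, not Matlab's actual mechanism, and the `Matrix` class here is hypothetical:

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Toy copy-on-write "matrix": copies share storage until one of them
// writes, at which point the writer detaches onto its own copy.
class Matrix {
    std::shared_ptr<std::vector<double>> data;
public:
    explicit Matrix(std::size_t n)
        : data(std::make_shared<std::vector<double>>(n)) {}
    double get(std::size_t i) const { return (*data)[i]; }
    void set(std::size_t i, double v) {
        if (data.use_count() > 1)   // storage is shared: copy before writing
            data = std::make_shared<std::vector<double>>(*data);
        (*data)[i] = v;
    }
};
```

Copies are cheap (a shared_ptr bump) and mutation stays local, at the cost of a use_count check on every write; the check races under real sharing, which is part of why COW is hard to do well.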

I agree with the poster below that what I usually want is value semantics with small values and matrices, reference semantics with big matrices.

--bb
May 02, 2009
Robert Jacques wrote:
> On Sat, 02 May 2009 03:35:51 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
> 
>> Robert Jacques wrote:
>>>> No, repeated allocations are trivial to eliminate. I didn't even bother to explain that in the article. The loop must only refill the allocated object from the object that needs to be replaced.
>>>  Putting the test in the pseudocode for this would've helped my understanding.
>>>  if(copy is null)
>>>     copy = s.deepDup;
>>> else
>>>     s.deepDupTo(copy);
>>
>> copy = cast(T*) GC.malloc(T.sizeof);
>> do {
>>      auto old = s;          // snapshot the object being replaced
>>      overwrite(old, copy);  // refill the copy instead of reallocating
>>      copy.modify;
>> } while (!cas(&s, old, copy));
> 
> I'm confused. The article talks about copying the entire data structure, not just a node/etc. And trees, etc tend to have variable sizes, etc.

You can reuse memory even if it comprehends more complex patterns than one single allocation.

Andrei
May 02, 2009
On Sat, 02 May 2009 18:08:30 +0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:

> Robert Jacques wrote:
>> On Sat, 02 May 2009 03:35:51 -0400, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>>
>>> Robert Jacques wrote:
>>>>> No, repeated allocations are trivial to eliminate. I didn't even bother to explain that in the article. The loop must only refill the allocated object from the object that needs to be replaced.
>>>>  Putting the test in the pseudocode for this would've helped my
>>>> understanding.
>>>>  if(copy is null)
>>>>     copy = s.deepDup;
>>>> else
>>>>     s.deepDupTo(copy);
>>>
>>> copy = cast(T*) GC.malloc(T.sizeof);
>>> do {
>>>      auto old = s;          // snapshot the object being replaced
>>>      overwrite(old, copy);  // refill the copy instead of reallocating
>>>      copy.modify;
>>> } while (!cas(&s, old, copy));
>>  I'm confused. The article talks about copying the entire data
>> structure, not just a node. And trees and similar structures tend to
>> have variable sizes.
>
> You can reuse memory even if it comprehends more complex patterns than one single allocation.
>
> Andrei

I *highly* doubt it is worth the trouble. Most likely, this container won't be lock-free and scalable anymore. Performance will also degrade dramatically.

Besides, the more I think about thread-local/shared separation that is going to come to D2, the more I see that there is absolutely no need to make Phobos containers thread-safe. Most containers will be thread-local and won't need any synchronization at all.
May 02, 2009
Bill Baxter wrote:
> On Fri, May 1, 2009 at 6:25 PM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>> Bill Baxter wrote:
>>> On Fri, May 1, 2009 at 4:23 PM, Andrei Alexandrescu <SeeWebsiteForEmail@erdani.org> wrote:
>>>> I guess at that point the would-be D user would be entitled to
>>>> make me a lavaliere out of my Matrix library and move on.
>>> Python, and therefore NumPy, are reference based.
>> In a language that's reference based, things can't be helped. They
>> can be helped in D though. I wouldn't want to later cringe that I
>> missed the opportunity.
>> 
>>> So far I haven't seen any scientists posting on the list about
>>> how having their arrays and matrices be references is driving
>>> them crazy. It may be surprising to some of them at first, but
>>> even non-hard-core coders seem to be able to handle it.
>> To me this sounds more like an opportunity to do the right thing.
> 
> Could be.  I'm not arguing that refs are best, just giving some evidence from the NumPy community to counter your assertion that scientists would be confused by reference-based array and matrix types.
> 
> 
>>> It helps that dummy assignments like a = b are rare.   More often
>>> you have things like a = b + c, and that creates a new matrix.
>>> Or  a += b, which pretty clearly mutates a.
>> Oh rly. How about this then?
>> 
>> Matrix a, b, c; ... c = a; a += b;
>> 
>> Does the last operation mutate c as well?
> 
> I said "assignments like a = b are rare" and you put one of those in your example.

You said it and I don't buy it for one femtosecond. Code does need to copy values around. It's a fundamental operation!

> Yes, when you have an a=b anywhere you've got to pay attention and
> make sure you didn't mean a=b.dup.

That is terrible. Anywhere == a portion of the code far, far away.

> And the other thing is that when you pass a matrix to a function, you
>  need to be aware of whether the function mutates that matrix or not.
>  That's really the biggest pain, and the source of the hardest-to-find bugs.

I bet. Things get even more unpleasant when you have objects that have reference members. Then the default copy constructor does The Wrong Thing(tm) - many C++ pundits and authors have eaten many good lunches with the money they made discussing the problem.

> Matlab goes the other extreme and makes everything value semantics. It does a lot of COW optimizations under the hood to make it
> tolerably efficient even when tossing around big matrices.  But it's
> annoying if you want to write a function to make a small update to an
> existing matrix.  You have no choice but to return a copy and hope
> that Matlab's optimizer is smart enough to eliminate the unnecessary
> copy.

Yah. In D, there's ref.


Andrei