November 03, 2010
One other benefit atomic reference counting would have is that it would allow sharing of reference counted data structures across threads without worrying about invisible race conditions caused by the reference counting. Furthermore, you could still pass by reference and use similar tricks to avoid the overhead of the atomic reference counting just like you would for an object with arbitrarily expensive copying.  You just wouldn't have to bend over backwards to avoid it.

On Wed, Nov 3, 2010 at 11:00 AM, Max Samukha <maxsamukha at gmail.com> wrote:

> These notes on copy-constructible Qt types may be useful for the discussion:
>
> 1. 1/5 (approx. 100 classes) of all classes in core, gui, network, webkit, svg and opengl packages define public copy-constructors.
>
> A half of those are in reference-counted COW types (approx. 50 classes). The remaining 50 classes are reference-counted types without COW, types with allocating copy-constructors and types with trivial non-allocating constructors.
>
> Most of the types with allocating copy-constructors I would probably implemented as classes in D. Polymorphic types like QListWidgetItem that provide the copy-constructor only for clone() reimplementation should definitely be classes in D.
>
> The conclusion I tend to draw is that expensive copy-constructors can be avoided in most applications.
>
> 2. Reference counters are atomic using interlocked integer operations.
> There have been a couple of complaints about performance [
> http://www.gotw.ca/publications/optimizations.htm] but those complaints
> seem ungrounded [
> http://labs.qt.nokia.com/2006/10/16/atomic-reference-counting-is-it-worth-it-2
> ].
>
> 3. If a type exposes data by reference, proxy objects are used to avoid unnecessary copying. For example, QString::operator[](int) returns an instance of Q?harRef, not QChar&. The shared data is copied only if the QCharRef instance is modified.
>
>
> On Sun, Oct 31, 2010 at 5:57 AM, Andrei Alexandrescu <andrei at erdani.com>wrote:
>
>> I am highly interested in the opinion of Phobos contributors in the matter of copy construction (just posted the message below).
>>
>> Andrei
>>
>> -------- Original Message --------
>> Subject: Re: Ruling out arbitrary cost copy construction?
>> Date: Sat, 30 Oct 2010 22:56:24 -0500
>> From: Andrei Alexandrescu <SeeWebsiteForEmail at erdani.org>
>> Newsgroups: digitalmars.D
>>
>> On 10/30/2010 09:40 PM, Michel Fortin wrote:
>>
>>> On 2010-10-30 20:49:38 -0400, Andrei Alexandrescu <SeeWebsiteForEmail at erdani.org> said:
>>>
>>>  On 10/30/10 2:24 CDT, Don wrote:
>>>>
>>>>> At the moment, I think it's impossible.
>>>>> Has anyone succesfully implemented refcounting in D? As long as bug
>>>>> 3516
>>>>> (Destructor not called on temporaries) remains open, it doesn't seem to
>>>>> be possible.
>>>>> Is that the only blocker, or are there others?
>>>>>
>>>>
>>>> I managed to define and use RefCounted in Phobos. File also uses hand-made reference counting. I think RefCounted is a pretty good abstraction (unless it hits the bug you mentioned.)
>>>>
>>>
>>> I like the idea of RefCounted as a way to automatically make things reference counted.
>>>
>>
>> Unfortunately it's only a semi-automated mechanism.
>>
>>  But like File and many similar ref-counted structs, it has this race
>>> condition (bug 4624) when stored inside the GC heap. Currently, most of Phobos's ref-counted structs are race-free only when they reside on the stack or if your program has only one thread (because the GC doesn't spawn threads if I'm correct).
>>>
>>> It's a little sad that the language doesn't prevent races in destructors
>>> (bug 4621).
>>>
>>
>> I hope we're able to solve these implementation issues that can be seen as independent from the decision at hand.
>>
>> Walter and I discussed the matter again today and we're on the brink of deciding that cheap copy construction is to be assumed. This simplifies the language and the library a great deal, and makes it perfectly good for 95% of the cases. For a minority of types, code would need to go through extra hoops (e.g. COW, refcounting) to be compliant.
>>
>> I'm looking for more feedback from the larger D community. This is a very important decision that marks one of the largest departures from the C++ style. Taking the wrong turn here could alienate many programmers coming from C++.
>>
>> So, everybody - this is really the time to speak up or forever be silent.
>>
>>
>> Andrei
>> _______________________________________________
>> phobos mailing list
>> phobos at puremagic.com
>> http://lists.puremagic.com/mailman/listinfo/phobos
>>
>
>
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.puremagic.com/pipermail/phobos/attachments/20101103/644e85cf/attachment-0001.html>
November 03, 2010
Thanks for the hard data! This is very useful.

The article about atomic reference counting is interesting but mostly compares eager copy with atomic reference counting (I attach one of the result files for convenience). We're also interested in the penalty exacted by atomicity of reference counting when compared with unprotected reference counting. (The trick of returning a smart proxy is clever but C++ specific.) Also, if I understand things correctly, the article does not induce contention; it only measures uncontested operations.

Anyhow, the analysis of the class constructors in Qt is extremely relevant. Thanks again!


Andrei

On 11/3/10 10:00 AM, Max Samukha wrote:
> These notes on copy-constructible Qt types may be useful for the discussion:
>
> 1. 1/5 (approx. 100 classes) of all classes in core, gui, network, webkit, svg and opengl packages define public copy-constructors.
>
> A half of those are in reference-counted COW types (approx. 50 classes). The remaining 50 classes are reference-counted types without COW, types with allocating copy-constructors and types with trivial non-allocating constructors.
>
> Most of the types with allocating copy-constructors I would probably implemented as classes in D. Polymorphic types like QListWidgetItem that provide the copy-constructor only for clone() reimplementation should definitely be classes in D.
>
> The conclusion I tend to draw is that expensive copy-constructors can be avoided in most applications.
>
> 2. Reference counters are atomic using interlocked integer operations.
> There have been a couple of complaints about performance
> [http://www.gotw.ca/publications/optimizations.htm] but those complaints
> seem ungrounded
> [http://labs.qt.nokia.com/2006/10/16/atomic-reference-counting-is-it-worth-it-2].
>
> 3. If a type exposes data by reference, proxy objects are used to avoid unnecessary copying. For example, QString::operator[](int) returns an instance of Q?harRef, not QChar&. The shared data is copied only if the QCharRef instance is modified.
>
> On Sun, Oct 31, 2010 at 5:57 AM, Andrei Alexandrescu <andrei at erdani.com <mailto:andrei at erdani.com>> wrote:
>
>     I am highly interested in the opinion of Phobos contributors in the
>     matter of copy construction (just posted the message below).
>
>     Andrei
>
>     -------- Original Message --------
>     Subject: Re: Ruling out arbitrary cost copy construction?
>     Date: Sat, 30 Oct 2010 22:56:24 -0500
>     From: Andrei Alexandrescu <SeeWebsiteForEmail at erdani.org
>     <mailto:SeeWebsiteForEmail at erdani.org>>
>     Newsgroups: digitalmars.D
>
>     On 10/30/2010 09:40 PM, Michel Fortin wrote:
>
>         On 2010-10-30 20:49:38 -0400, Andrei Alexandrescu
>         <SeeWebsiteForEmail at erdani.org
>         <mailto:SeeWebsiteForEmail at erdani.org>> said:
>
>             On 10/30/10 2:24 CDT, Don wrote:
>
>                 At the moment, I think it's impossible.
>                 Has anyone succesfully implemented refcounting in D? As
>                 long as bug 3516
>                 (Destructor not called on temporaries) remains open, it
>                 doesn't seem to
>                 be possible.
>                 Is that the only blocker, or are there others?
>
>
>             I managed to define and use RefCounted in Phobos. File also uses
>             hand-made reference counting. I think RefCounted is a pretty
>             good
>             abstraction (unless it hits the bug you mentioned.)
>
>
>         I like the idea of RefCounted as a way to automatically make things
>         reference counted.
>
>
>     Unfortunately it's only a semi-automated mechanism.
>
>         But like File and many similar ref-counted structs, it has this race
>         condition (bug 4624) when stored inside the GC heap. Currently,
>         most of
>         Phobos's ref-counted structs are race-free only when they reside
>         on the
>         stack or if your program has only one thread (because the GC doesn't
>         spawn threads if I'm correct).
>
>         It's a little sad that the language doesn't prevent races in
>         destructors
>         (bug 4621).
>
>
>     I hope we're able to solve these implementation issues that can be seen
>     as independent from the decision at hand.
>
>     Walter and I discussed the matter again today and we're on the brink of
>     deciding that cheap copy construction is to be assumed. This simplifies
>     the language and the library a great deal, and makes it perfectly good
>     for 95% of the cases. For a minority of types, code would need to go
>     through extra hoops (e.g. COW, refcounting) to be compliant.
>
>     I'm looking for more feedback from the larger D community. This is a
>     very important decision that marks one of the largest departures from
>     the C++ style. Taking the wrong turn here could alienate many
>     programmers coming from C++.
>
>     So, everybody - this is really the time to speak up or forever be
>     silent.
>
>
>     Andrei
>     _______________________________________________
>     phobos mailing list
>     phobos at puremagic.com <mailto:phobos at puremagic.com>
>     http://lists.puremagic.com/mailman/listinfo/phobos
>
>
>
>
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: results-rayon.txt
URL: <http://lists.puremagic.com/pipermail/phobos/attachments/20101103/409a7955/attachment.txt>
November 03, 2010
Le 2010-11-03 ? 11:00, Max Samukha a ?crit :

> These notes on copy-constructible Qt types may be useful for the discussion:
> 
> 1. 1/5 (approx. 100 classes) of all classes in core, gui, network, webkit, svg and opengl packages define public copy-constructors.
> 
> A half of those are in reference-counted COW types (approx. 50 classes). The remaining 50 classes are reference-counted types without COW, types with allocating copy-constructors and types with trivial non-allocating constructors.
> 
> Most of the types with allocating copy-constructors I would probably implemented as classes in D. Polymorphic types like QListWidgetItem that provide the copy-constructor only for clone() reimplementation should definitely be classes in D.

This prompted me at looking into how reference counting works in Cocoa. Note that all Cocoa objects use reference semantics and are reference counted.

The reference counter is not stored as part of the object. Instead, reference counters are scattered among 8 global hash tables based on bits 8,9,10 of the pointer value. Each table has its own spinlock which is used for synchronization. On embeded system (iOS presumably), there is only one hash table instead of 8. I guess having many tables is worthless if you have only one core.

I haven't done any benchmarking, but it seems the assumption is that each thread will use a different memory region that will hopefully fall in a different hash table and thus under a different spinlock and cacheline, which should reduce contention.

Relevant source code (under the Apple Public Source License):

See function __CFDoExternRefOperation in <http://www.opensource.apple.com/source/CF/CF-550/CFRuntime.c>

Spinlock implementation it uses: <http://google.com/codesearch/p?hl=fr#pFm0LxzAWvs/darwinsource/tarballs/apsl/Libc-391.tar.gz%7Cz8mNFiEo9vA/Libc-391/i386/sys/OSAtomic.s&q=OSSpinLock%20package:darwin>


-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/



November 04, 2010
(2010/11/03 2:47), Michel Fortin wrote:
>
> Le 2010-11-02 ? 13:05, SHOO a ?crit :
>
>>:
>>:
>>:
>>
>> Another viewpoint.
>>
>> Is SealedRange really appropriate?
>>
>> All these are caused by the same problem:
>> - http://ideone.com/x1Zus
>> - http://ideone.com/iM18Q
>> - http://ideone.com/TTin3
>> - http://ideone.com/x4b0o
>>
>> We should consider that we grope the common solution for these problems.
>> It is the method that block the access to reference data of which instance was deleted.
>
> In first and second examples, you're taking the address of a local variable. This is forbidden in @safe code, so I consider this already solved.

I don't think so.
@safe code cannot forbid to bring out the address:
http://ideone.com/rMl5i

@safe only forbid pointer operation: http://ideone.com/8nWRP

So, this problem does not show solution, yet.



> In the third example, you're coping the sealed range into "r", then "ary" goes out of scope and delete the memory block while keeping a sealed range pointing to it. My take is that the implementation of Array is wrong: it should do *something* to either 1) prevent the sealed range from accessing the memory block, or 2) keep the memory block alive as long as a sealed range exists. Note that you can't write such an Array in @safe mode, and not using in @safe mode assumes you know what you're doing when it comes to handling memory. So I don't see that as a problem.

Even if my implementation is wrong, it is a problem that I can write such an implementation. I guess that the fundamental solution is not provided in SealedRange.



> As for fourth example, it uses scope classes, which will be deprecated in D2.

Yeah, this example is a main reason of the deprecation.
But, I think it is not necessary to make it deprecated if there is
another method to avoid this. The method might dispense with SealedRange
and solve problem copy construction cost.

Rather I am strong in interest about this problem.
I think that RAII is a main reason that a constructor and a copy
constructor and a destructor were added to struct.
I suspect that it is the root of all evils that D cannot handle RAII well.



> Also take note that in second and third examples, the struct destructor is dangerous because it calls delete on a GC-allocated array "T[] a". As a general rule, you shouldn't access GC-allocated memory inside a destructor, this includes deleting it. That's because if you ever use this Array struct somewhere on the heap (like in a class), the GC could collect both the struct and the "T[] a" array at the same time, in whatever order, and you would calling the array's destructor twice (once by the GC + once in Array's destructor). But don't worry, you're not alone doing this mistake; I believe there is two instances of this in Phobos.

Ah... OK, you're right. It is not mentioned with the specifications (the point is mentioned by only class destructor specification), but I understand the thought.
November 03, 2010
On Wednesday, November 03, 2010 10:09:03 SHOO wrote:
> (2010/11/03 2:47), Michel Fortin wrote:
> > Le 2010-11-02 ? 13:05, SHOO a ?crit :
> >> Another viewpoint.
> >> 
> >> Is SealedRange really appropriate?
> >> 
> >> All these are caused by the same problem:
> >> - http://ideone.com/x1Zus
> >> - http://ideone.com/iM18Q
> >> - http://ideone.com/TTin3
> >> - http://ideone.com/x4b0o
> >> 
> >> We should consider that we grope the common solution for these problems. It is the method that block the access to reference data of which instance was deleted.
> > 
> > In first and second examples, you're taking the address of a local variable. This is forbidden in @safe code, so I consider this already solved.
> 
> I don't think so.
> @safe code cannot forbid to bring out the address:
> http://ideone.com/rMl5i
> 
> @safe only forbid pointer operation: http://ideone.com/8nWRP

If you're basing that on what the compiler currently complains about, then I'm not sure that that's a very good measurement since @safe, @trusted, and @system still need a fair bit of work, as I understand it, before they're really going to be correct. So, even if you can currently do something in @safe mode, that doesn't mean that you should be able to, and if you can't do something in @safe mode, that doesn't necessarily mean that you're not supposed to be able to either.

> Rather I am strong in interest about this problem.
> I think that RAII is a main reason that a constructor and a copy
> constructor and a destructor were added to struct.
> I suspect that it is the root of all evils that D cannot handle RAII well.

I don't know if it's the root of _all_ evil, but I agree that it's a serious problem. Unfortunately, we'd probably need to change the language so that a struct's init wasn't completely known at compile-time but rather was partially computed at runtime if we want to _really_ solve the problem, and I question that we'll ever get Walter to sign off on that one. However, the lack of default constructors for structs is one of - if not _the_ - largest design flaw in the language IMO.

- Jonathan M Davis
November 03, 2010



----- Original Message ----
> From: SHOO <zan77137 at nifty.com>
> 
> (2010/11/03 2:47), Michel Fortin wrote:
> > In  first and second examples, you're taking the address of a local variable.
>This  is forbidden in @safe code, so I consider this already solved.
> 
> I don't  think so.
> @safe code cannot forbid to bring out the  address:
> http://ideone.com/rMl5i

I believe the plan is for @safe to prevent you from taking the address of any stack data.  But I don't know if that is formalized.  Surely, that code should be disabled by @safe, because the compiler can deduce the variable escapes its scope.

-Steve




November 04, 2010
On Wed, Nov 3, 2010 at 5:20 PM, Andrei Alexandrescu <andrei at erdani.com>wrote:

> Thanks for the hard data! This is very useful.
>
> The article about atomic reference counting is interesting but mostly compares eager copy with atomic reference counting (I attach one of the result files for convenience). We're also interested in the penalty exacted by atomicity of reference counting when compared with unprotected reference counting.


I want to know that too. I intend to do some tests later this week if I have time.


> (The trick of returning a smart proxy is clever but C++ specific.)


If we cannot use that trick in D, then we need an alternative. Otherwise, COW is less attractive.


> Also, if I understand things correctly, the article does not induce contention; it only measures uncontended operations.
>

I think we can ignore the case of contention when more than one thread are modifying multiple instances sharing the same data. Such modifications inevitably lead to allocation of a copy in each contending thread, and creating the copies is a vastly more expensive operation than contended operations on the reference counter.

Cases of contended copy-construction, assignment and destruction are more interesting.


>
> Anyhow, the analysis of the class constructors in Qt is extremely relevant. Thanks again!
>

You are welcome.


> Andrei
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.puremagic.com/pipermail/phobos/attachments/20101104/9043be34/attachment.html>
November 04, 2010
On Wed, Nov 3, 2010 at 6:28 PM, Michel Fortin <michel.fortin at michelf.com>wrote:

> Le 2010-11-03 ? 11:00, Max Samukha a ?crit :
>
> > These notes on copy-constructible Qt types may be useful for the discussion:
> >
> > 1. 1/5 (approx. 100 classes) of all classes in core, gui, network,
> webkit,
> > svg and opengl packages define public copy-constructors.
> >
> > A half of those are in reference-counted COW types (approx. 50 classes).
> The
> > remaining 50 classes are reference-counted types without COW, types with allocating copy-constructors and types with trivial non-allocating constructors.
> >
> > Most of the types with allocating copy-constructors I would probably implemented as classes in D. Polymorphic types like QListWidgetItem that provide the copy-constructor only for clone() reimplementation should definitely be classes in D.
>
> This prompted me at looking into how reference counting works in Cocoa. Note that all Cocoa objects use reference semantics and are reference counted.
>
> The reference counter is not stored as part of the object. Instead, reference counters are scattered among 8 global hash tables based on bits 8,9,10 of the pointer value. Each table has its own spinlock which is used for synchronization. On embeded system (iOS presumably), there is only one hash table instead of 8. I guess having many tables is worthless if you have only one core.
>
> I haven't done any benchmarking, but it seems the assumption is that each thread will use a different memory region that will hopefully fall in a different hash table and thus under a different spinlock and cacheline, which should reduce contention.
>
> Relevant source code (under the Apple Public Source License):
>
> See function __CFDoExternRefOperation in <http://www.opensource.apple.com/source/CF/CF-550/CFRuntime.c>
>
> Spinlock implementation it uses:
> <
> http://google.com/codesearch/p?hl=fr#pFm0LxzAWvs/darwinsource/tarballs/apsl/Libc-391.tar.gz%7Cz8mNFiEo9vA/Libc-391/i386/sys/OSAtomic.s&q=OSSpinLock%20package:darwin
> >
>
>
An interesting approach. I wonder how well it performs compared to traditional reference counters.


>
> --
> Michel Fortin
> michel.fortin at michelf.com
> http://michelf.com/
>
>
>
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.puremagic.com/pipermail/phobos/attachments/20101104/ee94eaab/attachment.html>
November 04, 2010
As for this matter, the direct relations seem to be surely thin in
SealedRange as you say. However, I think that there is relation for the
situation that SealedRange is needed and the lifetime of data closely.
It reduces a use opportunity of SealedRange to make the situation that
cannot access dead data.
I think that copy constructor and move are not regarded as important in
the situation that can expose reference data without worry of to access
the dead data.

--
SHOO


(2010/11/03 3:08), Steve Schveighoffer wrote:
> All of these have nothing to do with sealed ranges.  They have to do with data lifetime.  If you destroy data and then try to access it, you will have problems with or without sealed ranges.  In addition, only one instance you show has a sealed range in it.  The rest either don't have ranges or are not sealed.
>
> Having a sealed range does not necessarily mean that the user cannot shoot themselves in the foot and delete the data while still maintaining references to it.  This problem is mostly solved by having the data or at least the pointer to the data on the heap, destroyed by the GC when no longer needed.
>
> Even then, this does not rule out the user calling clear on the pointer.
>
> SealedRange is not a magic bullet for memory issues, and all these issues exist with or without sealed ranges or expensive copy construction.
>
> What SealedRange *does* do is allow the container to have complete control over the storage for the data.  This means it can:
>
> a) intercept all accesses to the memory, including writes and reads. b) use abstracted allocation schemes.
>
> -Steve
>
>
>
> ----- Original Message ----
>> From: SHOO<zan77137 at nifty.com>
>> Another viewpoint.
>>
>> Is  SealedRange really appropriate?
>>
>> All these are caused by the same  problem:
>> - http://ideone.com/x1Zus
>> - http://ideone.com/iM18Q
>> -  http://ideone.com/TTin3
>> - http://ideone.com/x4b0o
>>
>> We should consider  that we grope the common solution for these problems.
>> It is the method that  block the access to reference data of which
>> instance was  deleted.
>> _______________________________________________
>> phobos mailing  list
>> phobos at puremagic.com
>> http://lists.puremagic.com/mailman/listinfo/phobos
>>

November 04, 2010
(2010/11/04 2:37), Jonathan M Davis wrote:
> On Wednesday, November 03, 2010 10:09:03 SHOO wrote:
>> (2010/11/03 2:47), Michel Fortin wrote:
>>> :
>>> :
>>> :
>>
>> I don't think so.
>> @safe code cannot forbid to bring out the address:
>> http://ideone.com/rMl5i
>>
>> @safe only forbid pointer operation: http://ideone.com/8nWRP
>
> If you're basing that on what the compiler currently complains about, then I'm not sure that that's a very good measurement since @safe, @trusted, and @system still need a fair bit of work, as I understand it, before they're really going to be correct. So, even if you can currently do something in @safe mode, that doesn't mean that you should be able to, and if you can't do something in @safe mode, that doesn't necessarily mean that you're not supposed to be able to either.
>

I cannot judge whether or not @safe should admit the copy of the pointer. However, I think that this argument is significant.

- http://ideone.com/bJJA3
- http://ideone.com/MsP5V
- http://ideone.com/2gwBs
- etc.

Some examples seems to promote troublesomeness.

>> Rather I am strong in interest about this problem.
>> I think that RAII is a main reason that a constructor and a copy
>> constructor and a destructor were added to struct.
>> I suspect that it is the root of all evils that D cannot handle RAII well.
>
> I don't know if it's the root of _all_ evil, but I agree that it's a serious problem. Unfortunately, we'd probably need to change the language so that a struct's init wasn't completely known at compile-time but rather was partially computed at runtime if we want to _really_ solve the problem, and I question that we'll ever get Walter to sign off on that one. However, the lack of default constructors for structs is one of - if not _the_ - largest design flaw in the language IMO.
>
> - Jonathan M Davis

I agree with you that, '_all_' may be exaggerated?:)

Surely, I want the default constructer of struct very much.
Otherwise we should avoid handling an object by a struct. Because there
is the function of that purpose in class.
'final scope class' seems to be an attractive name.

--
SHOO