September 16, 2014
On Tuesday, 16 September 2014 at 04:19:45 UTC, Ola Fosheim Grostad wrote:
> On Monday, 15 September 2014 at 23:41:27 UTC, Rainer Schuetze wrote:
>> Thanks for the link, I didn't know a solution exists. I'll have to study the "differential" approach to see if it works for our case and at what cost it comes...
>
> Modern x86 has 128 bit CAS instruction too: lock cmpxchg16b

... which I really need to add to core.atomic.
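
For anyone wondering why a double-width CAS matters here: it lets a pointer and a counter be updated as one atomic unit, which is what the "differential" approach mentioned above builds on. The sketch below is C++ rather than D (this thread does not show what the core.atomic API would look like), and it packs a hypothetical 32-bit slot index plus a 32-bit reader count into one 64-bit word so it runs with a plain `std::atomic<std::uint64_t>`. With `cmpxchg16b` the same idea would presumably carry a full 64-bit pointer and a 64-bit count. `Packed`, `acquire` and `swapSlot` are made-up names.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical packed word: low 32 bits = slot index (stand-in for a pointer),
// high 32 bits = count of readers that have grabbed that slot.
struct Packed {
    std::uint32_t slot;
    std::uint32_t readers;
};

inline std::uint64_t pack(Packed p) {
    return (static_cast<std::uint64_t>(p.readers) << 32) | p.slot;
}

inline Packed unpack(std::uint64_t w) {
    return Packed{static_cast<std::uint32_t>(w), static_cast<std::uint32_t>(w >> 32)};
}

// Read the current slot and register as a reader in one atomic step, so no
// other thread can retire the slot between the read and the increment.
inline Packed acquire(std::atomic<std::uint64_t>& word) {
    std::uint64_t old = word.load();
    for (;;) {
        Packed next = unpack(old);
        ++next.readers;
        // On failure, compare_exchange_weak reloads `old` and we retry.
        if (word.compare_exchange_weak(old, pack(next))) {
            return next;
        }
    }
}

// Publish a new slot with a fresh reader count. The returned value tells the
// writer how many readers took the old slot and still have to release it.
inline Packed swapSlot(std::atomic<std::uint64_t>& word, std::uint32_t newSlot) {
    return unpack(word.exchange(pack(Packed{newSlot, 0})));
}
```

The point of the design is that the count lives next to the pointer instead of inside the pointee, so a reader never has to touch memory that might already be freed just to announce itself.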
September 16, 2014
> No, and it needn't. The article is not that good. In C++, if a thread must increment a reference counter while it's going to zero due to another thread, that's 100% a programming error, not a concurrency error. That's a well-known and well-studied problem. As an aside, searching the net for differential reference counting yields pretty much only this article.
>
>
> Andrei

Alright, sounds sensible enough. It does seem like you would have to write some crappy code to trigger it.

September 16, 2014
On 9/16/14, 7:22 AM, Sean Kelly wrote:
> On Tuesday, 16 September 2014 at 04:19:45 UTC, Ola Fosheim Grostad wrote:
>> On Monday, 15 September 2014 at 23:41:27 UTC, Rainer Schuetze wrote:
>>> Thanks for the link, I didn't know a solution exists. I'll have to
>>> study the "differential" approach to see if it works for our case and
>>> at what cost it comes...
>>
>> Modern x86 has 128 bit CAS instruction too: lock cmpxchg16b
>
> ... which I really need to add to core.atomic.

Yes please. -- Andrei

September 16, 2014
On 9/15/14, 4:49 PM, Rainer Schuetze wrote:
>
>
> On 15.09.2014 10:24, Andrei Alexandrescu wrote:
>>>
>>> Hmm, seems fine when I try it. It feels like a bug in the type system,
>>> though: when you make a copy of const(RCXString) to some RCXString, it
>>> removes the const from the referenced RCBuffer struct mbuf!?
>>
>> The conversion relies on pure constructors. As I noted in the opening
>> post, I also think there's something too lax in there. If you have a
>> reduced example that shows a type system breakage without cast, please
>> submit.
>
> Here's an example:
>
> module module2;
>
> struct S
> {
>      union
>      {
>          immutable(char)* iptr;
>          char* ptr;
>      }
> }
>
> void main()
> {
>      auto s = immutable(S)("hi".ptr);
>      S t = s;
>      t.ptr[0] = 'A';
> }
>
> It seems the union is hiding the fact that there are mutable references.
> Only the first field is verified when copying the struct. Is this by
> design? (typeof(s.ptr) is "immutable(char*)")

Not sure whether that's a bug or feature :o). In fact I'm not even kidding. The "it's a bug" view is obvious. The "it's a feature" view goes by the reasoning: if you're using a union, it means you plan to do gnarly things with the type system anyway, so the compiler may as well tread carefully around you.

Through a rather interesting coincidence, I was talking to Walter during the weekend about the idiom:

union
{
    immutable T data;
    T mdata;
}

which I found useful for things like incrementing the reference counter for non-mutable data. I was discussing how it would be cool if the compiler recognized the construct and did something interesting about it. It seems it already does.
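
For comparison only: C++ has a rough counterpart to this idiom in the `mutable` keyword, which lets a counter change even when the object is otherwise const. The sketch below is not D and says nothing about what the D compiler does with the union. `Counted`, `retain` and `release` are hypothetical names, and the counter is deliberately not thread-safe.

```cpp
#include <cstddef>

// Hypothetical intrusive ref-counted payload. The payload is logically
// constant; only the bookkeeping field is allowed to change.
struct Counted {
    int payload;
    mutable std::size_t refs;  // may be modified through a const reference

    explicit Counted(int p) : payload(p), refs(0) {}
};

// Adds one reference, even on a const object.
inline void retain(const Counted& c) {
    ++c.refs;
}

// Drops one reference; returns true when the last reference is gone.
inline bool release(const Counted& c) {
    return --c.refs == 0;
}
```

C++ const is a much weaker promise than D's immutable, so this only illustrates the shape of the trick, not its safety.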


Andrei

September 16, 2014

On 16.09.2014 00:44, Andrei Alexandrescu wrote:
> On 9/15/14, 10:22 PM, Rainer Schuetze wrote:
>>
>>
>> On 15.09.2014 21:49, Andrei Alexandrescu wrote:
>>> On 9/15/14, 12:43 PM, po wrote:
>>>>
>>>>> As I understand the issue it works if you make sure to transfer
>>>>> ownership explicitly before the other thread gains access?
>>>>>
>>>>> Maybe this is more clear:
>>>>>
>>>>> http://www.1024cores.net/home/lock-free-algorithms/object-life-time-management/differential-reference-counting
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>   Ah, I think I follow.
>>>>
>>>> So in C++ terms:
>>>>   It basically requires either a global shared_ptr, or that you passed
>>>> one around by reference between threads. And that you then killed it in
>>>> one thread at the exact moment the control block was read in another
>>>> thread. That blog post discusses a solution, I wonder if that is
>>>> implemented in C++'s shared_ptr?
>>>
>>> No, and it needn't. The article is not that good. In C++, if a thread
>>> must increment a reference counter while it's going to zero due to
>>> another thread, that's 100% a programming error, not a concurrency
>>> error. That's a well-known and well-studied problem. As an aside,
>>> searching the net for differential reference counting yields pretty much
>>> only this article.

Here is a link with a discussion, links and code:

https://groups.google.com/forum/#!topic/comp.programming.threads/6mXgQEiAOW8

It seems there were multiple patents claiming invention of that technique.

>>
>> Huuh? So you must not read a reference to a ref-counted object that
>> might get changed in another thread?
>
> I didn't say that.
>
>> Maybe you mean destruction of the
>> shared pointer?
>
> I meant: by the time the smart pointer got to the thread, its reference
> count has increased already.
>

This works if you use message passing. The issue exists for "shared(shared_ptr!T)". It might be bad style, but that is a convention not enforced by the language.
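
To make the message-passing case concrete, here is a small C++ sketch (the thread is already arguing in C++ terms). The hypothetical `Mailbox` stores a copy of the `shared_ptr`, so the sender bumps the count before any receiver can see the pointer, and no thread ever reads a `shared_ptr` variable that another thread may overwrite.

```cpp
#include <memory>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical mailbox: ownership travels as shared_ptr copies, never as a
// reference to a shared_ptr variable owned by another thread.
template <typename T>
class Mailbox {
    std::mutex lock_;
    std::queue<std::shared_ptr<T>> items_;

public:
    // Taking the argument by value copies it on the sender's side, so the
    // reference count is already incremented when the message is queued.
    void send(std::shared_ptr<T> msg) {
        std::lock_guard<std::mutex> guard(lock_);
        items_.push(std::move(msg));
    }

    // Returns an empty pointer if nothing is queued.
    std::shared_ptr<T> receive() {
        std::lock_guard<std::mutex> guard(lock_);
        if (items_.empty()) {
            return nullptr;
        }
        std::shared_ptr<T> msg = std::move(items_.front());
        items_.pop();
        return msg;
    }
};
```

The sender can drop its own pointer right after `send` and the object stays alive, because the queued copy holds its own reference.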

Incidentally, Herb Sutter used "shared_ptr<T>" as a means to implement lock-free linked lists in his talk at CppCon. To avoid issues, the list head has to be "atomic<shared_ptr<T>>", which currently needs a lock to do an assignment for similar reasons. ;-o He said there might be ways around that...
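
A rough sketch of that kind of list, using the C++11 free functions `std::atomic_load` and `std::atomic_compare_exchange_weak` on a `shared_ptr` head (deprecated since C++20 in favour of `std::atomic<std::shared_ptr<T>>`). This is not the code from the talk, and `SharedStack` is a hypothetical name. As far as I know, the common standard library implementations back these functions with an internal lock.

```cpp
#include <atomic>
#include <memory>
#include <utility>

// Hypothetical stack whose head is only ever touched through the atomic
// shared_ptr free functions.
template <typename T>
class SharedStack {
    struct Node {
        T value;
        std::shared_ptr<Node> next;  // never modified once the node is published
    };
    std::shared_ptr<Node> head_;

public:
    void push(T v) {
        auto n = std::make_shared<Node>(Node{std::move(v), nullptr});
        n->next = std::atomic_load(&head_);
        // On failure the CAS stores the current head into n->next; retry.
        while (!std::atomic_compare_exchange_weak(&head_, &n->next, n)) {
        }
    }

    // Copies the top value into `out`; returns false if the stack is empty.
    bool pop(T& out) {
        auto old = std::atomic_load(&head_);
        // `old` keeps the node alive, so reading old->next is safe even if
        // another thread pops the same node concurrently.
        while (old && !std::atomic_compare_exchange_weak(&head_, &old, old->next)) {
        }
        if (!old) {
            return false;
        }
        out = old->value;  // copy, not move: other threads may still hold the node
        return true;
    }
};
```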

>> Please note that the scenario is also described by Richard Jones in his
>> 2nd edition of the "Handbook of Garbage Collection" (see algorithm 18.2
>> "Eager reference counting with CompareAndSwap is broken").
>
> I agree such a problem may occur in code generated automatically under
> the wraps for high-level languages, but not with shared_ptr (or COM
> objects etc).

I agree it is worse if the mechanism is hidden by the system, pretending you are dealing with a single pointer. I'm not yet ready to accept it doesn't exist elsewhere.

Coming back to RCString, immutable(RCString) does not have this problem, because it must not be modified by any thread. Working with shared(RCString) isn't supported without a lot of overloads, so you'll have to synchronize externally and cast away shared.

September 17, 2014
On Monday, 15 September 2014 at 02:26:19 UTC, Andrei Alexandrescu wrote:
> Walter, Brad, myself, and a couple of others have had a couple of quite exciting ideas regarding code that is configurable to use the GC or alternate resource management strategies. One thing that became obvious to us is we need to have a reference counted string in the standard library. That would be usable with applications that want to benefit from comfortable string manipulation whilst using classic reference counting for memory management. I'll get into more detail about the mechanisms that would allow the stdlib to provide functionality for both GC strings and RC strings; for now let's say that we hope and aim for swapping between these with ease. We hope that at one point people would be able to change one line of code, rebuild, and get either GC or RC automatically (for Phobos and their own code).

Ironically, strings have probably been the least of my GC-related issues with D so far - hard to evaluate the applicability of this proposal because of that. What are typical use cases for such a solution? (not questioning its importance, just being curious)
September 17, 2014
On 9/17/14, 9:30 AM, Dicebot wrote:
> Ironically, strings have probably been the least of my GC-related issues
> with D so far - hard to evaluate the applicability of this proposal
> because of that. What are typical use cases for such a solution? (not
> questioning its importance, just being curious)

Simplest is "I want to use D without a GC and suddenly the string support has fallen down to bear claws and silex stones."

RCString should be a transparent (or at least near-transparent) replacement for string in GC-less environments.


Andrei

September 17, 2014
On Wednesday, 17 September 2014 at 16:32:41 UTC, Andrei Alexandrescu wrote:
> On 9/17/14, 9:30 AM, Dicebot wrote:
>> Ironically, strings have probably been the least of my GC-related issues
>> with D so far - hard to evaluate the applicability of this proposal
>> because of that. What are typical use cases for such a solution? (not
>> questioning its importance, just being curious)
>
> Simplest is "I want to use D without a GC and suddenly the string support has fallen down to bear claws and silex stones."
>
> RCString should be a transparent (or at least near-transparent) replacement for string in GC-less environments.
>
>
> Andrei

I think the biggest gc=(partially?)off customers are game makers:

http://forum.dlang.org/thread/k27bh7$t7f$1@digitalmars.com

(check especially the bottom of the 6th page)

Random quote:
"I created a reference counted array which is as close to the native D
array as currently possible (compiler bugs, type system issues, etc).
also in core.refcounted. It however does not replace the default string
or array type in all cases because it would lead to reference counting
in unnecessary places. The focus is to get only reference counting where
absolutely necessary. I'm still using the standard string type as a
"only valid for current scope" kind of string."

And my fav:
"- You most likely won't like the way I implemented reference counting"

I hope Benjamin Thaut can share his viewpoint on the topic if he is still around.

Piotrek
September 19, 2014
On Wednesday, 17 September 2014 at 16:32:41 UTC, Andrei Alexandrescu wrote:
> On 9/17/14, 9:30 AM, Dicebot wrote:
>> Ironically, strings have probably been the least of my GC-related issues
>> with D so far - hard to evaluate the applicability of this proposal
>> because of that. What are typical use cases for such a solution? (not
>> questioning its importance, just being curious)
>
> Simplest is "I want to use D without a GC and suddenly the string support has fallen down to bear claws and silex stones."
>
> RCString should be a transparent (or at least near-transparent) replacement for string in GC-less environments.

Well, this is exactly what I don't understand. The strings we have don't have any strong connection to the GC (apart from concatenation, which can be verified by @nogc), being just slices of some external buffer. That buffer can be malloc'ed or stack allocated; that doesn't really affect most string processing algorithms unless they try to do some re-allocation of their own.

I agree that the pipeline approach does not work that well for complex programs in general, but strings seem to be the best match for it: either you want read-only access or a pipeline; everything else feels inefficient as the amount of write operations gets out of control. Every single attempt to do something clever with shared CoW strings in C++ that I have met was a total failure.

That is why I wonder - what kind of applications really need the rcstring as opposed to some generic rcarray?
September 19, 2014
On 9/19/14, 3:32 AM, Dicebot wrote:
> On Wednesday, 17 September 2014 at 16:32:41 UTC, Andrei Alexandrescu wrote:
>> On 9/17/14, 9:30 AM, Dicebot wrote:
>>> Ironically, strings have probably been the least of my GC-related issues
>>> with D so far - hard to evaluate the applicability of this proposal
>>> because of that. What are typical use cases for such a solution? (not
>>> questioning its importance, just being curious)
>>
>> Simplest is "I want to use D without a GC and suddenly the string
>> support has fallen down to bear claws and silex stones."
>>
>> RCString should be a transparent (or at least near-transparent)
>> replacement for string in GC-less environments.
>
> Well, this is exactly what I don't understand. The strings we have don't
> have any strong connection to the GC (apart from concatenation, which can
> be verified by @nogc), being just slices of some external buffer. That
> buffer can be malloc'ed or stack allocated; that doesn't really affect
> most string processing algorithms unless they try to do some
> re-allocation of their own.

It does affect management, i.e. you don't know when to free the buffer if slices are unaccounted for. So the design of slices is affected as much as that of the buffer.
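
One way to picture slices that are accounted for, sketched in C++ since RCString's actual layout is not shown in this thread: every slice carries a counted handle to the whole buffer, so the buffer is freed only when the last slice goes away. `RcSlice` and its members are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical counted slice: a window into a shared, immutable buffer.
class RcSlice {
    std::shared_ptr<const std::string> buf_;  // counted owner of the whole buffer
    std::size_t off_;
    std::size_t len_;

    RcSlice(std::shared_ptr<const std::string> b, std::size_t off, std::size_t len)
        : buf_(std::move(b)), off_(off), len_(len) {}

public:
    explicit RcSlice(std::string s)
        : buf_(std::make_shared<std::string>(std::move(s))), off_(0), len_(buf_->size()) {}

    // A sub-slice shares the buffer and bumps its count; no bytes are copied.
    // Out-of-range bounds are clamped to the current slice.
    RcSlice slice(std::size_t begin, std::size_t end) const {
        end = std::min(end, len_);
        begin = std::min(begin, end);
        return RcSlice(buf_, off_ + begin, end - begin);
    }

    // Materializes the slice as an independent string.
    std::string str() const { return buf_->substr(off_, len_); }

    // Number of slices currently keeping the buffer alive.
    long owners() const { return buf_.use_count(); }
};
```

The cost is visible in the layout: each slice is a pointer to the count plus offset and length, instead of the bare pointer and length of a built-in slice.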

> I agree that the pipeline approach does not work that well for complex
> programs in general, but strings seem to be the best match for it: either
> you want read-only access or a pipeline; everything else feels inefficient
> as the amount of write operations gets out of control. Every single
> attempt to do something clever with shared CoW strings in C++ that I have
> met was a total failure.

What were the issues?

> That is why I wonder - what kind of applications really need the
> rcstring as opposed to some generic rcarray?

I started with rcstring because (a) it's easier to lift off the ground - no worries about construction/destruction of elements etc. and (b) it's frequent enough to warrant some good testing. Of course there'll be an rcarray!T as well.


Andrei