September 15, 2014
On 9/15/14, 8:07 AM, bearophile wrote:
> Andrei Alexandrescu:
>
>> Again, it's become obvious that a category of users will simply refuse
>> to use a GC, either for the right or the wrong reasons. We must make D
>> eminently usable for them.
>
> Is adding reference counted strings to D going to add a significant
> amount of complexity for the programmers?

Time will tell, but I don't think so.

> As usual your judgement is better than mine, but surely the increase in
> complexity of D language and its usage must be considered in this
> rcstring discussion. So far I have not seen this point discussed enough
> in this thread.

Increasing the standard library with good artifacts is important. So is making it more generic by (in this case) expanding the kinds of strings it supports.

> D is currently quite complex, so I prefer enhancements that simplify the
> code (like tuples), or that make it safer (this mostly means type system
> improvements, like provably correct tracking of memory areas and
> lifetimes, or stricter types for array indexes, or better means to
> detect errors at compile-times with more compile-time introspection for
> function/ctor arguments), or features that have a limited scope and
> don't increase the general code complexity much (like the partial type
> inference patch created by Kenji).

I think most people exclude the library when discussing the complexity of a language.


Andrei


September 15, 2014
On Monday, 15 September 2014 at 02:26:19 UTC, Andrei Alexandrescu wrote:
> http://dpaste.dzfl.pl/817283c163f5

The test on line 267 fails on a 32-bit build:

rcstring.d(267): Error: cannot implicitly convert expression (38430716820228232L) of type long to uint
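
For anyone wondering why this only shows up on 32-bit: the literal needs more than 32 bits, so it converts implicitly to a 64-bit integer but not to uint. Below is a minimal sketch of that, assuming the target type on line 267 is size_t (which is uint on a 32-bit build). The names `wide` and `fitsInSizeT` are made up for illustration; the actual expression in the paste is not shown in this thread.

```d
// Hypothetical reduction: only the literal is taken from the error message.
enum long wide = 38430716820228232L;

// The value is larger than anything a 32-bit unsigned integer can hold.
static assert(wide > uint.max);

// Implicit conversion works for a 64-bit target type ...
static assert( __traits(compiles, { ulong n = wide; }));
// ... and is rejected for uint, which is what size_t is on a 32-bit build.
static assert(!__traits(compiles, { uint n = wide; }));

/// Returns true when size_t on the current target can hold the value.
bool fitsInSizeT()
{
    return wide <= size_t.max;
}
```

A test that depends on such a constant could be guarded with `static if (size_t.sizeof == 8)`, but whether that is the right fix depends on what line 267 is meant to check.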

Hosting it as a Gist on GitHub[1] might be an idea, as then the same link will stay relevant after the code is updated, and people can post line comments. It doesn't support building and running the code online, but dpaste.dzfl.pl's old FE version (2.065) doesn't support the code anyway.

[1] https://gist.github.com/
September 15, 2014

On 15.09.2014 07:49, Andrei Alexandrescu wrote:
>> I haven't found a single lock, is single threading by design or is
>> thread-safety on your todo?
>
> Currently shared strings are not addressed.

Please also consider usage with const and immutable:

* both will disallow changing the reference count without casting

* immutable means implicitly shared between threads, so you'll have to make RCString thread-safe even if shared isn't explicitly supported.
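
To make the first point concrete, here is a rough sketch of how transitive const reaches the count. `Handle` and `bumpThroughConst` are hypothetical names for illustration, not the RCString layout from the paste, which may well avoid the problem by other means.

```d
/// Hypothetical handle: every copy points at the same count.
struct Handle
{
    uint* count;
}

// Through a mutable handle the count can be bumped in place.
static assert( __traits(compiles, { Handle h; ++*h.count; }));

// const and immutable are transitive, so the pointee becomes read-only too.
static assert(!__traits(compiles, { const Handle h; ++*h.count; }));
static assert(!__traits(compiles, { immutable Handle h; ++*h.count; }));

/// Bumps the count through a const handle by casting const away.
/// Assumption: the count itself was allocated as mutable data. Doing this
/// to genuinely immutable data is undefined behaviour in D.
void bumpThroughConst(ref const Handle h)
{
    ++*cast(uint*) h.count;
}
```

The cast is the "without casting" escape hatch mentioned above; it compiles, but the type system no longer vouches for it.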

Unfortunately, I've yet to see an efficient thread-safe implementation of reference counting (i.e. without locks).

VC used to have reference counted strings, but moved away from it. Maybe it doesn't pull its own weight in the face of the small-string-optimization.
September 15, 2014

On 15.09.2014 07:49, Andrei Alexandrescu wrote:
>> I would assume RCString should be faster than string, so could you
>> provide a benchmark of the two.
>
> Good idea. It likely won't be faster for the most part (unless it uses
> realloc and realloc is a lot faster than GC.realloc).

Do you have any benchmarks to share? Last time I measured, the GC is quite a bit faster with manual memory management than the C runtime on Win32 and on par on Win64.
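
A starting point for such a measurement could look like the sketch below. It only times a fixed-size allocate/free round trip, which says little about realistic string workloads, and an optimizing build may elide part of the C path. The 64-byte block size and the names `viaGC`, `viaC` and `compareAllocators` are arbitrary choices for illustration.

```d
import core.memory : GC;
import core.stdc.stdlib : free, malloc;
import core.time : Duration;
import std.datetime.stopwatch : benchmark;

enum size_t blockSize = 64; // arbitrary size picked for illustration

/// One allocate/free round trip through the D garbage collector,
/// used as a manual allocator.
void viaGC()
{
    auto p = GC.malloc(blockSize);
    GC.free(p);
}

/// One allocate/free round trip through the C runtime.
void viaC()
{
    auto p = malloc(blockSize);
    free(p);
}

/// Runs both round trips `rounds` times.
/// Index 0 is the total time for the GC, index 1 for the C runtime.
Duration[2] compareAllocators(uint rounds)
{
    return benchmark!(viaGC, viaC)(rounds);
}
```

Calling `compareAllocators(1_000_000)` and printing both durations gives a first impression; varying the block size and mixing in realloc calls would be closer to what a string type actually does.
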
September 15, 2014
On 9/15/14, 8:58 AM, Rainer Schuetze wrote:
>
>
> On 15.09.2014 07:49, Andrei Alexandrescu wrote:
>>> I haven't found a single lock, is single threading by design or is
>>> thread-safety on your todo?
>>
>> Currently shared strings are not addressed.
>
> Please also consider usage with const and immutable:
>
> * both will disallow changing the reference count without casting

I think these work fine. If not, please send examples.

> * immutable means implicitly shared between threads, so you'll have to
> make RCString thread-safe even if shared isn't explicitly supported.

Hmmm, good point. That's a bug. Immutable postblit and dtors should use atomic ops.

> Unfortunately, I've yet to see an efficient thread-safe implementation
> of reference counting (i.e. without locks).

No locks needed, just interlocked ++/--.
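
A minimal sketch of what interlocked ++/-- could look like with core.atomic. `Counted`, `retain` and `release` are hypothetical names, not taken from the paste, and the sketch only covers the counter updates themselves. It does not settle whether that is enough for a fully thread-safe string.

```d
import core.atomic : atomicLoad, atomicOp;

/// Hypothetical control block holding only the reference count.
struct Counted
{
    shared uint refs = 1; // the creator is the first owner
}

/// Adds one owner atomically and returns the new count.
uint retain(ref Counted c)
{
    return atomicOp!"+="(c.refs, 1);
}

/// Drops one owner atomically.
/// Returns true when the caller was the last owner and should free the payload.
bool release(ref Counted c)
{
    return atomicOp!"-="(c.refs, 1) == 0;
}

/// Reads the current count, mainly useful for debugging and tests.
uint owners(ref Counted c)
{
    return atomicLoad(c.refs);
}
```

In this scheme a postblit would call `retain` and a destructor would call `release`, freeing the buffer when it returns true.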

> VC used to have reference counted strings, but moved away from it. Maybe
> it doesn't pull its own weight in the face of the
> small-string-optimization.

The reason C++ strings moved away from refcounting is not strongly related to interlocked refcounting being slow.


Andrei

September 15, 2014
On Monday, 15 September 2014 at 14:44:53 UTC, Andrei Alexandrescu wrote:
> On 9/15/14, 2:50 AM, monarch_dodra wrote:
>>
>> - No way to "GC-dup" the RCString. Consider giving RCString "dup"/"idup" members, for when you really just need to revert to pure un-collected GC.
>
> Nice. But then I'm thinking, wouldn't people think .dup produces another RCString?
>
I certainly would.  If I wanted a GC string from an RCString, I'd probably reach for std.conv for clarity's sake.  e.g.

RCString foo = "banana!";
string bar = to!string(foo);
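
Since RCString itself only lives in the paste, here is a self-contained sketch of the same idea with a stand-in type. `StandInString` is hypothetical, and the assumption is that RCString would expose a `toString` (or something equivalent) that `to!string` can pick up.

```d
import std.conv : to;

/// Hypothetical stand-in for RCString; it does no reference counting.
struct StandInString
{
    private immutable(char)[] payload;

    this(string s)
    {
        payload = s;
    }

    /// Hands out a fresh GC-allocated copy of the contents.
    string toString() const
    {
        return payload.idup;
    }
}

/// Converts the stand-in to an ordinary GC string via std.conv.
string toGCString(const StandInString s)
{
    return to!string(s);
}
```

With that in place, `StandInString foo = "banana!"; string bar = to!string(foo);` reads exactly like the two lines above, and nobody has to guess what .dup returns.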

-Wyatt
September 15, 2014

On 15.09.2014 09:22, Andrei Alexandrescu wrote:
> On 9/15/14, 8:58 AM, Rainer Schuetze wrote:
>>
>>
>> On 15.09.2014 07:49, Andrei Alexandrescu wrote:
>>>> I haven't found a single lock, is single threading by design or is
>>>> thread-safety on your todo?
>>>
>>> Currently shared strings are not addressed.
>>
>> Please also consider usage with const and immutable:
>>
>> * both will disallow changing the reference count without casting
>
> I think these work fine. If not, please send examples.
>

Hmm, seems fine when I try it. It feels like a bug in the type system, though: when you make a copy of const(RCXString) to some RCXString, it removes the const from the referenced RCBuffer struct mbuf!?

>> * immutable means implicitly shared between threads, so you'll have to
>> make RCString thread-safe even if shared isn't explicitly supported.
>
> Hmmm, good point. That's a bug. Immutable postblit and dtors should use
> atomic ops.
>
>> Unfortunately, I've yet to see an efficient thread-safe implementation
>> of reference counting (i.e. without locks).
>
> No locks needed, just interlocked ++/--.

Eager reference counting with atomics is not thread safe. See the discussions about automatic reference counting.

>
>> VC used to have reference counted strings, but moved away from it. Maybe
>> it doesn't pull its own weight in the face of the
>> small-string-optimization.
>
> The reason C++ strings moved away from refcounting is not strongly
> related to interlocked refcounting being slow.

Yes, they did not care for thread safety back then. IIRC they had no small-buffer-optimization. With that, reference counting only kicks in with large strings.

If we need a lock on these for proper reference counting, it's still better than making a copy including a global lock by the allocator.

Rainer
September 15, 2014
On 9/15/14, 9:56 AM, Rainer Schuetze wrote:
> On 15.09.2014 09:22, Andrei Alexandrescu wrote:
>> On 9/15/14, 8:58 AM, Rainer Schuetze wrote:
>>>
>>>
>>> On 15.09.2014 07:49, Andrei Alexandrescu wrote:
>>>>> I haven't found a single lock, is single threading by design or is
>>>>> thread-safety on your todo?
>>>>
>>>> Currently shared strings are not addressed.
>>>
>>> Please also consider usage with const and immutable:
>>>
>>> * both will disallow changing the reference count without casting
>>
>> I think these work fine. If not, please send examples.
>>
>
> Hmm, seems fine when I try it. It feels like a bug in the type system,
> though: when you make a copy of const(RCXString) to some RCXString, it
> removes the const from the referenced RCBuffer struct mbuf!?

The conversion relies on pure constructors. As I noted in the opening post, I also think there's something too lax in there. If you have a reduced example that shows a type system breakage without cast, please submit.

>>> * immutable means implicitly shared between threads, so you'll have to
>>> make RCString thread-safe even if shared isn't explicitly supported.
>>
>> Hmmm, good point. That's a bug. Immutable postblit and dtors should use
>> atomic ops.
>>
>>> Unfortunately, I've yet to see an efficient thread-safe implementation
>>> of reference counting (i.e. without locks).
>>
>> No locks needed, just interlocked ++/--.
>
> Eager reference counting with atomics is not thread safe. See the
> discussions about automatic reference counting.

I'm not sure about that discussion, but there's good evidence from C++ that refcounting with atomics works. What was the smoking gun?


Andrei

September 15, 2014
On Monday, 15 September 2014 at 17:23:32 UTC, Andrei Alexandrescu
wrote:
> I'm not sure about that discussion, but there's good evidence from C++ that refcounting with atomics works. What was the smoking gun?

http://www.gotw.ca/gotw/045.htm
September 15, 2014
>> I'm not sure about that discussion, but there's good evidence from C++ that refcounting with atomics works. What was the smoking gun?
>
> http://www.gotw.ca/gotw/045.htm

I don't see how that link answers Andrei's question. He just compares different methods of implementing COW.