October 10, 2013
On 10/9/2013 9:45 PM, Manu wrote:
> There are a few problems with mangling the type;

I don't understand that.

> It breaks when you need to interact with libraries.

That's true if the library persists copies of the data. But I think it's doable if the library API is stateless, i.e. 'pure'.

> It's incompatible with struct alignment, and changes the struct size. These are
> very carefully managed properties of structures.

Nobody says there can be only one variant of RefCounted.

> It obscures/complicates generic code.

It seems to not be a problem in C++ with shared_ptr<T>.

> It doesn't deal with circular references, which people keep bringing up as a
> very important problem.

ARC doesn't deal with it automatically, either; it requires the user to insert weak pointers at the right places.

But, if the RefCounted data is actually allocated on the GC heap, an eventual GC sweep will delete them.


> What happens when a library receives a T* arg? Micro managing the ref-count at
> library boundaries sounds like a lot more trouble than manual memory management.

Aside from purity mentioned above, another way to deal with that is to encapsulate uses of a RefCounted data structure so that raw pointers into it are unnecessary.

October 10, 2013
On Thursday, October 10, 2013 08:41:19 Jacob Carlborg wrote:
> On 2013-10-10 06:24, Jonathan M Davis wrote:
> > And given that std.concurrency requires casting to and from shared or immutable in order to pass objects across threads, it seems like most of D's concurrency model requires casting to and/or from shared or immutable. The major exception is structs or classes which are shared or synchronized rather than a normal object which is used as shared, and I suspect that that's done fairly rarely at this point. In fact, it seems like the most common solution is to ignore shared altogether and use __gshared, which is far worse than casting to and from shared IMHO.
> 
> Isn't the whole point of std.concurrency that it should only accept "shared" for reference types? If you want to use std.concurrency, create a "shared" object in the first place?

You might do that if you're creating the object simply to send it across, but it's frequently the case that the object was created well before it was sent across, and it frequently had to have operations done on it other than simply creating it (which wouldn't work if it were shared). So, it often wouldn't make sense for the object being passed to be shared except when being passed. And once it's been passed, it's rarely the case that you want it to be shared. You're usually passing ownership. You're essentially taking a thread-local variable from one thread and making it a thread-local variable on another thread. Unfortunately, the type system does not support the concept of thread ownership (beyond thread-local vs shared), so it's up to the programmer to make sure that no references to the object are kept on the original thread, but there's really no way around that unless you're always creating a new object when you pass it across, which would result in what would usually be an unnecessary copy. So, it becomes like @trusted in that sense.
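To make the ownership handoff concrete, here is a minimal sketch of the pattern described above. The `Job` class and the `worker` and `runJob` functions are made-up names for illustration. The object is passed as a `spawn` argument rather than through `send`, purely to keep the example short. Nothing in the type system checks that the original thread really drops its reference; that part is on the programmer.

```d
import std.concurrency : Tid, receiveOnly, send, spawn, thisTid;

// Hypothetical payload type: built and mutated as ordinary thread-local data.
class Job
{
    int[] data;
    void add(int x) { data ~= x; }
    int total() { int sum; foreach (x; data) sum += x; return sum; }
}

// The worker receives the object typed as shared and immediately casts
// shared away, treating it as its own thread-local object from here on.
void worker(Tid owner, shared Job handedOver)
{
    auto job = cast(Job) handedOver;
    owner.send(job.total());
}

int runJob()
{
    // Created and used as a normal thread-local object first.
    auto job = new Job;
    job.add(1);
    job.add(2);
    job.add(3);

    // Cast to shared only for the handoff.
    spawn(&worker, thisTid, cast(shared) job);

    // By convention only: drop the local reference so that the object is
    // effectively owned by the worker thread. The compiler does not check this.
    job = null;

    return receiveOnly!int();
}
```

The two casts are the only places where shared shows up, which is the point being made: the object is thread-local before and after, and shared is just the transport.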

> > So, it's my impression that being able to consider casting to or from shared as abnormal in code which uses shared is a bit of a pipe dream at this point. The current language design pretty much requires casting when doing much of anything with concurrency.
> 
> There must be a better way to solve this.

I honestly don't think we can solve it a different way without completely redesigning shared. shared is specifically designed such that you have to either cast it away to do anything with it or write all of your code to explicitly work with shared, which is not something that generally makes sense to do unless you're creating a type whose only value is in being shared across threads. Far more frequently, you want to share a type which you would also use normally as a thread-local variable, and that means casting.
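A minimal sketch of what "cast it away to do anything with it" looks like in practice, assuming a made-up `Counter` class whose member functions are not marked shared. The bare `synchronized` block stands in for whatever lock actually protects the object.

```d
// Hypothetical type written as ordinary thread-local code:
// none of its member functions are marked shared.
class Counter
{
    private int count;
    void increment() { ++count; }
    int value() { return count; }
}

int bump(shared Counter sc)
{
    int seen;
    synchronized
    {
        // sc.increment() would not compile here, because increment() is not
        // a shared member function. Casting shared away makes it callable.
        auto tl = cast(Counter) sc;
        tl.increment();
        seen = tl.value();
        // tl must not outlive this block; the programmer has to guarantee it.
    }
    return seen;
}

int bumpTwice()
{
    auto sc = new shared(Counter);
    bump(sc);
    return bump(sc);
}
```

The alternative is to mark `increment` and `value` as shared and write them against shared data, which is exactly the invasive duplication being complained about.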

- Jonathan M Davis
October 10, 2013
On Thursday, October 10, 2013 00:30:55 Walter Bright wrote:
> > It doesn't deal with circular references, which people keep bringing up as a very important problem.
> 
> ARC doesn't deal with it automatically, either; it requires the user to insert weak pointers at the right places.
> 
> But, if the RefCounted data is actually allocated on the GC heap, an eventual GC sweep will delete them.

That may be true, but if you're using RefCounted because you can't afford the
GC, then using the GC heap with them is not an option, because that could
trigger a sweep, which is precisely what you're trying to avoid. More normal
code may be fine with it but not the folks who can't afford the interruption of
stop the world or any of the other costs that come with the GC. So, if
RefCounted (or a similar type) is going to be used without the GC, it's going
to need some type of weak-ref, even if it's just a normal pointer - though as
you've pointed out, that pretty much throws @safety out the window as no GC is
involved. But since you've arguably already done that by using malloc instead
of the GC anyway, I think that it's debatable how much that matters. However,
the GC would allow for more normal code to not worry about circular references
with RefCounted.
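
As a rough illustration of the weak-ref-as-plain-pointer idea, here is a hand-rolled sketch using malloc and a manual count. It is not std.typecons.RefCounted; `Node`, `retain`, `release`, and `liveNodes` are all invented for the example. The first function builds a cycle with a counted back reference and leaks both nodes; the second uses an uncounted raw pointer for the back reference and frees everything.

```d
import core.stdc.stdlib : free, malloc;

struct Node
{
    int refs;      // strong reference count
    Node* child;   // owning, counted reference
    Node* parent;  // back reference; release() never follows it
}

int liveNodes;     // thread-local bookkeeping, for demonstration only

Node* makeNode()
{
    auto n = cast(Node*) malloc(Node.sizeof);
    assert(n !is null);
    *n = Node(1, null, null);   // the caller holds the first reference
    ++liveNodes;
    return n;
}

Node* retain(Node* n) { if (n) ++n.refs; return n; }

void release(Node* n)
{
    if (n is null) return;
    if (--n.refs > 0) return;
    release(n.child);           // drop the owned child first
    --liveNodes;
    free(n);
}

int cycleWithStrongBackRef()
{
    liveNodes = 0;
    auto parent = makeNode();
    parent.child = makeNode();
    parent.child.parent = retain(parent); // counted back reference: a cycle
    release(parent);                      // count goes 2 -> 1, nothing is freed
    return liveNodes;                     // both nodes are leaked (on purpose)
}

int cycleWithWeakBackRef()
{
    liveNodes = 0;
    auto parent = makeNode();
    parent.child = makeNode();
    parent.child.parent = parent;         // plain pointer, not counted
    release(parent);                      // count goes 1 -> 0, both are freed
    return liveNodes;
}
```

The weak version only stays correct as long as nobody follows `parent` after the parent is freed, which is where @safe goes out the window.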

- Jonathan M Davis
October 10, 2013
On 2013-10-10 09:18, Walter Bright wrote:

> 1. Shared data cannot be passed to regular functions.

That I understand.

> 2. Functions that create data structures would have to know in advance
> that they'll be creating a shared object. I'm not so sure this would not
> be an invasive change.

If the function doesn't know it creates shared data it will assume it's not and it won't use any synchronization. Then suddenly someone casts it to "shared" and you're in trouble.

> 3. Immutable data is implicitly shared. But it is not created immutable
> - it is created as mutable data, then set to some state, then cast to
> immutable.

It should be possible to create immutable data in the first place. No cast should be required.
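
One way this can already be approximated, as a sketch: the result of a pure function that takes no mutable references is known to be unique, so current compilers let it convert implicitly to immutable without a cast. `makeSquares` and `squareTable` are hypothetical names.

```d
// A pure function with no mutable-reference parameters. Its result cannot
// alias anything the caller holds, so the compiler treats it as unique.
int[] makeSquares(size_t n) pure
{
    auto a = new int[](n);
    foreach (i, ref x; a)
        x = cast(int)(i * i);
    return a;
}

immutable(int)[] squareTable()
{
    // Built as mutable inside makeSquares, but no cast is needed here:
    // the unique result converts implicitly to immutable.
    immutable(int)[] table = makeSquares(4);
    return table;
}
```

This only covers data that can be built inside such a function; anything assembled step by step in impure code still ends up needing a cast or std.exception.assumeUnique.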

-- 
/Jacob Carlborg
October 10, 2013
On 10/10/2013 09:33 AM, Jonathan M Davis wrote:
> I honestly don't think we can solve it a different way without completely redesigning shared. shared is specifically designed such that you have to either cast it away to do anything with it or write all of your code to explicitly work with shared, which is not something that generally makes sense to do unless you're creating a type whose only value is in being shared across threads. Far more frequently, you want to share a type which you would also use normally as a thread-local variable, and that means casting.
>
> - Jonathan M Davis
+1
October 10, 2013
On 2013-10-10 09:33, Jonathan M Davis wrote:

> You might do that if you're creating the object simply to send it across, but
> it's frequently the case that the object was created well before it was sent
> across, and it frequently had to have operations done on it other than simply
> creating it (which wouldn't work if it were shared). So, it often wouldn't
> make sense for the object being passed to be shared except when being passed.

I guess if you're not creating it as "shared" to begin with, there's no way to tell that the given object is now shared and that no thread-local references are allowed.

> And once it's been passed, it's rarely the case that you want it to be shared.
> You're usually passing ownership. You're essentially taking a thread-local
> variable from one thread and making it a thread-local variable on another
> thread. Unfortunately, the type system does not support the concept of thread
> ownership (beyond thread-local vs shared), so it's up to the programmer to
> make sure that no references to the object are kept on the original thread,
> but there's really no way around that unless you're always creating a new
> object when you pass it across, which would result in what would usually be an
> unnecessary copy. So, it becomes like @trusted in that sense.

It sounds like we need a way to transfer ownership of an object to a different thread.

> I honestly don't think we can solve it a different way without completely
> redesigning shared. shared is specifically designed such that you have to
> either cast it away to do anything with it or write all of your code to
> explicitly work with shared, which is not something that generally makes sense
> to do unless you're creating a type whose only value is in being shared across
> threads. Far more frequently, you want to share a type which you would also
> use normally as a thread-local variable, and that means casting.

I guess it wouldn't be possible to solve it without changing the type system.

-- 
/Jacob Carlborg
October 10, 2013
On 2013-10-10 09:24, Jonathan M Davis wrote:

> Pretty much nothing accepts shared. At best, templated functions accept
> shared. Certainly, shared doesn't work at all with classes and structs unless
> the type is specifically intended to be used as shared, because you have to
> mark all of its member functions shared to be able to call them. And if you
> want to use that class or struct as both shared and unshared, you have to
> duplicate all of its member functions.
>
> That being the case, the only way in general to use a shared object is to
> protect it with a lock, cast it to thread-local (so that it can actually use
> its member functions or be passed to other functions to be used), and then use
> it. e.g.
>
> synchronized
> {
>       auto tl = cast(T)mySharedT;
>       auto result = tl.foo();
>       auto result2 = bar(tl);
> }
>
> Obviously, you then have to make sure that there are no thread-local
> references to the shared object when the lock is released, but without casting
> away shared like that, you can't do much of anything with it. So, similar to
> when you cast away const, it's up to you to guarantee that the code doesn't
> violate the type system's guarantees - i.e. that a thread-local variable is
> not accessed by multiple threads. So, you use a lock of some kind to protect
> the shared variable while it's treated as a thread-local variable in order to
> ensure that that guarantee holds. Like with casting away const or with
> @trusted, there's obviously risk in doing this, but there's really no other
> way to use shared at this point - certainly not without it being incredibly
> invasive to your code and forcing code duplication.

Sounds like we need a way to tell that a parameter is thread local but not allowed to escape a reference to it.

Object foo;

void bar (shared_tls Object o)
{
    foo = o; // Compile error, cannot escape a "shared" thread local
}

void main ()
{
    auto o = new shared(Object);
    synchronized { bar(o); }
}

Both "shared" and thread-local objects can be passed to "shared_tls". If "shared" is passed, it is assumed to be synchronized during the call to "bar".

This will still have the problem of annotating all code with this attribute. Or this needs to be default, which would cause a lot of code breakage.

-- 
/Jacob Carlborg
October 10, 2013
On 2013-10-10 01:21:25 +0000, "deadalnix" <deadalnix@gmail.com> said:

> On Wednesday, 9 October 2013 at 23:37:53 UTC, Michel Fortin wrote:
>> In an ideal world, we'd be able to choose between using a GC or using ARC when building our program. A compiler flag could do the trick. But that becomes messy when libraries (static and dynamic) get involved as they all have to agree on the same codegen to work together. Adding something to mangling that would cause link errors in case of mismatch might be good enough to prevent accidents though.
> 
> ObjC guys used to think that. It turns out it is a really bad idea.

Things were much worse with Objective-C because at the time there was no ARC: reference counting was manual, and supporting both required a lot of manual work. Supporting the GC wasn't always easy either, as the GC only tracked pointers inside of Objective-C objects and on the stack, not in structs on the heap. The GC had an implementation problem for pointers inside static segments, and keeping code working with both a GC and reference counting had many perils.

I think it can be done better in D. We'd basically just be changing the GC algorithm so it uses reference counting. The differences are:

1. unpredictable lifetimes -> predictable lifetime
2. no bother about cyclic references -> need to break them with "weak"

The latter is probably the most problematic, but if someone has leaks because he uses a library missing "weak" annotations, he can still run the GC to collect them while most memory is reclaimed through ARC, or he can fix the problematic library by adding "weak" at the right places.


-- 
Michel Fortin
michel.fortin@michelf.ca
http://michelf.ca

October 10, 2013
On 2013-10-10 06:41:19 +0000, Jacob Carlborg <doob@me.com> said:

> On 2013-10-10 06:24, Jonathan M Davis wrote:
> 
>> So, it's my impression that being able to consider casting to or from shared
>> as abnormal in code which uses shared is a bit of a pipe dream at this point.
>> The current language design pretty much requires casting when doing much of
>> anything with concurrency.
> 
> There must be a better way to solve this.

http://michelf.ca/blog/2012/mutex-synchonization-in-d/

-- 
Michel Fortin
michel.fortin@michelf.ca
http://michelf.ca

October 10, 2013
On Thursday, 10 October 2013 at 04:24:31 UTC, Jonathan M Davis wrote:
>> Also, casting _away_ shared is going to be a very common operation due to
>> how shared works.

It is yet another use case for the `scope` storage class. Locking a `shared` variable via a mutex should return the same variable but cast to non-shared `scope` (somewhere inside the locking library function). It is then safe to pass it to functions accepting scope parameters, as the reference cannot escape.
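
A rough sketch of the shape this could take, with the caveat that it hands the unlocked view to a callback instead of returning it, and that `Account` and `withLocked` are invented names rather than an existing library API. Whether `scope` is actually enforced depends on the compiler and its settings; here it mostly documents the intent.

```d
// Hypothetical type written as plain thread-local code.
class Account
{
    private int balance;
    void deposit(int amount) { balance += amount; }
    int current() { return balance; }
}

// The cast away from shared lives in exactly one place, inside the locking
// helper. The callback sees a scope, non-shared reference, which signals
// that it must not be stored anywhere that outlives the call.
void withLocked(shared Account acct, scope void delegate(scope Account) use)
{
    synchronized
    {
        use(cast(Account) acct);
    }
}

int depositUnderLock()
{
    auto acct = new shared(Account);
    int seen;
    withLocked(acct, (scope Account a) {
        a.deposit(5);
        seen = a.current();
    });
    return seen;
}
```

User code never writes the cast itself, so the unsafe part is confined to the helper.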