shared arrays (page 3) - D Programming Language Discussion Forum

January 14, 2010

[dmd-concurrency] shared arrays

Posted by Andrei Alexandrescu
in reply to Sean Kelly

Sean Kelly wrote:
> D allows non utf-8 data in a char[], so I don't see any reason for it to try and guarantee any meaningful result from such an operation.

This is exactly what I wanted to say in my previous post, but much clearer.

Andrei

January 14, 2010

[dmd-concurrency] shared arrays

Posted by Andrei Alexandrescu
in reply to Sean Kelly

Sean Kelly wrote:
> Earlier, I had been thinking it might be nice to have this though:
> 
> shared(char)[] a, b;
> 
> synchronized( lock( a, b ) ) {
>     // some fancy algorithm on a and b
> }
> 
> Basically, use the hashtable of mutexes discussed earlier to allow users to obtain locks on a set of N arrays in a safe manner (because expecting them to do it manually will generally result in deadlock). This makes what's happening explicit and allows the whole mess to be handled in library code.  In theory, this same approach could work for any reference type.  The optimization issue would be making gc_query() not need to obtain the GC lock to return a valid result (this may be safe already, I haven't spent the time to figure it out).

Defining a hashtable of mutexes would be an interesting idea, but it has many aspects we need to think about. One extreme approach would be to always lock a region of memory, not a particular class object. In that case the mutex pointer stored in each class object could disappear, and you'd essentially just say "lock this guy/these guys" and the system would take care of locking the pages in which those guys reside. It might also reduce deadlock risks, but increase contention because of coarser granularity.
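
For illustration, the "lock a set of arrays safely" idea could be sketched in library code roughly as follows. This is only a minimal sketch under assumptions: the names `stripes`, `stripesFor`, and `withLocks` are hypothetical, the table is a fixed pool of 64 mutexes indexed by a shifted address, and the raw slice address stands in for the owning block that the proposal would look up with gc_query(). Acquiring the mutexes in ascending index order is what avoids the lock-ordering deadlock.

```d
import core.sync.mutex : Mutex;
import std.algorithm.iteration : uniq;
import std.algorithm.sorting : sort;
import std.array : array;

// Hypothetical fixed pool of mutexes ("stripes"); not a druntime facility.
enum size_t stripeCount = 64;
__gshared Mutex[stripeCount] stripes;

shared static this()
{
    foreach (ref m; stripes)
        m = new Mutex;
}

/// Maps each address to a stripe index and returns the indices sorted
/// and de-duplicated, so every caller locks in the same global order.
size_t[] stripesFor(size_t[] addresses...)
{
    size_t[] idx;
    foreach (addr; addresses)
        idx ~= (addr >> 4) % stripeCount;
    sort(idx);
    return idx.uniq.array;
}

/// Locks the given stripes in ascending order, runs `work`, then unlocks
/// in reverse order, even if `work` throws.
void withLocks(size_t[] idx, scope void delegate() work)
{
    foreach (i; idx)
        stripes[i].lock();
    scope (exit)
    {
        foreach_reverse (i; idx)
            stripes[i].unlock();
    }
    work();
}

void main()
{
    shared(char)[] a = new shared(char)[4];
    shared(char)[] b = new shared(char)[4];
    int runs;

    auto idx = stripesFor(cast(size_t) a.ptr, cast(size_t) b.ptr);
    withLocks(idx, () {
        // some fancy algorithm on a and b
        ++runs;
    });
    assert(runs == 1);
}
```

Two slices of the same array can land on different stripes in this sketch, which is exactly why a real version would need to resolve each slice to its allocation first. Coarser stripes trade that problem for more contention.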

Anyhow, I think it would be quite a rush to go with this for D2.

Andrei

January 14, 2010

[dmd-concurrency] shared arrays

Posted by Sean Kelly
in reply to Andrei Alexandrescu

On Jan 14, 2010, at 1:05 PM, Andrei Alexandrescu wrote:

> Sean Kelly wrote:
>> Earlier, I had been thinking it might be nice to have this though:
>> shared(char)[] a, b;
>> synchronized( lock( a, b ) ) {
>>     // some fancy algorithm on a and b
>> }
>> Basically, use the hashtable of mutexes discussed earlier to allow
>> users to obtain locks on a set of N arrays in a safe manner (because
>> expecting them to do it manually will generally result in deadlock).
>> This makes what's happening explicit and allows the whole mess to be
>> handled in library code.  In theory, this same approach could work
>> for any reference type.  The optimization issue would be making
>> gc_query() not need to obtain the GC lock to return a valid result
>> (this may be safe already, I haven't spent the time to figure it
>> out).
> 
> Defining a hashtable of mutexes would be an interesting idea, but it has many aspects we need to think about. One extreme approach would be to always lock a region of memory, not a particular class object. In that case the mutex pointer stored in each class object could disappear, and you'd essentially just say "lock this guy/these guys" and the system would take care of locking the pages in which those guys reside. It might also reduce deadlock risks, but increase contention because of coarser granularity.
> 
> Anyhow, I think it would be quite a rush to go with this for D2.

Yeah definitely.  Since this would be a library thing there's no rush.

January 14, 2010

[dmd-concurrency] shared arrays

Posted by Steve Schveighoffer
in reply to Sean Kelly

----- Original Message ----
> From: Sean Kelly <sean at invisibleduck.org>
> 
> Hm... I hadn't planned to add a wait() call for stuff exposed by spawn, but I suppose it's a logical extension of watch(tid) (ie. "please notify me when this thread exits"), which we were going to provide.  Adding a wait() wrapper for this would be trivial.

I wasn't requesting that, I just didn't know the planned API :)  It was pseudocode.

> On Jan 14, 2010, at 12:20 PM, Steve Schveighoffer wrote:
> > The problem is, the compiler doesn't know with an array of items whether it's the array that must be atomic or the elements that must be atomic, or some other relationship (such as a group of elements are related as in utf-8 code points). It should either refuse copying any data, or allow copying any data.  Making a decision based on assumptions of the array semantic meaning doesn't seem right to me.
> 
> D allows non utf-8 data in a char[], so I don't see any reason for it to try and guarantee any meaningful result from such an operation.

In all cases I've seen, D tries to treat char[]'s as string types.  I don't see why shared(char)[] types should be any different.  In the context of strings, the whole thing is the type, not the individual characters.  To treat it differently is to ask for trouble.  I've also seen many mentions of "don't use char for anything but utf-8 data.  Use ubyte for everything else."

I don't have a problem with the compiler allowing copying of strings, even if it results in weird data.  But it should be consistent and allow copying of other array types too (or any size struct).  In other words, the compiler should not make assumptions -- it admits that it's unsure what you want and based on those grounds either a) it allows you to copy anything (I guess I trust you, you are the programmer) or b) it refuses to copy anything (I assume you don't know what you are doing unless you use casting or have locked something).

> Earlier, I had been thinking it might be nice to have this though:
> 
> shared(char)[] a, b;
> 
> synchronized( lock( a, b ) ) {
>     // some fancy algorithm on a and b
> }
>
> Basically, use the hashtable of mutexes discussed earlier to allow users to obtain locks on a set of N arrays in a safe manner (because expecting them to do it manually will generally result in deadlock).  This makes what's happening explicit and allows the whole mess to be handled in library code.  In theory, this same approach could work for any reference type.  The optimization issue would be making gc_query() not need to obtain the GC lock to return a valid result (this may be safe already, I haven't spent the time to figure it out).

If the compiler only allowed operations on arrays if there was a lock held, I would be OK with that too.  That goes nicely with the "refuses to copy anything" idea.

The tough part then becomes, how do you associate a lock with data.  That is, let's say I have something like:

class A
{
    long[] myArray;
    synchronized void foo() {...}
}

shared(long)[] globalarray;

void main()
{
   auto a = new shared(A);
   synchronized(a) globalarray = a.myArray;
   ...
}

Now, can two threads simultaneously access the data used by a.myArray and globalarray?  If so, then does that mean that A.foo must also acquire the 'magic array' lock on myArray?  It looks to me like you would have to, which kind of defeats the purpose of having to lock a to access myArray.

-Steve

January 14, 2010

[dmd-concurrency] shared arrays

Posted by Steve Schveighoffer
in reply to Andrei Alexandrescu

----- Original Message ----
> From: Andrei Alexandrescu <andrei at erdani.com>
> To: Discuss the concurrency model(s) for D <dmd-concurrency at puremagic.com>
> Sent: Thu, January 14, 2010 3:57:12 PM
> Subject: Re: [dmd-concurrency] shared arrays
> 
> I understand. The thing is, shared means "can be changed by another thread at any time" so I think the behavior you mention is 100% expectable.
> 
> If you want to do away with shared, you need to add synchronization.
> 
> I agree that the existence of multibyte characters constitutes an argument in favor of disabling certain operations. But as long as I can mess up a string in one thread, I will be able to mess it up in several.

I just don't understand the double standard.

Let's forget about long, and talk about arrays or any struct larger than a single int (but whose individual pieces are at most int size).  Why is it ok to copy a partially-modified string which is considered a whole unit, but it is not OK to copy a partially modified struct with two ints in it?  It's not like you can tear the ints, setting each is atomic.  It looks like the same issue to me, but the compiler refuses to compile one and happily accepts the other.

More inconsistency: I can currently cast a long[] into a ubyte[], so it would follow that I can cast a shared(long)[] into a shared(ubyte)[], no?  Then the compiler lets me copy the ubyte but not the long?
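
The cast in question can be sketched as follows; the variable names are made up for illustration. Reinterpreting a slice keeps the pointer and rescales the length, so the same memory is reachable either as `long` elements or as `ubyte` elements, shared or not.

```d
void main()
{
    // Unshared case: the array cast reinterprets the same memory.
    long[] values = [1, 2];
    ubyte[] bytes = cast(ubyte[]) values;
    assert(bytes.length == values.length * long.sizeof);
    assert(cast(void*) bytes.ptr is cast(void*) values.ptr);

    // Shared case: the same reinterpretation, with shared elements.
    shared(long)[] sharedValues = new shared(long)[2];
    shared(ubyte)[] sharedBytes = cast(shared(ubyte)[]) sharedValues;
    assert(sharedBytes.length == 16);
}
```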

Any decision would be fine with me if the compiler was consistent for any aggregate type regardless of size -- either refuse to do it or allow it.

-Steve

January 14, 2010

[dmd-concurrency] shared arrays

Posted by Kevin Bealer
in reply to Sean Kelly

I'd suggest setting the high bit of the length instead.  If I have this scenario:

char[] foo = "abcd".dup;
char[] bar = foo[1..2];

Then bar is an array with an odd address.
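
A quick check of this, as a sketch in which only the assertions are new: a slice that starts one `char` further in has a pointer that differs in its lowest bit, so that bit cannot double as a lock flag for `char` arrays.

```d
void main()
{
    char[] foo = "abcd".dup;
    char[] bar = foo[1 .. 2];

    // bar points one byte past the start of foo's data.
    assert(bar.ptr is foo.ptr + 1);

    // The two pointers differ in bit 0, so one of them is odd
    // even though no lock flag was ever set.
    assert((((cast(size_t) foo.ptr) ^ (cast(size_t) bar.ptr)) & 1) == 1);
}
```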

Kevin
On Thu, Jan 14, 2010 at 10:16 AM, Sean Kelly <sean at invisibleduck.org> wrote:

> On Jan 14, 2010, at 4:10 AM, Steve Schveighoffer wrote:
>
> > Having implemented the array append patch to fix stomping, and reading the dmd-concurrency debate, I realized that what I implemented is no good for shared arrays.
> >
> > In fact, I wonder how shared arrays can support any array operations
>
> I sent an email about this a while back.  In short, if we're going to allow
> array ops on shared arrays at all I think they'll have to use atomic ops to
> "lock" the array for the length of the update.  Basically, set the 1 bit of
> the ptr field to indicate the array is locked.  This gets tricky when more
> than one shared array is involved because the lock has to be acquired on
> each, and because different ops may try to lock arrays in different orders,
> if a spinlock acquire times out the code will have to release all the locks
> it's acquired, wait some random interval, and try again.  Pretty complicated
> stuff if the goal is just to support shared array ops.

January 14, 2010

[dmd-concurrency] shared arrays

Posted by Andrei Alexandrescu
in reply to Steve Schveighoffer

Steve Schveighoffer wrote:
> ----- Original Message ----
>> From: Andrei Alexandrescu <andrei at erdani.com>
>> To: Discuss the concurrency model(s) for D <dmd-concurrency at puremagic.com>
>> Sent: Thu, January 14, 2010 3:57:12 PM
>> Subject: Re: [dmd-concurrency] shared arrays
>> 
>> I understand. The thing is, shared means "can be changed by another thread at any time" so I think the behavior you mention is 100% expectable.
>> 
>> If you want to do away with shared, you need to add synchronization.
>> 
>> I agree that the existence of multibyte characters constitutes an argument in favor of disabling certain operations. But as long as I can mess up a string in one thread, I will be able to mess it up in several.
> 
> I just don't understand the double standard.
> 
> Let's forget about long, and talk about arrays or any struct larger than a single int (but whose individual pieces are at most int size). Why is it ok to copy a partially-modified string which is considered a whole unit, but it is not OK to copy a partially modified struct with two ints in it?  It's not like you can tear the ints, setting each is atomic.  It looks like the same issue to me, but the compiler refuses to compile one and happily accepts the other.

You are right. Let's first consider:

struct A { int a, b; }

shared A x, y;
x = y;          // yay or nay?

I think this should be allowed. This is because A makes no attempt to protect its data by e.g. defining a copy constructor. Anyone could modify a, b in any way - so why not different threads?

Case two. Consider:

struct B {
     int a, b;
     this(this) {
        ...
     }
}

In this case code could still modify a and b arbitrarily, but the type signals that it wants to do something special upon copying. I'd argue that shared Bs should not be copyable. To become copyable, they should do this:

struct B {
     int a, b;
     this(this) {
        ...
     }
     this(this) shared {
        ...
     }
}

(I'm not worried about code duplication as the body is different more often than not.)

Finally, consider this case:

struct C {
     int a;
     long b;
     ...
}

I'd argue that shared objects of this type shouldn't be copyable. What if it adds a shared constructor?

struct C {
     int a;
     long b;
     this(this) shared { ... }
}

Unfortunately, this(this) is called after the bits have been copied, so if there was any tearing, it already happened. I don't know how we can solve this.
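
The ordering can be seen with a small unshared example. This is a sketch: the `Tracked` type and its `postblitRuns` field are hypothetical and exist purely for illustration. By the time the `this(this)` body runs, the fields of the new object already hold the copied bits, so the postblit can only fix up a copy, not guard how it was made.

```d
struct Tracked
{
    int a, b;
    int postblitRuns; // hypothetical counter, for demonstration only

    this(this)
    {
        // a and b were already blitted from the source at this point.
        ++postblitRuns;
    }
}

void main()
{
    auto x = Tracked(1, 2);
    auto y = x; // bitwise copy first, then y's postblit

    assert(y.a == 1 && y.b == 2);
    assert(y.postblitRuns == 1);
    assert(x.postblitRuns == 0); // the source is untouched
}
```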

Andrei

January 14, 2010

[dmd-concurrency] shared arrays

Posted by Michel Fortin
in reply to Andrei Alexandrescu

On 2010-01-14 at 18:07, Andrei Alexandrescu wrote:

> struct A { int a, b; }
> 
> shared A x, y;
> x = y;          // yay or nay?
> 
> I think this should be allowed. This is because A makes no attempt to protect its data by e.g. defining a copy constructor. Anyone could modify a, b in any way - so why not different threads?

That's fine *if* you can read and write atomically. So you need 64-bit atomic reads and writes, otherwise it shouldn't compile.


> Finally, consider this case:
> 
> struct C {
>    int a;
>    long b;
>    ...
> }
> 
> I'd argue that shared objects of this type shouldn't be copyable.

I agree.


> What if it adds a shared constructor?
> 
> struct C {
>    int a;
>    long b;
>    this(this) shared { ... }
> }
> 
> Unfortunately, this(this) is called after the bits have been copied, so if there was any tearing, it already happened. I don't know how we can solve this.

Where is the copy-constructor when we need one? :-)

Now, the really tricky question is should this one be copyable when shared:

	struct D {
		immutable string a;
		long b;
	}

Since 'a' is immutable, you don't need to copy it atomically, only 'b' requires an atomic copy. So copying the struct "atomically" is possible by copying 'a' normally and 'b' atomically. Now, is that going to be supported?
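
Done by hand, such a copy might look like the sketch below. `snapshot` is a hypothetical helper, not something the language or druntime provides, and it assumes the target supports 64-bit atomic loads through `core.atomic`.

```d
import core.atomic : atomicLoad;

struct D
{
    immutable string a;
    long b;
}

/// Hypothetical helper: copies a shared D field by field.
D snapshot(ref shared D src)
{
    // 'a' is immutable, so a plain read cannot race with a writer.
    // 'b' is mutable shared data, so it is read with an atomic load.
    return D(src.a, atomicLoad(src.b));
}

void main()
{
    shared D original = D("id", 42);
    D copy = snapshot(original);
    assert(copy.a == "id");
    assert(copy.b == 42);
}
```

This only works because each mutable field fits in one atomic load; two mutable fields would again need a lock or a wider atomic operation to be copied as a unit.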


-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/

January 14, 2010

[dmd-concurrency] shared arrays

Posted by Andrei Alexandrescu
in reply to Michel Fortin

Michel Fortin wrote:
> On 2010-01-14 at 18:07, Andrei Alexandrescu wrote:
>> Unfortunately, this(this) is called after the bits have been copied, so if there was any tearing, it already happened. I don't know how we can solve this.
> 
> Where is the copy-constructor when we need one? :-)

I realized that fortunately that's not a problem: during memcpy, the target is not yet shared. Whew. So it all holds water.

this(this) shared { ... }

should work.

> Now, the really tricky question is should this one be copyable when shared:
> 
> struct D { immutable string a; long b; }
> 
> Since 'a' is immutable, you don't need to copy it atomically, only 'b' requires an atomic copy. So copying the struct "atomically" is possible by copying 'a' normally and 'b' atomically. Now, is that going to be supported?

But long isn't atomically copyable. Did you mean int?


Andrei

January 14, 2010

[dmd-concurrency] shared arrays

Posted by Michel Fortin
in reply to Andrei Alexandrescu

On 2010-01-14 at 22:01, Andrei Alexandrescu wrote:

> Michel Fortin wrote:
>> On 2010-01-14 at 18:07, Andrei Alexandrescu wrote:
>>> Unfortunately, this(this) is called after the bits have been copied, so if there was any tearing, it already happened. I don't know how we can solve this.
>> Where is the copy-constructor when we need one? :-)
> 
> I realized that fortunately that's not a problem: during memcpy, the target is not yet shared. Whew. So it all holds water.
> 
> this(this) shared { ... }
> 
> should work.

But the source is shared. Couldn't it be updated while memcpy does its work, creating an incoherent state? Something like: memcpy copies half of it, context switch, another thread updates the first and last bytes, context switch, memcpy finishes its work. Result: the last byte of the copy doesn't match the first byte.


>> Now, the really tricky question is should this one be copyable when
>> shared:
>> struct D { immutable string a; long b; }
>> Since 'a' is immutable, you don't need to copy it atomically, only
>> 'b' requires an atomic copy. So copying the struct "atomically" is
>> possible by copying 'a' normally and 'b' atomically. Now, is that
>> going to be supported?
> 
> But long isn't atomically copyable. Did you mean int?

Well, that depends on the architecture. I meant a type which is atomically copyable, yes.


-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/


Copyright © 1999-2021 by the D Language Foundation