June 06, 2012
should pure functions accept/deal with shared data?
An interesting situation: the current compiler will happily compile pure
functions that accept shared data.

I believed that when we relaxed the purity rules, shared data should be
taboo for pure functions, even weak-pure ones.  Note that at least at the
time, Don agreed with me: http://forum.dlang.org/post/i7d60m$2smf$1@digitalmars.com

Now, technically, there's nothing really *horrible* about this; you can't
actually have truly shared data inside a strong-pure function.  Any data
marked 'shared' there will not really be shared, because a strong-pure
function cannot receive any shared arguments.

So if you were to call a weak-pure function that takes shared parameters
from a strong-pure function, you would simply be wasting cycles locking or
issuing memory barriers on data that is not truly shared.  I don't see a
compelling reason for weak-pure functions to accept shared data explicitly.

*Except* that template functions which use IFTI (implicit function
template instantiation) have good reason to be markable as pure.

For example:

void inc(T)(ref T i) pure
{
   ++i;
}

Now we have a template function that we know will affect only i, and the
compiler enforces that.

But what happens here?

shared int x;

void main()
{
   x.inc();
}

here, T == shared int.

One solution (if shared isn't allowed on pure functions) is to not mark
inc pure and let purity be inferred.  But then we lose the contract that
has the compiler help us enforce purity.

I'll also point out that inc isn't a valid function for data that is  
actually shared: ++i is not atomic.  So disallowing shared actually helps  
us in this regard, by refusing to compile a function that would be  
dangerous when used on shared data.

The compiler *currently* however, will simply compile this just fine.

I'm strongly leaning towards this being a bug that needs to be fixed in
the compiler.

Some background of why this got brought up:  
https://github.com/D-Programming-Language/druntime/pull/147

Opinions?

-Steve
June 06, 2012
Re: should pure functions accept/deal with shared data?
On 06-06-2012 23:39, Steven Schveighoffer wrote:
> An interesting situation, the current compiler happily will compile pure
> functions that accept shared data.
>
> I believed when we relaxed purity rules, shared data should be taboo for
> pure functions, even weak-pure ones. Note that at least at the time, Don
> agreed with me: http://forum.dlang.org/post/i7d60m$2smf$1@digitalmars.com
>
> Now, technically, there's nothing really *horrible* about this, I mean
> you can't really have truly shared data inside a strong-pure function.
> Any data that's marked as 'shared' will not be shared because a
> strong-pure function cannot receive any shared data.
>
> So if you then were to call a weak-pure function that had shared
> parameters from a strong-pure function, you simply would be wasting
> cycles locking or using a memory-barrier on data that is not truly
> shared. I don't really see a compelling reason to have weak-pure
> functions accept shared data explicitly.
>
> *Except* that template functions which use IFTI have good reason to be
> able to be marked pure.
>
> For example:
>
> void inc(T)(ref T i) pure
> {
> ++i;
> }
>
> Now, we have a template function that we know only will affect i, and
> the compiler enforces that.
>
> But what happens here?
>
> shared int x;
>
> void main()
> {
> x.inc();
> }
>
> here, T == shared int.
>
> One solution (if shared isn't allowed on pure functions) is, don't mark
> inc pure, let it be inferred. But then we are losing the contract to
> have the compiler help us enforce purity.
>
> I'll also point out that inc isn't a valid function for data that is
> actually shared: ++i is not atomic. So disallowing shared actually helps
> us in this regard, by refusing to compile a function that would be
> dangerous when used on shared data.

Man, shared is such a mess.

(I'm going to slightly hijack a branch of your thread because I think we 
need to address the below concerns before we can make this decision 
properly.)

We need to be crystal clear on what we're talking about here. Usually, 
people refer to shared as being supposed to insert memory barriers. 
Others call operations on shared data atomic.

(And of course, neither is actually implemented in any compiler, and I 
doubt they ever will be.)

A memory barrier is what the x86 sfence, lfence, and mfence instructions 
represent. They simply make various useful guarantees about ordering of 
loads and stores. Nothing else.

Atomic operations are what the lock prefix is used for, for example the 
lock add operation, lock cmpxchg, etc. These operate on the most recent 
value at whatever memory location is being operated on, i.e. caches are 
circumvented.

Memory barriers and atomic operations are not the same thing, and we 
should avoid conflating them. Yes, they can be used together to write 
low-level, lock-free data structures, but the use of one does not 
include the other automatically.

(At this point, I probably don't need to point out how x86-biased and 
unportable shared is.....)

So, my question to the community is: What should shared *really* mean?

I don't think having shared imply memory barriers is going to be terribly 
useful to anyone. In fact, I don't know how the compiler would even 
determine where to insert memory barriers efficiently. And *actually*, I 
think memory barriers are really not what people mean at *all* when they 
refer to shared's effect on code generation. I think what people *really* 
want is atomic operations.

Steven, in your particular case, I don't entirely agree. The operation 
can be made atomic quite trivially by implementing inc() like so (for the 
shared int case):

void inc(ref shared int i) pure nothrow
{
    // just pretend the compiler emitted this
    asm
    {
        mov EDX, i;              // load the address of i
        lock;                    // make the next instruction atomic
        inc dword ptr [EDX];     // atomically increment *i
    }
}

But I may be misunderstanding you. Of course, it gets a little more 
complex if you use the result of the ++ operation afterwards, but it's 
still not impossible to do atomically. What can *not* be done is doing 
the increment and loading the result in one purely atomic instruction 
(and I suspect this is what you may have been referring to). It's worth 
pointing out that most atomic operations can be implemented with a spin 
lock (which is exactly what core.atomic does for most binary 
operations), so while it cannot be done with an x86 instruction, it can 
be achieved through such a mechanism, and most real world atomic APIs do 
this (see InterlockedIncrement in Windows for example).

Further, if shared is going to be useful at all, stuff like this *has* 
to be atomic, IMO.

I'm still of the opinion that bringing atomicity and memory barriers 
into the type system is a horrible can of worms that we should never 
have opened, but now shared is there and we need to make up our minds 
already.

>
> The compiler *currently* however, will simply compile this just fine.
>
> I'm strongly leaning towards this being a bug, and needs to be fixed in
> the compiler.
>
> Some background of why this got brought up:
> https://github.com/D-Programming-Language/druntime/pull/147
>
> Opinions?
>
> -Steve

-- 
Alex Rønne Petersen
alex@lycus.org
http://lycus.org
June 07, 2012
Re: should pure functions accept/deal with shared data?
On 6/6/12 6:01 PM, Alex Rønne Petersen wrote:
> (And of course, neither is actually implemented in any compiler, and I
> doubt they ever will be.)

Why do you doubt shared semantics will be implemented?

Andrei
June 07, 2012
Re: should pure functions accept/deal with shared data?
On 6/6/12 6:01 PM, Alex Rønne Petersen wrote:
> (At this point, I probably don't need to point out how x86-biased and
> unportable shared is.....)

I confess I'll need that spelled out. How is shared biased towards x86 
and nonportable?

Thanks,

Andrei
June 07, 2012
Re: should pure functions accept/deal with shared data?
On 07-06-2012 03:11, Andrei Alexandrescu wrote:
> On 6/6/12 6:01 PM, Alex Rønne Petersen wrote:
>> (At this point, I probably don't need to point out how x86-biased and
>> unportable shared is.....)
>
> I confess I'll need that spelled out. How is shared biased towards x86
> and nonportable?
>
> Thanks,
>
> Andrei

The issue lies in its assumption that the architecture being targeted 
supports atomic operations and/or memory barriers at all. Some 
architectures simply don't support them; others do, but not for certain 
data sizes, such as 64-bit integers. x86 is probably the architecture 
with the best support for low-level memory control as far as atomicity 
and memory barriers go.

The problem is that shared is supposed to guarantee that operations on 
shared data *always* obey whatever atomicity/memory barrier rules we 
end up defining for it (obviously we don't want generated code to have 
different semantics across architectures due to subtle issues like the 
lack of certain operations in the ISA). Right now, based on what I've 
read in the NG and on mailing lists, people seem to assume that shared 
will provide full-blown x86-level atomicity and/or memory barriers. 
Providing these features on e.g. ARM is a pipe dream at best (for 
instance, ARM has no atomic load for 64-bit values).

All this being said, shared could probably be implemented with plain old 
locks on these architectures if correctness were the only goal. But, from 
a more pragmatic point of view, this would completely butcher performance 
and add potential for deadlocks and all the other issues associated with 
thread synchronization in general. We really shouldn't have such a core 
feature of the language fall back to a dirty hack like this on 
low-end/embedded architectures (where performance of this kind of stuff 
is absolutely critical), IMO.

-- 
Alex Rønne Petersen
alex@lycus.org
http://lycus.org
June 07, 2012
Re: should pure functions accept/deal with shared data?
On 07-06-2012 03:07, Andrei Alexandrescu wrote:
> On 6/6/12 6:01 PM, Alex Rønne Petersen wrote:
>> (And of course, neither is actually implemented in any compiler, and I
>> doubt they ever will be.)
>
> Why do you doubt shared semantics will be implemented?
>
> Andrei
>

I think there are two fundamental issues making implementation difficult 
and unlikely to happen:

1) The x86 bias (I replied to your other post wrt this).
2) The overall complexity of generating correct code for shared.

If we ignore the portability issues that I pointed out in my other 
reply, point (1) is irrelevant. I'm fairly certain the shared semantics 
that people expect can be implemented just fine at ISA level on x86 
without dirty hacks like locks. But if we do care about portability 
(which we ***really*** should - ARM and PowerPC, for example, are 
becoming increasingly important!), then we need to reconsider shared 
very carefully.

The thing about (2) is that literally every operation on shared data has 
to be special-cased in the compiler. This adds a crazy amount of 
complexity, since there are basically two code paths for every single 
part of the code generation phase: the unshared path and the shared 
path. This is mostly caused by the fact that shared is transitive and 
can be applied to virtually any type. But even if we ignore that 
complexity, we have the problem of certain operations that cannot be 
atomic (even though they may look like it):

* Would you expect an array indexing operation (where the array slice is 
shared) to index the array atomically? Would you expect the read of the 
value at the calculated memory location to be atomic?
* Would you expect a slicing operation to be atomic? (Slicing something 
involves reading two words of memory which cannot be done atomically 
even on x86.)
* Would you expect 'in' to be atomic? (It can only really, kinda-sorta 
be if you use locks inside the AA implementation...)

etc.

-- 
Alex Rønne Petersen
alex@lycus.org
http://lycus.org
June 07, 2012
Re: should pure functions accept/deal with shared data?
Funny, I always imagined 'shared' was just a type constructor 
like 'const'... meant to help you check the correctness of your 
code.

Not that it would actually /do/ something for you at runtime...
June 07, 2012
Re: should pure functions accept/deal with shared data?
On 6/6/12 8:19 PM, Alex Rønne Petersen wrote:
> On 07-06-2012 03:11, Andrei Alexandrescu wrote:
>> On 6/6/12 6:01 PM, Alex Rønne Petersen wrote:
>>> (At this point, I probably don't need to point out how x86-biased and
>>> unportable shared is.....)
>>
>> I confess I'll need that spelled out. How is shared biased towards x86
>> and nonportable?
>>
>> Thanks,
>>
>> Andrei
>
> The issue lies in its assumption that the architecture being targeted
> supports atomic operations and/or memory barriers at all. Some
> architectures plain don't support these, others do, but for certain data
> sizes like 64-bit ints, they don't, etc. x86 is probably the
> architecture that has the best support for low-level memory control as
> far as atomicity and memory barriers go.

Actually x86 is one of the more forgiving architectures (most code works 
even when written without barriers). Indeed we assume the target 
architecture supports double-word atomic load.

> The problem is that shared is supposed to guarantee that operations on
> shared data *always* obeys whatever atomicity/memory barrier rules we
> end up defining for it (obviously we don't want generated code to have
> different semantics across architectures due to subtle issues like the
> lack of certain operations in the ISA). Right now, based on what I've
> read in the NG and on mailing lists, people seem to assume that shared
> will provide full-blown x86-level atomicity and/or memory barriers.
> Providing these features on e.g. ARM is a pipe dream at best (for
> instance, ARM has no atomic load for 64-bit values).

http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html mentions that 
there is a way to implement atomic load for 64-bit values.

> All this being said, shared could probably be implemented with plain old
> locks on these architectures if correctness is the only goal. But, from
> a more pragmatic point of view, this would completely butcher
> performance and adds potential for deadlocks, and all other issues
> associated with thread synchronization in general. We really shouldn't
> have such a core feature of the language fall back to a dirty hack like
> this on low-end/embedded architectures (where performance of this kind
> of stuff is absolutely critical), IMO.

That's how C++'s atomic<T> does things, by the way. But I sympathize 
with your viewpoint that there should be no hidden locks. We could 
define shared to refuse compilation on odd machines, and THEN provide an 
atomic template with the expected performance of a lock.


Andrei
June 07, 2012
Re: should pure functions accept/deal with shared data?
On 6/6/12 8:32 PM, Alex Rønne Petersen wrote:
> On 07-06-2012 03:07, Andrei Alexandrescu wrote:
>> On 6/6/12 6:01 PM, Alex Rønne Petersen wrote:
>>> (And of course, neither is actually implemented in any compiler, and I
>>> doubt they ever will be.)
>>
>> Why do you doubt shared semantics will be implemented?
>>
>> Andrei
>>
>
> I think there are two fundamental issues making implementation difficult
> and unlikely to happen:
>
> 1) The x86 bias (I replied to your other post wrt this).
> 2) The overall complexity of generating correct code for shared.
>
> If we ignore the portability issues that I pointed out in my other
> reply, point (1) is irrelevant. I'm fairly certain the shared semantics
> that people expect can be implemented just fine at ISA level on x86
> without dirty hacks like locks. But if we do care about portability
> (which we ***really*** should - ARM and PowerPC, for example, are
> becoming increasingly important!), then we need to reconsider shared
> very carefully.

Agreed. I don't think (1) is irrelevant but I don't see it an impossible 
obstacle.

> The thing about (2) is that literally every operation on shared data has
> to be special-cased in the compiler.

Yes, like for volatile (no code motion etc) plus barriers.

> This adds a crazy amount of
> complexity, since there are basically two code paths for every single
> part of the code generation phase: the unshared path and the shared
> path. This is mostly caused by the fact that shared is transitive and
> can be applied to virtually any type. But even if we ignore that
> complexity, we have the problem of certain operations that cannot be
> atomic (even though they may look like it):
>
> * Would you expect an array indexing operation (where the array slice is
> shared) to index the array atomically? Would you expect the read of the
> value at the calculated memory location to be atomic?
> * Would you expect a slicing operation to be atomic? (Slicing something
> involves reading two words of memory which cannot be done atomically
> even on x86.)
> * Would you expect 'in' to be atomic? (It can only really, kinda-sorta
> be if you use locks inside the AA implementation...)

Operations with shared data are intentionally very limited. Indexing, 
slicing, and "in" should not compile for shared arrays and associative 
arrays.


Andrei
June 07, 2012
Re: should pure functions accept/deal with shared data?
On 07-06-2012 04:00, Andrei Alexandrescu wrote:
> On 6/6/12 8:32 PM, Alex Rønne Petersen wrote:
>> On 07-06-2012 03:07, Andrei Alexandrescu wrote:
>>> On 6/6/12 6:01 PM, Alex Rønne Petersen wrote:
>>>> (And of course, neither is actually implemented in any compiler, and I
>>>> doubt they ever will be.)
>>>
>>> Why do you doubt shared semantics will be implemented?
>>>
>>> Andrei
>>>
>>
>> I think there are two fundamental issues making implementation difficult
>> and unlikely to happen:
>>
>> 1) The x86 bias (I replied to your other post wrt this).
>> 2) The overall complexity of generating correct code for shared.
>>
>> If we ignore the portability issues that I pointed out in my other
>> reply, point (1) is irrelevant. I'm fairly certain the shared semantics
>> that people expect can be implemented just fine at ISA level on x86
>> without dirty hacks like locks. But if we do care about portability
>> (which we ***really*** should - ARM and PowerPC, for example, are
>> becoming increasingly important!), then we need to reconsider shared
>> very carefully.
>
> Agreed. I don't think (1) is irrelevant but I don't see it an impossible
> obstacle.
>
>> The thing about (2) is that literally every operation on shared data has
>> to be special-cased in the compiler.
>
> Yes, like for volatile (no code motion etc) plus barriers.
>
>> This adds a crazy amount of
>> complexity, since there are basically two code paths for every single
>> part of the code generation phase: the unshared path and the shared
>> path. This is mostly caused by the fact that shared is transitive and
>> can be applied to virtually any type. But even if we ignore that
>> complexity, we have the problem of certain operations that cannot be
>> atomic (even though they may look like it):
>>
>> * Would you expect an array indexing operation (where the array slice is
>> shared) to index the array atomically? Would you expect the read of the
>> value at the calculated memory location to be atomic?
>> * Would you expect a slicing operation to be atomic? (Slicing something
>> involves reading two words of memory which cannot be done atomically
>> even on x86.)
>> * Would you expect 'in' to be atomic? (It can only really, kinda-sorta
>> be if you use locks inside the AA implementation...)
>
> Operations with shared data are intentionally very limited. Indexing,
> slicing, and "in" should not compile for shared arrays and associative
> arrays.

Are you suggesting that we define shared such that any 'basic' operation 
that can't be atomic on the target architecture is disallowed? Or that 
we figure out the lowest common denominator and go with that?

>
>
> Andrei
>
>

-- 
Alex Rønne Petersen
alex@lycus.org
http://lycus.org