Jump to page: 1 2
Thread overview
Forcing weak-pure
Mar 25, 2014
monarch_dodra
Mar 25, 2014
Artur Skawina
Mar 25, 2014
Artur Skawina
Mar 26, 2014
Kagamin
Mar 26, 2014
Artur Skawina
Apr 05, 2014
Artur Skawina
March 25, 2014
There has been an increased effort as of late to make as much as possible inside the runtime @safe, pure, and nothrow. The goal is to make it so it's not as nasty writing such code, because the library will allow it. Everything that we mark properly down below will allow higher-level constructs to be marked as well.

Of course naturally, with all things this low-level, we have to make decisions as to what is *provably* @safe/pure/nothrow, and what is *logically* @safe/pure/nothrow. The most obvious example is memory allocation -- It is technically not pure as it accesses global state (the heap), but we can rationalize making it pure since we have more logical knowledge of the situation than the compiler (i.e. locking the global heap, and realizing that if a block is unallocated, it is uniquely accessed with the allocation of it).

One issue, however, is modification of heap metadata for immutable data. The D compiler has the (correct) rationalization that for a pure function that has only immutable or implicitly convertible to immutable parameters, it is strong-pure. This means, it can optimize away calls to such functions. But in the case of modifying heap metadata, the call is actually weak-pure, since it is making modifications that you *don't* want to optimize away.

For this reason, functions like GC.setAttr and assumeSafeAppend cannot be marked pure. For example:

auto str = "hello".idup;
str = str[0..1];
str.assumeSafeAppend();
str ~= "iya";

The compiler could rationally elide the call to assumeSafeAppend if it is pure. We are not using the return value, and the only parameter is immutable. Since pure functions technically have no side effects, this call can be eliminated. A recent compiler change made such calls warnings (not using result of strong-pure function). But assumeSafeAppend really should be weak-pure, because it does have an effect. In essence, you are technically passing to assumeSafeAppend a pointer to the block that contains the slice, not the slice itself. And that block is mutable.

GC.setAttr has similar issues.

How can we force these to be weak-pure? One option suggested is to add a hidden void * parameter that defaults to null, to force the issue. But I think future compilers may be smart enough to realize that can also be a strong-pure call.

Another possibility is to pass the array/pointer by reference. This causes its own problems with rvalues, and does not help when the array/pointer itself is immutable.

Do we need a way (maybe a pragma) to make a function inside druntime "forced weak" pure? Is there an option any of you wonderful geniuses out there can figure out without changes to the compiler? :)

Some reference material:

https://github.com/D-Programming-Language/druntime/pull/553
https://github.com/D-Programming-Language/druntime/pull/747
https://github.com/D-Programming-Language/phobos/pull/2025
https://github.com/D-Programming-Language/phobos/pull/2046

-Steve
March 25, 2014
On Tuesday, 25 March 2014 at 13:30:04 UTC, Steven Schveighoffer wrote:
> How can we force these to be weak-pure? One option suggested is to add a hidden void * parameter that defaults to null, to force the issue. But I think future compilers may be smart enough to realize that can also be a strong-pure call.

Maybe instead of using null, we could use global junk?

//Global junk
struct WeakPure{int a;}
__gshared WeakPure dummyWeakPure;


//Signature
void weakPureFun(WeakPure* p = &dummyWeakPure)
{}


//Useage
weakPureFun();



AFAIK, the compiler should NOT be able to infer strong purity here. I don't know about performance effects though.
March 25, 2014
On Tue, 25 Mar 2014 10:08:20 -0400, monarch_dodra <monarchdodra@gmail.com> wrote:

> On Tuesday, 25 March 2014 at 13:30:04 UTC, Steven Schveighoffer wrote:
>> How can we force these to be weak-pure? One option suggested is to add a hidden void * parameter that defaults to null, to force the issue. But I think future compilers may be smart enough to realize that can also be a strong-pure call.
>
> Maybe instead of using null, we could use global junk?
>
> //Global junk
> struct WeakPure{int a;}
> __gshared WeakPure dummyWeakPure;
>
>
> //Signature
> void weakPureFun(WeakPure* p = &dummyWeakPure)
> {}
>
>
> //Useage
> weakPureFun();
>
>
>
> AFAIK, the compiler should NOT be able to infer strong purity here. I don't know about performance effects though.

I think again, a sufficiently intelligent compiler could see that dummyWeakPure would not be used in the template, we likely would have to pass it to the underlying opaque function. I'd really like to avoid this option, if it's not going to be optimized down to the minimal case.

-Steve
March 25, 2014
On 03/25/14 14:30, Steven Schveighoffer wrote:
> [...] functions like GC.setAttr and assumeSafeAppend cannot be marked pure. For example:
> 
> auto str = "hello".idup;
> str = str[0..1];
> str.assumeSafeAppend();
> str ~= "iya";
> 
> The compiler could rationally elide the call to assumeSafeAppend if it is pure. We are not using the return value, and the only parameter is immutable. Since pure functions technically have no side effects, this call can be eliminated. A recent compiler change made such calls warnings (not using result of strong-pure function). But assumeSafeAppend really should be weak-pure, because it does have an effect. In essence, you are technically passing to assumeSafeAppend a pointer to the block that contains the slice, not the slice itself. And that block is mutable.
> 
> GC.setAttr has similar issues.
> 
> How can we force these to be weak-pure?

Functions returning 'void' and w/o mutable args cannot be logically pure, as long as they aren't no-ops, obviously. While this property could be used to render them "weak-pure" in d-speak, this (or any other approach to marking them as such) would not be enough...

   // assuming 'assumeSafeAppend()' is "weak-pure":

   string f(string s) pure { s.assumeSafeAppend(); s ~= "d"; return s; }

   string a = "abc".idup, b = f(a[0..2]);

artur
March 25, 2014
On 25/03/14 15:08, monarch_dodra wrote:
> Maybe instead of using null, we could use global junk?

That really seems like a nasty workaround which could break at any moment depending on how the compiler is tweaked. :-(

Better surely to have an explicit indication that a function may not be assumed to be strongly pure?
March 25, 2014
On Tue, 25 Mar 2014 14:49:27 -0400, Artur Skawina <art.08.09@gmail.com> wrote:

> On 03/25/14 14:30, Steven Schveighoffer wrote:
>> [...] functions like GC.setAttr and assumeSafeAppend cannot be marked pure. For example:
>>
>> auto str = "hello".idup;
>> str = str[0..1];
>> str.assumeSafeAppend();
>> str ~= "iya";
>>
>> The compiler could rationally elide the call to assumeSafeAppend if it is pure. We are not using the return value, and the only parameter is immutable. Since pure functions technically have no side effects, this call can be eliminated. A recent compiler change made such calls warnings (not using result of strong-pure function). But assumeSafeAppend really should be weak-pure, because it does have an effect. In essence, you are technically passing to assumeSafeAppend a pointer to the block that contains the slice, not the slice itself. And that block is mutable.
>>
>> GC.setAttr has similar issues.
>>
>> How can we force these to be weak-pure?
>
> Functions returning 'void' and w/o mutable args cannot be logically pure,
> as long as they aren't no-ops, obviously. While this property could be
> used to render them "weak-pure" in d-speak, this (or any other approach
> to marking them as such) would not be enough...
>
>    // assuming 'assumeSafeAppend()' is "weak-pure":
>
>    string f(string s) pure { s.assumeSafeAppend(); s ~= "d"; return s; }
>   string a = "abc".idup, b = f(a[0..2]);

This could not be elided, because you are using the return value. It would have to be called at least once.

However, I am OK with it being elided for a second call with the same input. In other words, this strong pure function does not have the same issues.

But I get what you are saying -- just wrap it again, and we're back to the same issue. It is a systematic problem, because the concept is that you are not passing mutable references, even though you are then using those references to mark global data.

There is no really good answer I suppose. The huge downside of not marking assumeSafeAppend as weak-pure is that one cannot use assumeSafeAppend or set attributes inside pure functions that deal with arrays with immutable data. For example, the Appender type cannot be marked pure for immutable data.

-Steve
March 25, 2014
On 03/25/14 21:51, Steven Schveighoffer wrote:
> On Tue, 25 Mar 2014 14:49:27 -0400, Artur Skawina <art.08.09@gmail.com> wrote:
> 
>> On 03/25/14 14:30, Steven Schveighoffer wrote:
>>> [...] functions like GC.setAttr and assumeSafeAppend cannot be marked pure. For example:
>>>
>>> auto str = "hello".idup;
>>> str = str[0..1];
>>> str.assumeSafeAppend();
>>> str ~= "iya";
>>>
>>> The compiler could rationally elide the call to assumeSafeAppend if it is pure. We are not using the return value, and the only parameter is immutable. Since pure functions technically have no side effects, this call can be eliminated. A recent compiler change made such calls warnings (not using result of strong-pure function). But assumeSafeAppend really should be weak-pure, because it does have an effect. In essence, you are technically passing to assumeSafeAppend a pointer to the block that contains the slice, not the slice itself. And that block is mutable.
>>>
>>> GC.setAttr has similar issues.
>>>
>>> How can we force these to be weak-pure?
>>
>> Functions returning 'void' and w/o mutable args cannot be logically pure, as long as they aren't no-ops, obviously. While this property could be used to render them "weak-pure" in d-speak, this (or any other approach to marking them as such) would not be enough...
>>
>>    // assuming 'assumeSafeAppend()' is "weak-pure":
>>
>>    string f(string s) pure { s.assumeSafeAppend(); s ~= "d"; return s; }
>>    string a = "abc".idup, b = f(a[0..2]);
> 
> This could not be elided, because you are using the return value. It would have to be called at least once.
> 
> However, I am OK with it being elided for a second call with the same input. In other words, this strong pure function does not have the same issues.

It's ok to treat allocator and factory functions as pure, because those really
are logically pure, ie can't affect any /visible/ state and return results that
are unique.
'assumeSafeAppend()' does have side effects - it can lead to corruption of other,
not directly related, data. The only thing that prevents this is a declaration
by the programmer -- "Trust me, I own this data and there are no other live aliases".
The problem is that this declaration would now happen /implicitly when calling
a (strongly) pure function/.
Note that the 'a' string in the above example could no longer be "abc" after 'f()'
runs.
The "pure" concept (re-)definition in D is bad enough even without having to worry
about supposedly pure functions stomping on unrelated data.


> But I get what you are saying -- just wrap it again, and we're back to the same issue. It is a systematic problem, because the concept is that you are not passing mutable references, even though you are then using those references to mark global data.
> 
> There is no really good answer I suppose. The huge downside of not marking assumeSafeAppend as weak-pure is that one cannot use assumeSafeAppend or set attributes inside pure functions that deal with arrays with immutable data. For example, the Appender type cannot be marked pure for immutable data.

If there was a version of assumeSafeAppend() marked as pure, then that one could
be used for such cases, where no aliases are allowed to escape before an ownership
transfer happens. It's much easier to audit a contained implementation (eg Appender)
than to check the whole program and every caller of every function which relies on
a potentially unsafe assumption. If 'assumeSafeAppend()' isn't pure /by default/
then D's purity guarantees are at least a little bit stronger.

Or maybe such a change (marking unsafe code as pure and hoping that the programmer knows exactly what (s)he's doing) wouldn't really make the situation significantly worse than it already is - I'm not sure...

artur
March 26, 2014
On Tue, 25 Mar 2014 19:30:04 -0400, Artur Skawina <art.08.09@gmail.com> wrote:

> On 03/25/14 21:51, Steven Schveighoffer wrote:
>> On Tue, 25 Mar 2014 14:49:27 -0400, Artur Skawina <art.08.09@gmail.com> wrote:
>>
>>> On 03/25/14 14:30, Steven Schveighoffer wrote:
>>>> [...] functions like GC.setAttr and assumeSafeAppend cannot be marked pure. For example:
>>>>
>>>> auto str = "hello".idup;
>>>> str = str[0..1];
>>>> str.assumeSafeAppend();
>>>> str ~= "iya";
>>>>
>>>> The compiler could rationally elide the call to assumeSafeAppend if it is pure. We are not using the return value, and the only parameter is immutable. Since pure functions technically have no side effects, this call can be eliminated. A recent compiler change made such calls warnings (not using result of strong-pure function). But assumeSafeAppend really should be weak-pure, because it does have an effect. In essence, you are technically passing to assumeSafeAppend a pointer to the block that contains the slice, not the slice itself. And that block is mutable.
>>>>
>>>> GC.setAttr has similar issues.
>>>>
>>>> How can we force these to be weak-pure?
>>>
>>> Functions returning 'void' and w/o mutable args cannot be logically pure,
>>> as long as they aren't no-ops, obviously. While this property could be
>>> used to render them "weak-pure" in d-speak, this (or any other approach
>>> to marking them as such) would not be enough...
>>>
>>>    // assuming 'assumeSafeAppend()' is "weak-pure":
>>>
>>>    string f(string s) pure { s.assumeSafeAppend(); s ~= "d"; return s; }
>>>    string a = "abc".idup, b = f(a[0..2]);
>>
>> This could not be elided, because you are using the return value. It would have to be called at least once.
>>
>> However, I am OK with it being elided for a second call with the same input. In other words, this strong pure function does not have the same issues.
>
> It's ok to treat allocator and factory functions as pure, because those really
> are logically pure, ie can't affect any /visible/ state and return results that
> are unique.
> 'assumeSafeAppend()' does have side effects - it can lead to corruption of other,
> not directly related, data. The only thing that prevents this is a declaration
> by the programmer -- "Trust me, I own this data and there are no other live aliases".
> The problem is that this declaration would now happen /implicitly when calling
> a (strongly) pure function/.
> Note that the 'a' string in the above example could no longer be "abc" after 'f()'
> runs.
> The "pure" concept (re-)definition in D is bad enough even without having to worry
> about supposedly pure functions stomping on unrelated data.

You what I wasn't thinking of? parallelization of pure function calls! Right there, that kills this whole idea.

In fact, I think at this point, any operation to immutable arrays that affects the metadata for that block, including setting length and appending, must be by definition impure.

This is not a good turn of events...

-Steve
March 26, 2014
On Tuesday, 25 March 2014 at 23:30:28 UTC, Artur Skawina wrote:
> It's ok to treat allocator and factory functions as pure, because those really
> are logically pure, ie can't affect any /visible/ state and return results that
> are unique.

Allocators don't return unique results, GC being the primary example.
March 26, 2014
On 03/26/14 15:36, Kagamin wrote:
> On Tuesday, 25 March 2014 at 23:30:28 UTC, Artur Skawina wrote:
>> It's ok to treat allocator and factory functions as pure, because those really are logically pure, ie can't affect any /visible/ state and return results that are unique.
> 
> Allocators don't return unique results, GC being the primary example.

I'm not really sure what your point is. I was saying that "allocator functions",
ie functions that return newly allocated objects, can be treated as pure as long
they do not have any other visible side effects and the result is "unique", ie does
not have any aliases. This will be true in many cases. The /implementation/ won't
likely be pure, but from the caller's POV this is irrelevant. A way to make those
functions appear pure would allow for more pure /callers/; that's why Steven was
asking for a way to "force" purity.
Even C compilers are treating 'malloc' etc specially, exactly because they can
assume that nothing aliases the returned object.

artur
« First   ‹ Prev
1 2