April 27, 2012
On Fri, 27 Apr 2012 13:25:30 -0400, Joseph Rushton Wakeling <joseph.wakeling@webdrake.net> wrote:

> On 27/04/12 19:08, Steven Schveighoffer wrote:
>> weak purity, I think, is one of the most revolutionary ideas to come from D.
>> Essentially, we get compiler-checked pure functions that can be written
>> imperatively instead of functionally.
>
> I do agree with that; it's a very attractive aspect of D that a function or object can be internally imperative but pure as far as the outside world is concerned.  Seeing that documented in Andrei's writings was another "Wow!" moment for me about D. :-)
>
> What concerned me here was that the way my reputation() function was set up actually did change the public state of the object, which to my mind isn't even weakly pure.  But I just wrote a fix which, contrary to my fears, didn't affect the speed of the program.

Weakly pure simply means it can be called by a strong-pure function.  It has no other benefits except for that, and is basically a normal non-pure function otherwise.  It's a weird thing, and it's hard to wrap your mind around.  I don't think anyone has ever done it before, so it's hard to explain.

But the benefits are *enormous*.  Now, instead of writing separate pure versions of so many normal functions, much of the standard library can be marked as pure, and is callable from either pure or impure functions.  It opens up the whole library to pure functions, which would otherwise be a painstaking porting effort.

>> If the function is weakly pure, then it cannot be optimized based on purity, so
>> I don't think there is any way marking it pure *hurts*.
>
> I was more concerned that the compiler wasn't identifying what to me was a violation of purity.  I'm fairly sure I can also find a way to make some of those "nothrow" functions throw an error ...

It depends on what your definition of "purity" is.  For D's definition, it's pure, otherwise the compiler would reject it.

D's definition of pure is that the function cannot accept any shared data as parameters, and it cannot access any global non-immutable data.  That's it.  There's no requirement for immutability of parameters or results.
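
A minimal sketch of that rule. All names here (counter, limit, bump, clampToLimit) are made up for illustration; the static assert records which case the compiler is expected to reject:

```d
int counter;               // mutable module-level state (hypothetical)
immutable int limit = 10;  // immutable global: readable from pure code

// Weakly pure: it mutates only what it reaches through its parameter.
pure void bump(ref int x)
{
    x += 1;
}

// Reading immutable global data is allowed inside a pure function.
pure int clampToLimit(int x)
{
    return x > limit ? limit : x;
}

void main()
{
    // Touching mutable global state from pure code is rejected at compile time.
    static assert(!__traits(compiles, () pure { counter++; }));

    int local = 0;
    bump(local);
    assert(local == 1);
    assert(clampToLimit(42) == 10);
}
```

Note that bump is accepted as pure even though it changes its argument, which is exactly the point where D's definition departs from the traditional one.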

In hindsight, D's 'pure' keyword really should be named something else, since it varies from the traditional definition of 'pure'.  But it is what it is now, and no changing it.

nothrow means it cannot throw an Exception.  It can still throw an Error, or technically even something else derived from Throwable.  This is necessary, because otherwise, nothing could be nothrow (anything might throw an out of memory error).
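
A small sketch of the distinction, using a hypothetical function pick. The two static asserts check what a nothrow body may and may not throw:

```d
// nothrow forbids letting an Exception escape; an Error may still propagate.
nothrow int pick(const int[] data, size_t i)
{
    // When bounds checks are enabled, a bad index throws a RangeError here,
    // which is an Error and therefore permitted in a nothrow function.
    return data[i];
}

void main()
{
    // Throwing an Exception out of a nothrow function literal does not compile...
    static assert(!__traits(compiles, () nothrow { throw new Exception("no"); }));
    // ...while throwing an Error is accepted.
    static assert(__traits(compiles, () nothrow { throw new Error("ok"); }));

    assert(pick([1, 2, 3], 1) == 2);
}
```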

-Steve
April 27, 2012
On Fri, Apr 27, 2012 at 07:25:30PM +0200, Joseph Rushton Wakeling wrote:
> On 27/04/12 19:08, Steven Schveighoffer wrote:
> >weak purity, I think, is one of the most revolutionary ideas to come from D.  Essentially, we get compiler-checked pure functions that can be written imperatively instead of functionally.
> 
> I do agree with that; it's a very attractive aspect of D that a function or object can be internally imperative but pure as far as the outside world is concerned.  Seeing that documented in Andrei's writings was another "Wow!" moment for me about D. :-)
> 
> What concerned me here was that the way my reputation() function was set up actually did change the public state of the object, which to my mind isn't even weakly pure.  But I just wrote a fix which, contrary to my fears, didn't affect the speed of the program.

As mentioned by others, D internally recognizes two kinds of purity, which are unofficially called "strong purity" and "weak purity". The idea is that strong purity corresponds with what functional programming languages call "pure", whereas weak purity allows mutation of state outside the function, *but only through function parameters*.

The idea is that strongly pure functions are allowed to call weakly pure functions, provided any reference parameters passed in never escape the scope of the strongly pure function. For example, a strongly pure function is allowed to pass in pointers to local variables which the weakly pure function can modify through the pointer:

	import std.math : cos, sin;

	pure real computationHelper(real arg1, ref real arg2) {
		// This mutates arg2, so this function is only weakly
		// pure
		arg2 = sin(arg1);

		return cos(arg1);
	}

	// The parameter is immutable data, so nothing outside can be
	// mutated through it and the function can be strongly pure
	pure real complexComputation(immutable(real)[] args) {
		real tempValue;
		...

		// Note: this changes tempValue
		real anotherTempValue = computationHelper(args[0],
					tempValue);

		// But since tempValue is only used inside this
		// function, it does not violate strong purity
		// (complicatedFunc is assumed to be pure as well)
		return tempValue + complicatedFunc(args[1]);
	}

As far as the outside world is concerned, complexComputation is a strongly pure function, because even though computationHelper mutates stuff through its arguments, those mutations are all local to complexComputation, and do not actually touch global state outside it. Weakly pure functions cannot mutate anything that they don't receive through their arguments (the 'this' pointer is considered part of their arguments, it's just implicit), so as long as the strongly pure function never passes in a pointer to the outside world, everything is OK.

The motivation for distinguishing between these types of purity is to expand the scope of what the strongly-pure function can call. By allowing weakly pure functions to be called from strongly-pure functions, we open up many more implementation possibilities while still retaining all the benefits of strong purity.


> >If the function is weakly pure, then it cannot be optimized based on purity, so I don't think there is any way marking it pure *hurts*.
> 
> I was more concerned that the compiler wasn't identifying what to me was a violation of purity.  I'm fairly sure I can also find a way to make some of those "nothrow" functions throw an error ...

It's not a violation of purity, it's just "weak purity". If you try to access a global variable, for example, it will trigger an error.

And nothrow functions *are* allowed to throw Error objects. That's also a deliberate decision. :-)


T

-- 
"640K ought to be enough" -- Bill G., 1984.
"The Internet is not a primary goal for PC usage" -- Bill G., 1995.
"Linux has no impact on Microsoft's strategy" -- Bill G., 1999.
April 27, 2012
On Fri, 27 Apr 2012 13:23:46 -0400, Jonathan M Davis <jmdavisProg@gmx.com> wrote:

> On Friday, April 27, 2012 11:18:26 Steven Schveighoffer wrote:
>> const should not affect code generation *at all*, except for name mangling
>> (const MyStruct is a different type from MyStruct), and generating an
>> extra TypeInfo for const MyStruct and const MyStruct[]. Const is purely a
>> compile-time concept.
>
> Thanks to the fact that const is transitive and that it's illegal to cast it
> away to use mutation, const _can_ affect code optimizations, but I don't know
> exactly under what circumstances it does in the current compiler.

No, it can't.  There can easily be another non-const reference to the same data.  Pure functions can make more assumptions, based on the types, but it would be a very complex determination in the type system to see if two parameters alias the same data.

Real optimization benefits come into play when immutable is there.
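
A sketch of the aliasing problem, with a hypothetical function sumTwice: the const view cannot be cached across the write, because the same memory is reachable through a mutable reference. With immutable data no such alias can exist.

```d
// `view` is const, but `data` may alias the same memory mutably.
int sumTwice(const(int)[] view, int[] data)
{
    int first = view[0];
    data[0] = 99;            // legal: mutation through the non-const reference
    return first + view[0];  // view[0] has to be re-read; it may have changed
}

void main()
{
    int[] a = [1, 2, 3];
    // Both parameters refer to the same array, so the second read sees 99.
    assert(sumTwice(a, a) == 100);

    immutable(int)[] fixed = [1, 2, 3];
    // Immutable data cannot be passed as the mutable alias at all.
    static assert(!__traits(compiles, sumTwice(fixed, fixed)));
}
```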

-Steve
April 27, 2012
On 27/04/12 20:25, H. S. Teoh wrote:
> On Fri, Apr 27, 2012 at 07:25:30PM +0200, Joseph Rushton Wakeling wrote:
>> I was more concerned that the compiler wasn't identifying what to me
>> was a violation of purity.  I'm fairly sure I can also find a way to
>> make some of those "nothrow" functions throw an error ...
>
> It's not a violation of purity, it's just "weak purity". If you try to
> access a global variable, for example, it will trigger an error.

Thanks for the extended description of weak purity -- it's been very helpful in understanding the concept better.

Is there a particular way in which I can explicitly mark a function as strongly pure?

> And nothrow functions *are* allowed to throw Error objects. That's also
> a deliberate decision. :-)

... yes, as I just found out when I decided to test it 2 minutes ago :-)  OTOH I found that with or without the nothrow option, when the -release flag was used in compiling the code, the error was not thrown and the program did not exit -- it just sat there seemingly running but doing nothing.  This was unexpected ...

The deliberate error was in this case a range exception when accessing an array.
April 27, 2012
On Fri, 27 Apr 2012 14:29:40 -0400, Joseph Rushton Wakeling <joseph.wakeling@webdrake.net> wrote:

> On 27/04/12 20:25, H. S. Teoh wrote:
>> On Fri, Apr 27, 2012 at 07:25:30PM +0200, Joseph Rushton Wakeling wrote:
>>> I was more concerned that the compiler wasn't identifying what to me
>>> was a violation of purity.  I'm fairly sure I can also find a way to
>>> make some of those "nothrow" functions throw an error ...
>>
>> It's not a violation of purity, it's just "weak purity". If you try to
>> access a global variable, for example, it will trigger an error.
>
> Thanks for the extended description of weak purity -- it's been very helpful in understanding the concept better.
>
> Is there a particular way in which I can explicitly mark a function as strongly pure?

No, just make sure all the parameters and the result are either immutable or implicitly castable to immutable (hard to explain this better).

Hm... gives me a thought that unit tests should have a helper that allows ensuring this:

static assert(isStrongPure!fn);

Or maybe __traits(isStrongPure, fn) if it's too difficult to do in a library.
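
A rough library-side sketch of such a helper, assuming the rule stated above (pure, plus parameters and result implicitly convertible to immutable). isStrongPure, toImmutable and hasRefOrOut are hypothetical names, not part of Phobos, and the check is deliberately conservative: it treats every ref or out parameter as weakly pure, and it does not try to match whatever the compiler decides internally.

```d
import std.meta : allSatisfy;
import std.traits : functionAttributes, FunctionAttribute, Parameters,
    ParameterStorageClass, ParameterStorageClassTuple, ReturnType;

// True when a T converts implicitly to immutable(T).
enum bool toImmutable(T) = is(T : immutable(T));

// True when any parameter of fn is declared ref or out (evaluated via CTFE).
bool hasRefOrOut(alias fn)()
{
    enum mask = ParameterStorageClass.ref_ | ParameterStorageClass.out_;
    bool found = false;
    foreach (stc; ParameterStorageClassTuple!fn)
        found = found || (stc & mask) != 0;
    return found;
}

// Hypothetical helper: an approximation of "strongly pure".
template isStrongPure(alias fn)
{
    enum bool isStrongPure =
        (functionAttributes!fn & FunctionAttribute.pure_) != 0
        && !hasRefOrOut!fn()
        && allSatisfy!(toImmutable, Parameters!fn)
        && (is(ReturnType!fn == void) || toImmutable!(ReturnType!fn));
}

pure int square(int x) { return x * x; }
pure void fill(int[] buf, int v) { buf[] = v; }
pure void setTo(ref int x, int v) { x = v; }
int notPure(int x) { return x; }

static assert(isStrongPure!square);
static assert(!isStrongPure!fill);    // mutable slice parameter
static assert(!isStrongPure!setTo);   // ref parameter
static assert(!isStrongPure!notPure); // not marked pure

void main() {}
```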

> ... yes, as I just found out when I decided to test it 2 minutes ago :-)  OTOH I found that with or without the nothrow option, when the -release flag was used in compiling the code, the error was not thrown and the program did not exit -- it just sat there seemingly running but doing nothing.  This was unexpected ...
>
> The deliberate error was in this case a range exception when accessing an array.

array bounds checks and asserts are turned off during -release compilation :)

-Steve
April 27, 2012
On Friday, April 27, 2012 14:26:52 Steven Schveighoffer wrote:
> On Fri, 27 Apr 2012 13:23:46 -0400, Jonathan M Davis <jmdavisProg@gmx.com> wrote:
> > On Friday, April 27, 2012 11:18:26 Steven Schveighoffer wrote:
> >> const should not affect code generation *at all*, except for name mangling
> >> (const MyStruct is a different type from MyStruct), and generating an
> >> extra TypeInfo for const MyStruct and const MyStruct[]. Const is purely a
> >> compile-time concept.
> >
> > Thanks to the fact that const is transitive and that it's illegal to cast it
> > away to use mutation, const _can_ affect code optimizations, but I don't know
> > exactly under what circumstances it does in the current compiler.
> 
> No, it can't. There can easily be another non-const reference to the same data.

Except that if the variable isn't shared, then in many cases, it doesn't matter that there can be another reference - particularly when combined with pure. So, the compiler definitely should be able to use const for optimizations in at least some cases. It wouldn't surprise me if it doesn't use it for that all that much right now though.

> Pure functions can make more assumptions, based on the types, but
> it would be a very complex determination in the type system to see if two
> parameters alias the same data.

Yes. But that doesn't matter in many cases thanks to D's thread-local by default. With shared, on the other hand, const becomes useless for optimizations for the very reason that you give. And yes, there _are_ cases where const can't be used for optimizations because other references are used which might refer to the same data, but particularly if no other references of the same type are used in a section of code, then the compiler should be able to determine that the object really hasn't changed thanks to const.

> Real optimization benefits come into play when immutable is there.

Definitely. immutable allows for much better optimizations than const does, but there _are_ cases where const allows for optimizations.

- Jonathan M Davis
April 27, 2012
On 27/04/12 20:39, Steven Schveighoffer wrote:
> No, just make sure all the parameters and the result are either immutable or
> implicitly castable to immutable (hard to explain this better).
>
> Hm... gives me a thought that unit tests should have a helper that allows
> ensuring this:
>
> static assert(isStrongPure!fn);
>
> Or maybe __traits(isStrongPure, fn) if it's too difficult to do in a library.

Not worth adding a strongpure (purest?) attribute to enable coders to explicitly mark their expectations?

> array bounds checks and asserts are turned off during -release compilation :)

Maybe I've misunderstood what that means, but I would still have expected the program to stop running at least.  What was it doing instead ... ? :-)
April 27, 2012
On Fri, 27 Apr 2012 15:28:48 -0400, Joseph Rushton Wakeling <joseph.wakeling@webdrake.net> wrote:

> On 27/04/12 20:39, Steven Schveighoffer wrote:
>> No, just make sure all the parameters and the result are either immutable or
>> implicitly castable to immutable (hard to explain this better).
>>
>> Hm... gives me a thought that unit tests should have a helper that allows
>> ensuring this:
>>
>> static assert(isStrongPure!fn);
>>
>> Or maybe __traits(isStrongPure, fn) if it's too difficult to do in a library.
>
> Not worth adding a strongpure (purest?) attribute to enable coders to explicitly mark their expectations?

In most cases it's not that important.  Strong purity is simply for optimization.

Plus, what that would do is guarantee that the function only compiles if the parameters and return value are implicitly convertible to immutable.  There are no other special requirements, and since the parameters and return type are sitting right there, next to the strongpure attribute, it's likely not necessary to re-state the intent.

Note that, even if a strong pure function becomes a weak-pure one, it's still valid (and compiles, and runs).

It can be viewed similarly to inline, where the compiler makes the decision.  Sometimes it's nice to force the issue, but for the most part, the compiler will know better what should be inlined.

>> array bounds checks and asserts are turned off during -release compilation :)
>
> Maybe I've misunderstood what that means, but I would still have expected the program to stop running at least.  What was it doing instead ... ? :-)

It didn't crash because the memory it accessed was still part of the program's address space.  Out-of-bounds references don't necessarily point at invalid memory.

-Steve
April 27, 2012
On Fri, Apr 27, 2012 at 08:29:40PM +0200, Joseph Rushton Wakeling wrote:
> On 27/04/12 20:25, H. S. Teoh wrote:
> >On Fri, Apr 27, 2012 at 07:25:30PM +0200, Joseph Rushton Wakeling wrote:
> >>I was more concerned that the compiler wasn't identifying what to me was a violation of purity.  I'm fairly sure I can also find a way to make some of those "nothrow" functions throw an error ...
> >
> >It's not a violation of purity, it's just "weak purity". If you try to access a global variable, for example, it will trigger an error.
> 
> Thanks for the extended description of weak purity -- it's been very helpful in understanding the concept better.
> 
> Is there a particular way in which I can explicitly mark a function as strongly pure?

The "strong/weak" distinction is deduced by the compiler internally.


> >And nothrow functions *are* allowed to throw Error objects. That's also a deliberate decision. :-)
> 
> ... yes, as I just found out when I decided to test it 2 minutes ago :-)  OTOH I found that with or without the nothrow option, when the -release flag was used in compiling the code, the error was not thrown and the program did not exit -- it just sat there seemingly running but doing nothing.  This was unexpected ...
> 
> The deliberate error was in this case a range exception when accessing an array.

In -release mode, array bounds checking is turned off for speed reasons. The idea being that before you compile with -release you've already extensively tested your program and are sure that simple bugs like array overruns have been weeded out.


T

-- 
IBM = I Blame Microsoft
April 27, 2012
On Friday, April 27, 2012 14:00:13 H. S. Teoh wrote:
> In -release mode, array bounds checking is turned off for speed reasons. The idea being that before you compile with -release you've already extensively tested your program and are sure that simple bugs like array overruns have been weeded out.

The checks are actually left in for @safe code even with -release. You need -noboundscheck to disable them in all code. But yes, many array bounds checks do get stripped out with -release, and you can never rely on them being there, because they can go away, depending on the flags.
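
A sketch of what that looks like in code, assuming bounds checking has not been switched off for all code. The names at and outOfRangeCaught are hypothetical, and the RangeError is caught only to make the check visible; real code should not rely on catching it.

```d
import core.exception : RangeError;

// @safe code keeps its bounds checks under -release.
@safe int at(const int[] data, size_t i)
{
    return data[i];
}

// Reports whether an out-of-range index was detected by the bounds check.
bool outOfRangeCaught()
{
    try
    {
        int v = at([1, 2, 3], 5);  // index 5 is past the end
        return false;
    }
    catch (RangeError e)           // an Error, not an Exception
    {
        return true;
    }
}

void main()
{
    assert(at([1, 2, 3], 1) == 2);
    assert(outOfRangeCaught());
}
```

The same indexing in @system code compiled with -release has no check at all, which is how an out-of-bounds read can silently return whatever happens to sit in memory.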

- Jonathan M Davis