May 18, 2014
On Sat, May 17, 2014 at 11:51:44AM -0700, Jonathan M Davis via Digitalmars-d wrote:
> On Thu, 15 May 2014 08:43:11 -0700
> Andrei Alexandrescu via Digitalmars-d <digitalmars-d@puremagic.com>
> wrote:
> 
> > On 5/15/14, 6:28 AM, Dicebot wrote:
> > > This is not true. Because of such code you can't ever automatically memoize strongly pure function results by compiler. A very practical concern.
> >
> > I think code that doesn't return pointers should be memoizable. Playing tricks with pointer comparisons would be appropriately penalized. -- Andrei
> 
> Agreed. The fact that a pure function can return newly allocated memory pretty much kills the idea of being able to memoize pure functions that return pointers or references, because the program's behavior would change if it were to memoize the result and reuse it. However, that should have no effect on pure functions that return value types - even if the function took pointers or references as arguments or allocated memory internally. They should but perfectly memoizable.
[...]

	bool func(int x) /* pure? */ {
		int[] a, b;
		a = new int[x];
		b = new int[x];
		return (a.ptr < b.ptr);
	}

Where do you draw the line?


T

-- 
Tech-savvy: euphemism for nerdy.
May 19, 2014
On Sun, 18 May 2014 06:58:25 -0700
"H. S. Teoh via Digitalmars-d" <digitalmars-d@puremagic.com> wrote:

> On Sat, May 17, 2014 at 11:51:44AM -0700, Jonathan M Davis via Digitalmars-d wrote:
> > On Thu, 15 May 2014 08:43:11 -0700
> > Andrei Alexandrescu via Digitalmars-d <digitalmars-d@puremagic.com>
> > wrote:
> >
> > > On 5/15/14, 6:28 AM, Dicebot wrote:
> > > > This is not true. Because of such code you can't ever automatically memoize strongly pure function results by compiler. A very practical concern.
> > >
> > > I think code that doesn't return pointers should be memoizable. Playing tricks with pointer comparisons would be appropriately penalized. -- Andrei
> >
> > Agreed. The fact that a pure function can return newly allocated memory pretty much kills the idea of being able to memoize pure functions that return pointers or references, because the program's behavior would change if it were to memoize the result and reuse it. However, that should have no effect on pure functions that return value types - even if the function took pointers or references as arguments or allocated memory internally. They should but perfectly memoizable.
> [...]
>
>   bool func(int x) /* pure? */ {
>       int[] a, b;
>       a = new int[x];
>       b = new int[x];
>       return (a.ptr < b.ptr);
>   }
>
> Where do you draw the line?

Hmmmm. I think that it was pointed out somewhere else in this thread that that sort of comparison is already undefined in C (and thus probably D) - certainly it's not something that's really valid to do normally. However, there are other things that we currently do "normally" which have similar problems (e.g. toHash on Object hashing the reference). Maybe if we could come up with a set of operations which weren't valid in a pure function because they'd behave differently depending on which memory block was given to new, then we could make it work. But we may simply need to say that memoization of pure functions just isn't going to work if we allow allocations to take place in pure functions. That wouldn't be ideal, but I'm also not convinced that it matters much.

It's already so rare that memoization of a function call can occur, that I'm pretty much convinced that memoization is useless as an optimization - at least as long as the compiler is doing it. After all, how often does a function get called with same the arguments within a single function let alone a single expression (and as I understand it, memoization only ever occurs at this point within either a single expression or statement - I don't remember which - but regardless, it's not even within a whole function)? And since there's no way that the compiler is going to memoize alls to a function across functions or even across calls to the same function which is calling the function which is being memoized, unless it's very common to call a function with the same result within a single function (and I really don't think that it is), then memoization by the compiler is really of minimal benefit, much as it would be nice to have it where we can.

Regardless, we should error on the side of not memoizing in order to avoid undefined behavior due to memory allocations causing the pure function to act slightly differently across function calls (even when it's given the same arguments).

- Jonathan M Davis

May 19, 2014
On Monday, 19 May 2014 at 01:19:29 UTC, Jonathan M Davis via Digitalmars-d wrote:
> It's already so rare that memoization of a function call can occur, that I'm
> pretty much convinced that memoization is useless as an optimization - at
> least as long as the compiler is doing it. After all, how often does a
> function get called with same the arguments within a single function let alone
> a single expression (and as I understand it, memoization only ever occurs at
> this point within either a single expression or statement - I don't remember
> which - but regardless, it's not even within a whole function)?

Memoization is valid throughout the program. Opportunities occur frequently with generic programming and inlining.

> Regardless, we should error on the side of not memoizing in order to avoid

Then you don't need strict pure functions.
May 19, 2014
On Mon, 19 May 2014 05:16:13 +0000
via Digitalmars-d <digitalmars-d@puremagic.com> wrote:

> On Monday, 19 May 2014 at 01:19:29 UTC, Jonathan M Davis via Digitalmars-d wrote:
> > It's already so rare that memoization of a function call can
> > occur, that I'm
> > pretty much convinced that memoization is useless as an
> > optimization - at
> > least as long as the compiler is doing it. After all, how often
> > does a
> > function get called with same the arguments within a single
> > function let alone
> > a single expression (and as I understand it, memoization only
> > ever occurs at
> > this point within either a single expression or statement - I
> > don't remember
> > which - but regardless, it's not even within a whole function)?
>
> Memoization is valid throughout the program. Opportunities occur frequently with generic programming and inlining.

I seriously question that assertion. How often do you really call the same function with the same arguments? And given that for the memoization to be done by the compiler without saving the result somewhere (which could actually _harm_ efficiency in many cases and certainly isn't something that the compiler does right now), the most that it could possibly memoize would be within a single function, that makes it all the less likely that it could memoize any calls. And given that the most it currently memoizes would be within a single expression (or possibly a single statement - I can't remember which), that makes it so that about the only time that it can memoize function calls is when you do something like foo(2) * foo(2), and how often does _that_ happen? Sure, it happens from time to time, but in my experience, it's very rare to make the same function call with the same arguments within a single expression or statement.

> > Regardless, we should error on the side of not memoizing in order to avoid
>
> Then you don't need strict pure functions.

As far as I'm concerned the two big gains from pure are

1. it makes it easier to reason about code, because it guarantees that the function didn't access any global or static variables.

2. it allows us to implicitly convert to different levels of mutability for the return type of pure functions where the compiler can guarantee that the return value was allocated within the function.

Any other benefits we get from pure are great, but they're incidental in comparison. And in particular, in my experience, memoization opportunities are so rare that there's really no point in worrying about them. They're just a nice bonus when they do happen.

- Jonathan M Davis
May 19, 2014
On Monday, 19 May 2014 at 05:39:49 UTC, Jonathan M Davis via Digitalmars-d wrote:
> 1. it makes it easier to reason about code, because it guarantees that the
> function didn't access any global or static variables.

It can, through the parameters, like an array of pointers. And avoiding IO is not sufficient to mark 90% of my code as weakly pure.

> 2. it allows us to implicitly convert to different levels of mutability for
> the return type of pure functions where the compiler can guarantee that the
> return value was allocated within the function.

But if you can have a struct/pointer as a parameter then you can clearly return objects not allocated in the function?
May 19, 2014
On Mon, 19 May 2014 06:05:26 +0000
via Digitalmars-d <digitalmars-d@puremagic.com> wrote:

> On Monday, 19 May 2014 at 05:39:49 UTC, Jonathan M Davis via Digitalmars-d wrote:
> > 1. it makes it easier to reason about code, because it
> > guarantees that the
> > function didn't access any global or static variables.
>
> It can, through the parameters, like an array of pointers. And avoiding IO is not sufficient to mark 90% of my code as weakly pure.

Except that if most of your code is marked pure, then there aren't very many points where it could access a global or static variable. And regardless of that, the fact that the only way to access a global or static variable in a pure function is through one of its arguments still guarantees that the function isn't messing with anything that wasn't passed to it, so that still helps a lot with being able to reason about code. Sure, it could still be passed an argument that points to a global variable (directly or indirectly), but then you only have to worry about globals being accessed if they were passed in (directly or indirectly). Sure, it's not as big a gain as when a function is strongly pure, but it's good enough for most cases.

> > 2. it allows us to implicitly convert to different levels of
> > mutability for
> > the return type of pure functions where the compiler can
> > guarantee that the
> > return value was allocated within the function.
>
> But if you can have a struct/pointer as a parameter then you can clearly return objects not allocated in the function?

The compiler is smart enough in many cases to determine whether the return value could have been passed in or not (though it wouldn't surprise me if it could be made smarter in that regard). With a function like

string foo(string bar) pure {...}

it can't assume that the return type is unique, because it could have been passed in via the parameter, but with

string foo(char[] bar) pure {..}

or

int* foo(string bar) pure {..}

it could, because it's impossible for the parameter to be returned from the function (unless casting that breaks the type system is used anyway - and the compiler is free to assume that that isn't done). So, it varies quite a bit as to whether a pure function is guaranteed to be returning newly allocated memory or not, but the compiler often can determine that, and when it can, it makes dealing with immutable far, far more pleasant. It's particularly useful when you need to allocate an immutable object but also need to mutate it as part of initializing it. If you do it in a pure function where the compiler knows that the object couldn't have been passed in, then the return type can be freely converted to various levels of mutability - including immutable - without having to use immutable within the function.

- Jonathan M Davis
May 19, 2014
On Monday, 19 May 2014 at 06:30:46 UTC, Jonathan M Davis via Digitalmars-d wrote:
> makes dealing with immutable far, far more pleasant. It's particularly useful
> when you need to allocate an immutable object but also need to mutate it as
> part of initializing it. If you do it in a pure function where the compiler
> knows that the object couldn't have been passed in, then the return type can
> be freely converted to various levels of mutability - including immutable -
> without having to use immutable within the function.

It does not appear as a clean design that functions should have different semantics than a block. What matters is that the object reference is unique. Confusing this with pure seems like a bad idea.
May 19, 2014
On Mon, 19 May 2014 07:37:55 +0000
via Digitalmars-d <digitalmars-d@puremagic.com> wrote:

> On Monday, 19 May 2014 at 06:30:46 UTC, Jonathan M Davis via Digitalmars-d wrote:
> > makes dealing with immutable far, far more pleasant. It's
> > particularly useful
> > when you need to allocate an immutable object but also need to
> > mutate it as
> > part of initializing it. If you do it in a pure function where
> > the compiler
> > knows that the object couldn't have been passed in, then the
> > return type can
> > be freely converted to various levels of mutability - including
> > immutable -
> > without having to use immutable within the function.
>
> It does not appear as a clean design that functions should have different semantics than a block. What matters is that the object reference is unique. Confusing this with pure seems like a bad idea.

I don't follow you. The fact that D's pure helps the compiler determine cases where it knows that the return value of a function is unique is a key feature of pure and has proven to be a great idea. Perhaps you're hung up on the fact that the term "pure" is being used, and you're thinking about functional purity? If so, forget about pure in the functional sense if you want to discuss D purity. You need to think of it as something more like @noglobal. That combined with other information in the function signature allows the compiler to determine cases where it knows that the returned value is unique. It also can lead to the compiler determining that a function in functionally pure and thus memoizable, but at this point, that's pretty incidental to what pure is and does. It's part of it, but it's not the primary feature of what D's pure is or is for.  It's unfortunate that the language's evolution lead us to using the term pure for what it's currently used for, but we're pretty much stuck with it at this point. Regardless, the fact that D's pure allows us to determine when the return value of a function has to be unique is fantastic and has proven very helpful.

- Jonathan M Davis
May 19, 2014
On Sun, 18 May 2014 09:58:25 -0400, H. S. Teoh via Digitalmars-d <digitalmars-d@puremagic.com> wrote:

> On Sat, May 17, 2014 at 11:51:44AM -0700, Jonathan M Davis via Digitalmars-d wrote:
>> On Thu, 15 May 2014 08:43:11 -0700
>> Andrei Alexandrescu via Digitalmars-d <digitalmars-d@puremagic.com>
>> wrote:
>>
>> > On 5/15/14, 6:28 AM, Dicebot wrote:
>> > > This is not true. Because of such code you can't ever
>> > > automatically memoize strongly pure function results by compiler.
>> > > A very practical concern.
>> >
>> > I think code that doesn't return pointers should be memoizable.
>> > Playing tricks with pointer comparisons would be appropriately
>> > penalized. -- Andrei
>>
>> Agreed. The fact that a pure function can return newly allocated
>> memory pretty much kills the idea of being able to memoize pure
>> functions that return pointers or references, because the program's
>> behavior would change if it were to memoize the result and reuse it.

Memoizing reference returns that are immutable should be fine.

>> However, that should have no effect on pure functions that return
>> value types - even if the function took pointers or references as
>> arguments or allocated memory internally. They should but perfectly
>> memoizable.
> [...]
>
> 	bool func(int x) /* pure? */ {
> 		int[] a, b;
> 		a = new int[x];
> 		b = new int[x];
> 		return (a.ptr < b.ptr);
> 	}
>
> Where do you draw the line?

Definitely this is fine being pure, and can be memoized. It just won't do what you thought it would. So? :)

This is what Andrei meant as being "penalized."

-Steve
May 19, 2014
On Mon, 19 May 2014 09:42:31 -0400
Steven Schveighoffer via Digitalmars-d <digitalmars-d@puremagic.com>
wrote:

> On Sun, 18 May 2014 09:58:25 -0400, H. S. Teoh via Digitalmars-d <digitalmars-d@puremagic.com> wrote:
>
> > On Sat, May 17, 2014 at 11:51:44AM -0700, Jonathan M Davis via Digitalmars-d wrote:
> >> On Thu, 15 May 2014 08:43:11 -0700
> >> Andrei Alexandrescu via Digitalmars-d <digitalmars-d@puremagic.com>
> >> wrote:
> >>
> >> > On 5/15/14, 6:28 AM, Dicebot wrote:
> >> > > This is not true. Because of such code you can't ever automatically memoize strongly pure function results by compiler. A very practical concern.
> >> >
> >> > I think code that doesn't return pointers should be memoizable. Playing tricks with pointer comparisons would be appropriately penalized. -- Andrei
> >>
> >> Agreed. The fact that a pure function can return newly allocated memory pretty much kills the idea of being able to memoize pure functions that return pointers or references, because the program's behavior would change if it were to memoize the result and reuse it.
>
> Memoizing reference returns that are immutable should be fine.

Only if you consider it okay for the behavior of the function to change upon memoization - or at least don't consider that the behaviors changed due to the fact that the objects are equal but not the same object (which can matter for reference types) is something that matters. It has less of an impact when you're dealing with immutable objects, because changing the value of one won't change the value of another, but it can still change the behavior of the program due to the fact that they're not actually the same object. So, I'd be inclined to argue that no functions which return memory should be memoizable. And given that the compiler can only memoize functions within a single expression (or maybe statement - I can't remember which) - I don't think that that restriction even costs us much.

- Jonathan M Davis