May 01, 2008 Re: Idea: partially pure functions | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Don | Don wrote: > Bruno Medeiros wrote: >> Don wrote: >>> Bruno Medeiros wrote: >>>> I want to conceptualize an idea that has been briefly mentioned in the previous pure discussions (by Don I think?), but has not been explicitly brought into attention. >>>> >>>> As we know, the current definition of pure functions (according to http://www.digitalmars.com/d/2.0/accu-functional.pdf) is that they can only access invariant, or local mutable data. This means that all of the function's parameters have to be invariant. >>>> >>>> The idea is: let's also allow pure functions to have (and access) non-invariant parameters. These won't be normal pure functions, but instead "partially" pure functions. Their semantics are: the function promises not to change any external/global data, except for the parameters that are not invariant. Example: >>> >>> The first case to consider is nested functions. >>> >> >> Nested functions are a more complicated case than regular functions, as they have other "inputs" other than the parameters: the outer variables. >> So I think that if you implement partial pure semantics for nested functions, then you have it working for normal functions as well. > > I don't think so, in practice. Nested functions are a lot easier to implement, since while compiling the outer function, the compiler always has access to the complete code of the inner function and can confirm that the outer function remains pure. It's impossible to change the nested function in a way that accidentally violates purity of the outer function -- it just won't compile. > > But a regular function could appear in a completely different library, so the contextually purity needs to be enforced, and this would need a language change. > I state that if you have 'pure' implemented (in a sensible manner), it takes zero effort to implement partial/contextual pure. The definition of partial pure functions requires no extra effort from the compiler to check. (The *same* check for normal pure functions is made: that no external state is accessed unless invariant) The usage of partial pure functions also requires no extra effort. Inside a pure function body, the compiler already has to check if no external, non-invariant variables are accessed, like this: Foo globalVar; pure void func(...) { localVar = globalVar + 1; // not allowed then just apply this same check on *any* expression, including function call expressions: partiallyPureFunction(localVar, globalVar); // not allowed partiallyPureFunction(localVar, localVar2); // but this is allowed -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D | |||
May 01, 2008 Re: Idea: partially pure functions | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Bruno Medeiros | "Bruno Medeiros" wrote > Steven Schveighoffer wrote: >> >> However, I tend to agree with Don. What you want is a relaxation of the rules, which can easily happen after pure functions are supported. In fact, you don't even need any new keywords or syntax, the compiler just implies the partial purity from the parameter/return types. >> > > The compiler can only infer partial purity if it has access to the function body (just the same as it can infer normal purity). But if you only have a function signature, you need a keyword. I meant an *extra* keyword. If you mark a function as pure, it can be partially pure or fully pure depending on the arguments. However, having an extra keyword lets the developer decide whether he is expecting a function to be partially pure or not, so there are no surprises :) I'd vote for a new keyword, but this does not need to happen right away. > >> As for your idea, to be complete, I'd say there are different levels of partially pure functions. >> >> level 1: mutable return value, but const/invariant parameters. These functions can be re-ordered, and can be called from pure or unpure functions. The result cannot be memoized. > > This is already a regular pure function. The idea that a pure function has to return an invariant is a misconception, it can safely return mutable data. It can me memoized just the same (although a copy would have to be made if the result is not invariant) I would hope that the compiler would not memoize if I returned a mutable piece of data, as I would generally expect to use this function as a way to initialize some data that I want to build with (in a pure function or otherwise). If I wanted it to memoize, I can return invariant. For example, to memoize on 'new' would be a waste, but 'new' can be considered a pure (arguably partially pure) function. Sometimes memoization is not desirable, especially if you are rarely calling the pure function with the same arguments. I guess it all depends on what you consider partially pure :) I believe the current rules as stated on D's website require invariant return data, or data that can be implicitly cast to invariant. -Steve | |||
May 01, 2008 Re: Idea: partially pure functions | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Bruno Medeiros | "Bruno Medeiros" wrote
> I state that if you have 'pure' implemented (in a sensible manner), it takes zero effort to implement partial/contextual pure.
>
> The definition of partial pure functions requires no extra effort from the compiler to check. (The *same* check for normal pure functions is made: that no external state is accessed unless invariant)
>
> The usage of partial pure functions also requires no extra effort. Inside a pure function body, the compiler already has to check if no external, non-invariant variables are accessed, like this:
>
> Foo globalVar;
>
> pure void func(...) {
> localVar = globalVar + 1; // not allowed
>
> then just apply this same check on *any* expression, including function call expressions:
>
> partiallyPureFunction(localVar, globalVar); // not allowed
> partiallyPureFunction(localVar, localVar2); // but this is allowed
What if localVar is a pointer to global data, or transitively contains a pointer to global data? There is no way to know whether localVar is a pointer to global data or 'contained' heap data without lots of context checking. If partially pure functions with mutable parameters were allowed, I'd say it's a requirement that you can only call them from partially or fully pure functions, as local pointers from those functions are guaranteed to be contained within the function.
-Steve
| |||
May 01, 2008 Re: Idea: partially pure functions | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | Steven Schveighoffer wrote: > "Bruno Medeiros" wrote >> I state that if you have 'pure' implemented (in a sensible manner), it takes zero effort to implement partial/contextual pure. >> >> The definition of partial pure functions requires no extra effort from the compiler to check. (The *same* check for normal pure functions is made: that no external state is accessed unless invariant) >> >> The usage of partial pure functions also requires no extra effort. Inside a pure function body, the compiler already has to check if no external, non-invariant variables are accessed, like this: >> >> Foo globalVar; >> >> pure void func(...) { >> localVar = globalVar + 1; // not allowed >> >> then just apply this same check on *any* expression, including function call expressions: >> >> partiallyPureFunction(localVar, globalVar); // not allowed >> partiallyPureFunction(localVar, localVar2); // but this is allowed > > What if localVar is a pointer to global data, or transitively contains a pointer to global data? There is no way to know whether localVar is a pointer to global data or 'contained' heap data without lots of context checking. Whatever checks you have to make for localVar, you still have to do them with normal pure. (And not that it matters to the point, how do you initialize a local var with global data?) > If partially pure functions with mutable parameters were allowed, I'd say it's a requirement that you can only call them from partially or fully pure functions, as local pointers from those functions are guaranteed to be contained within the function. > I'm not sure I understand what you're saying. Are you saying that partially pure functions should not be called from *non-pure* functions? What's the sense in that? -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D | |||
May 01, 2008 Re: Idea: partially pure functions | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Bruno Medeiros | "Bruno Medeiros" wrote
> Steven Schveighoffer wrote:
>> "Bruno Medeiros" wrote
>>> I state that if you have 'pure' implemented (in a sensible manner), it takes zero effort to implement partial/contextual pure.
>>>
>>> The definition of partial pure functions requires no extra effort from the compiler to check. (The *same* check for normal pure functions is made: that no external state is accessed unless invariant)
>>>
>>> The usage of partial pure functions also requires no extra effort. Inside a pure function body, the compiler already has to check if no external, non-invariant variables are accessed, like this:
>>>
>>> Foo globalVar;
>>>
>>> pure void func(...) {
>>> localVar = globalVar + 1; // not allowed
>>>
>>> then just apply this same check on *any* expression, including function call expressions:
>>>
>>> partiallyPureFunction(localVar, globalVar); // not allowed
>>> partiallyPureFunction(localVar, localVar2); // but this is allowed
>>
>> What if localVar is a pointer to global data, or transitively contains a pointer to global data? There is no way to know whether localVar is a pointer to global data or 'contained' heap data without lots of context checking.
>
> Whatever checks you have to make for localVar, you still have to do them with normal pure. (And not that it matters to the point, how do you initialize a local var with global data?)
>
>> If partially pure functions with mutable parameters were allowed, I'd say it's a requirement that you can only call them from partially or fully pure functions, as local pointers from those functions are guaranteed to be contained within the function.
>>
>
> I'm not sure I understand what you're saying. Are you saying that partially pure functions should not be called from *non-pure* functions? What's the sense in that?
From the digitalmars web site:
"A pure function can throw exceptions and allocate memory via a NewExpression."
So in the case of a partially pure function that takes a char array:
pure void f(char[] x) {...}
Let's say I have a non-pure function g:
void g()
{
char[] v;
...
f(v);
}
In the ..., v can be set to local mutable heap data (via new char[x] or whatever), or it could be set to point to a global mutable string. How can the compiler be sure without severe context analysis? The point is, in order to ensure context-free lexical analysis by the compiler, it must disallow calling f from a non-pure function.
Contrast that from a pure version of g:
pure void g()
{
char[] v;
...
f(v);
}
Because g is now pure, it is a requirement that v cannot be assigned to some global variable. The fact that g is pure ensures that f can safely be called. v can only be set by calling a new expression.
-Steve
| |||
May 01, 2008 Re: Idea: partially pure functions | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | Steven Schveighoffer wrote: > "Bruno Medeiros" wrote >> Steven Schveighoffer wrote: >>> However, I tend to agree with Don. What you want is a relaxation of the rules, which can easily happen after pure functions are supported. In fact, you don't even need any new keywords or syntax, the compiler just implies the partial purity from the parameter/return types. >>> >> The compiler can only infer partial purity if it has access to the function body (just the same as it can infer normal purity). But if you only have a function signature, you need a keyword. > > I meant an *extra* keyword. If you mark a function as pure, it can be partially pure or fully pure depending on the arguments. However, having an extra keyword lets the developer decide whether he is expecting a function to be partially pure or not, so there are no surprises :) I'd vote for a new keyword, but this does not need to happen right away. > That decision of "whether he is expecting a function to be partially pure or " pure is not a useful decision to have to make, it's redundant (and thus cumbersome) to ask the programmer to specify that. Also, having two different keywords would impossibilitate or difficult templating of constness (like with the "scoped const" proposal). Just think of 'pure' as meaning: this function does not depend on, or modify, external state. (external state being any data other than the parameters) >>> As for your idea, to be complete, I'd say there are different levels of partially pure functions. >>> >>> level 1: mutable return value, but const/invariant parameters. These functions can be re-ordered, and can be called from pure or unpure functions. The result cannot be memoized. >> This is already a regular pure function. The idea that a pure function has to return an invariant is a misconception, it can safely return mutable data. It can me memoized just the same (although a copy would have to be made if the result is not invariant) > > I would hope that the compiler would not memoize if I returned a mutable piece of data, as I would generally expect to use this function as a way to initialize some data that I want to build with (in a pure function or otherwise). If I wanted it to memoize, I can return invariant. For example, to memoize on 'new' would be a waste, but 'new' can be considered a pure (arguably partially pure) function. Sometimes memoization is not desirable, especially if you are rarely calling the pure function with the same arguments. > I said the compiler *could* memoize the result just the same. Doesn't mean it *should* memoize if the result is mutable. As a matter of fact, even if the result is invariant, doesn't mean the compiler should memoize it. Memoization is a complicated optimization that likely only the user knows when it should be applied, not the compiler. > I guess it all depends on what you consider partially pure :) I believe the current rules as stated on D's website require invariant return data, or data that can be implicitly cast to invariant. > > -Steve > > In Andrei's latest presentation (http://www.digitalmars.com/d/2.0/accu-functional.pdf) when he lists the requirements for pure, it does not say that is should return invariant data. (and I'm assuming those requirements are complete) -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D | |||
May 01, 2008 Re: Idea: partially pure functions | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Steven Schveighoffer | Steven Schveighoffer wrote: > > > So in the case of a partially pure function that takes a char array: > > pure void f(char[] x) {...} > > Let's say I have a non-pure function g: > > void g() > { > char[] v; > ... > f(v); > } > > In the ..., v can be set to local mutable heap data (via new char[x] or whatever), or it could be set to point to a global mutable string. How can the compiler be sure without severe context analysis? The point is, in order to ensure context-free lexical analysis by the compiler, it must disallow calling f from a non-pure function. > Huh? You're mixing things up. Yes, if g() is not pure, then the compiler will not check if v points to global data or not. Thus we cannot guarantee that calling "f(v)" will not change global state. But so what? That doesn't mean we shouldn't allow it. It just means such call may have side-effects, and the compiler should threat that as a normal function call (make no particular pure optimizations). Thus "partially" pure: pure in some contexts, impure in other contexts. -- Bruno Medeiros - Software Developer, MSc. in CS/E graduate http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D | |||
May 01, 2008 Re: Idea: partially pure functions | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Bruno Medeiros | "Bruno Medeiros" wrote > Steven Schveighoffer wrote: >> "Bruno Medeiros" wrote >>> Steven Schveighoffer wrote: >>>> However, I tend to agree with Don. What you want is a relaxation of the rules, which can easily happen after pure functions are supported. In fact, you don't even need any new keywords or syntax, the compiler just implies the partial purity from the parameter/return types. >>>> >>> The compiler can only infer partial purity if it has access to the function body (just the same as it can infer normal purity). But if you only have a function signature, you need a keyword. >> >> I meant an *extra* keyword. If you mark a function as pure, it can be partially pure or fully pure depending on the arguments. However, having an extra keyword lets the developer decide whether he is expecting a function to be partially pure or not, so there are no surprises :) I'd vote for a new keyword, but this does not need to happen right away. >> > > That decision of "whether he is expecting a function to be partially pure or " pure is not a useful decision to have to make, it's redundant (and thus cumbersome) to ask the programmer to specify that. It's not possible to call a partially pure function with mutable arguments from a non-pure function. Doing so would be a nightmare for the compiler. So from a maintainability perspective, changing a function from fully pure to partially pure changes the contract that the function provides. However, changing from fully pure to partially pure involves changing an argument type, so in essence, code that used the fully pure function would break anyways... Having an extra keyword for 'partially pure' is probably not necessary, so I agree with you now. > > Also, having two different keywords would impossibilitate or difficult templating of constness (like with the "scoped const" proposal). > > Just think of 'pure' as meaning: this function does not depend on, or modify, external state. (external state being any data other than the parameters) For D, there is also the notion that a pure function will not be affected by it's parameters changing mid-way through the function. >>>> As for your idea, to be complete, I'd say there are different levels of partially pure functions. >>>> >>>> level 1: mutable return value, but const/invariant parameters. These functions can be re-ordered, and can be called from pure or unpure functions. The result cannot be memoized. >>> This is already a regular pure function. The idea that a pure function has to return an invariant is a misconception, it can safely return mutable data. It can me memoized just the same (although a copy would have to be made if the result is not invariant) >> >> I would hope that the compiler would not memoize if I returned a mutable piece of data, as I would generally expect to use this function as a way to initialize some data that I want to build with (in a pure function or otherwise). If I wanted it to memoize, I can return invariant. For example, to memoize on 'new' would be a waste, but 'new' can be considered a pure (arguably partially pure) function. Sometimes memoization is not desirable, especially if you are rarely calling the pure function with the same arguments. >> > > I said the compiler *could* memoize the result just the same. Doesn't mean > it *should* memoize if the result is mutable. > As a matter of fact, even if the result is invariant, doesn't mean the > compiler should memoize it. Memoization is a complicated optimization that > likely only the user knows when it should be applied, not the compiler. I suspect the goal of Walter is to not have the developer make the decision to memoize... Similar to how we do not have the ability to force inlining, but we have the ability to force NOT inlining (by making a function opaque). I believe making a pure function return mutable heap data should be a signal that the compiler does not memoize the result. >> I guess it all depends on what you consider partially pure :) I believe the current rules as stated on D's website require invariant return data, or data that can be implicitly cast to invariant. > > In Andrei's latest presentation > (http://www.digitalmars.com/d/2.0/accu-functional.pdf) > when he lists the requirements for pure, it does not say that is should > return invariant data. (and I'm assuming those requirements are complete) Those requirements are not complete. For instance, the web site says that pure functions can call new expressions, but that is not stated in the pdf (or at least, I don't remember seeing it). I think it will eventually be stated that pure functions initially will have to return invariant data, or data that can be implicitly cast to invariant. Unless I can convince Andrei otherwise :) -Steve | |||
May 01, 2008 Re: Idea: partially pure functions | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Bruno Medeiros | "Bruno Medeiros" wrote
> Steven Schveighoffer wrote:
>>
>>
>> So in the case of a partially pure function that takes a char array:
>>
>> pure void f(char[] x) {...}
>>
>> Let's say I have a non-pure function g:
>>
>> void g()
>> {
>> char[] v;
>> ...
>> f(v);
>> }
>>
>> In the ..., v can be set to local mutable heap data (via new char[x] or whatever), or it could be set to point to a global mutable string. How can the compiler be sure without severe context analysis? The point is, in order to ensure context-free lexical analysis by the compiler, it must disallow calling f from a non-pure function.
>>
>
> Huh? You're mixing things up.
> Yes, if g() is not pure, then the compiler will not check if v points to
> global data or not. Thus we cannot guarantee that calling "f(v)" will not
> change global state. But so what?
> That doesn't mean we shouldn't allow it. It just means such call may have
> side-effects, and the compiler should threat that as a normal function
> call (make no particular pure optimizations). Thus "partially" pure: pure
> in some contexts, impure in other contexts.
OK, I think I see what the barrier here is. Walter and Andrei's version of pure is supposed to be for solving multithreading issues. There is an attribute of D pure functions that is inherently true in functional programming languages. Not only does a pure function have no side effects, but nothing outside the pure function can affect its execution. If v points to global data, and some other thread changes that data mid-way through the function execution, the function might crash or throw an exception because it is expecting to be the only one that can change it's own data. With that in mind, partially pure functions that have mutable parameters cannot have that attribute unless it is guaranteed that those mutable parameters won't be changed by anything outside the partially pure function. The only way to guarantee that is to only allow calling that function from another pure or partially pure function.
-Steve
| |||
May 01, 2008 Re: Idea: partially pure functions | ||||
|---|---|---|---|---|
| ||||
Posted in reply to Bruno Medeiros | "Bruno Medeiros" wrote
> Yes, if g() is not pure, then the compiler will not check if v points to
> global data or not. Thus we cannot guarantee that calling "f(v)" will not
> change global state. But so what?
> That doesn't mean we shouldn't allow it. It just means such call may have
> side-effects, and the compiler should threat that as a normal function
> call (make no particular pure optimizations). Thus "partially" pure: pure
> in some contexts, impure in other contexts.
I re-read this and I see what you are proposing.
I can see how that would work, but you would definitely need another keyword. Any time a developer sees the pure keyword, he will assume he doesn't need to worry about thread issues. However, a partially pure function would be subject to thread issues when not called from a pure function.
I still think it would be a worthwhile feature to have, as it would save on implementing lots of functionality more than once.
BTW, forget my other post, we were arguing about totally different things.
-Steve
| |||
Copyright © 1999-2021 by the D Language Foundation
Permalink
Reply