purity question (page 3)

Rene Zwanenburg wrote: > On Tuesday, 30 May 2017 at 11:34:52 UTC, ketmar wrote: >>> If malloc were marked as pure, wouldn't that mean it must return the same pointer every time you call it with the same size? >> >> of course. but D "pure" is not what other world knows as "pure". we love to mess with words. > > Well, there's the ability to modify non-const reference parameters from a pure function, but that's not applicable to malloc. Are there any other special rules? "pure" methods can mutate object state.

On Tuesday, 30 May 2017 at 13:45:07 UTC, Rene Zwanenburg wrote: > On Tuesday, 30 May 2017 at 11:34:52 UTC, ketmar wrote: >>> If malloc were marked as pure, wouldn't that mean it must return the same pointer every time you call it with the same size? >> >> of course. but D "pure" is not what other world knows as "pure". we love to mess with words. > > Well, there's the ability to modify non-const reference parameters from a pure function, but that's not applicable to malloc. Are there any other special rules? The rules[0] are: 0) Can't call functions not marked pure. 1) Can't touch anything mutable that's not explicitly passed to the pure function. 1b) Except GC internal state - i.e. memory can be allocated via the GC. 1c) And floating-point exception flags and modes. 2) Can't do I/O (can be seen as a special case of 1 and 0). There's a few more details, but that's the important stuff. For a good article on the subject, I recommend David Nadlinger's Purity in D: http://klickverbot.at/blog/2012/05/purity-in-d/ [0]: https://dlang.org/spec/function.html#pure-functions

On 05/30/2017 11:12 AM, Rene Zwanenburg wrote: > If malloc were marked as pure, wouldn't that mean it must return the same pointer every time you call it with the same size? D's `pure` mostly means: "does not access mutable state, and does not do input/output". There is never a requirement that a function must return the same value for the same input. But a compiler is allowed to memoize the result of a `pure` function when it has no mutable indirections in its parameter and return types. Such a function is "strongly pure". When there are mutable indirections, the function is "weakly pure". Weakly pure functions are not assumed to be memoizable, but "weakly pure" still has meaning: * Can call weakly pure functions from strongly pure ones. * When a weakly pure function has mutable indirections in the return type but not in the parameters (like malloc), then it must be returning freshly allocated memory. That means, the result cannot be referenced from anywhere else. So it can be converted to const/immutable/shared implicitly. The spec calls that a "pure factory function". Repeating Biotronic's links, the spec and David Nadlinger's article are the go-to resources for D's take on purity: https://dlang.org/spec/function.html#pure-functions http://klickverbot.at/blog/2012/05/purity-in-d/

On Tuesday, May 30, 2017 16:54:13 ag0aep6g via Digitalmars-d-learn wrote: > On 05/30/2017 11:12 AM, Rene Zwanenburg wrote: > > If malloc were marked as pure, wouldn't that mean it must return the same pointer every time you call it with the same size? > > D's `pure` mostly means: "does not access mutable state, and does not do input/output". > > There is never a requirement that a function must return the same value for the same input. But a compiler is allowed to memoize the result of a `pure` function when it has no mutable indirections in its parameter and return types. Such a function is "strongly pure". > > When there are mutable indirections, the function is "weakly pure". Weakly pure functions are not assumed to be memoizable, but "weakly pure" still has meaning: > > * Can call weakly pure functions from strongly pure ones. > > * When a weakly pure function has mutable indirections in the return type but not in the parameters (like malloc), then it must be returning freshly allocated memory. That means, the result cannot be referenced from anywhere else. So it can be converted to const/immutable/shared implicitly. The spec calls that a "pure factory function". > > Repeating Biotronic's links, the spec and David Nadlinger's article are the go-to resources for D's take on purity: > > https://dlang.org/spec/function.html#pure-functions http://klickverbot.at/blog/2012/05/purity-in-d/ Yeah, basically, D's pure was originally what is now sometimes called "strongly pure," which is quite close to functionally pure in that the same input results in the same output (it still allows alocating memory and returning it though, so you can have equal but different objects if the same function is called multiple times with the same arguments in different contexts where the compiler doesn't memoize it). However, pure was so restrictive as to be borderline useless, because it could only pass types that were immutable or implicitly convertible to immutable. So, it was expanded to include any function which could not access global, mutable state, because such functions can be called from a "strongly" pure function without violating the strongly pure function's guarantees - and thus "weakly" pure functions were born, making it so that D's pure is really more like @noglobal than functionally pure. It's a critical building block in functional purity, and strongly pure functions still get all of the benefits that they got before (and now they can actually be useful, because they can call many more functions), but _most_ pure functions are weakly pure and thus don't get the full benefits of functional purity - and that's why some folks like Ketmar tend to get annoyed with D's pure. But the way it is is actually quite useful even if it's initially confusing. And even when there are no strongly pure functions in your program, knowing that a function cannot access global variables except through its arguments means a lot in and of itself. - Jonathan M Davis

May 30, 2017

Re: purity question

Posted by H. S. Teoh

Permalink

H. S. Teoh

Permalink

On Tue, May 30, 2017 at 11:10:19AM -0700, Jonathan M Davis via Digitalmars-d-learn wrote: [...]
> Yeah, basically, D's pure was originally what is now sometimes called "strongly pure," which is quite close to functionally pure in that the same input results in the same output (it still allows alocating memory and returning it though, so you can have equal but different objects if the same function is called multiple times with the same arguments in different contexts where the compiler doesn't memoize it). However, pure was so restrictive as to be borderline useless, because it could only pass types that were immutable or implicitly convertible to immutable. So, it was expanded to include any function which could not access global, mutable state, because such functions can be called from a "strongly" pure function without violating the strongly pure function's guarantees - and thus "weakly" pure functions were born, making it so that D's pure is really more like @noglobal than functionally pure. It's a critical building block in functional purity, and strongly pure functions still get all of the benefits that they got before (and now they can actually be useful, because they can call many more functions), but _most_ pure functions are weakly pure and thus don't get the full benefits of functional purity - and that's why some folks like Ketmar tend to get annoyed with D's pure. But the way it is is actually quite useful even if it's initially confusing. And even when there are no strongly pure functions in your program, knowing that a function cannot access global variables except through its arguments means a lot in and of itself.
[...]

In retrospect, we could have named "weakly pure" something like @noglobal, and "strongly pure" as pure.

But be that as it may, I think D's pure system is actually extremely ingenious.  Look at it this way: in the classical sense of functional purity, which is what you get in (pure) functional languages, it can get quite difficult to implement the functionality you want, because the language enforces that every primitive you use in your implementation must be pure.  The reasoning is that if you're only allowed to use pure primitives, then the resulting function is guaranteed to be pure.  It's nice and simple. However, it's also rather cumbersome, especially in the context of an imperative language like D.

For example, in a functional language you cannot assign new values to variables, and you cannot write for-loops, because the loop index is not allowed to mutate. You cannot mutate anything, because mutation makes the code impure. So instead of straightforward loops, you need to resort to things like tail recursion; instead of mutation, you need to use monads, and so on.  I'm not saying this is a bad thing, but it's just cumbersome because you, the programmer, cannot simply implement a function with a straightforward algorithm, but you have to work harder to express the algorithm in functional terms using recursion and other functional (pure) constructs.

The first insight in D's purity system is the observation that, given some function f(), from the caller's POV all they care about is that (1) the function always returns the same value given the same arguments, and (2) there are no visible side-effects.  Point (2) is where the insight lies: in a sense, it *doesn't matter* if f() does all kinds of impure stuff in order to compute its return value, as long as the outside world cannot see it.  It may be internally impure, but externally, as far as the outside world is concerned, it's pure, because the caller can't tell the difference.  So D allows things like variables, mutation, for-loops, and all kinds of stuff inside (strongly) pure functions -- as long as no global state is touched, and as long as the function arguments aren't modified (from the caller's POV), f() is pure.  In other words, as long as f() keeps its dirty laundry to itself and leaves the outside world untouched, it is externally pure, even if it's internally impure.

The next insight is this: suppose f() calls g(), and g() is impure.  In
the traditional functional language purity system, this is outright
illegal, because a pure function, by definition, cannot call an impure
function, otherwise it is itself impure.  However, suppose g() does not
modify any global state.  It *may* modify stuff through its arguments --
which makes it impure.  But if f() is not allowed to modify its
arguments and has no global state, then at worst, it can only pass its
internal, non-global state to g(). Therefore, it is impossible for g()
to reach global state through its arguments. Thus, f()'s (external)
purity is preserved.

Note the fine distinction here, that g()'s external purity is keyed on
being called from inside a strongly-pure function f(). If f() is impure,
then there is no guarantee that g() may modify global state (because f()
may pass a reference to a global variable to g(), and g() modifies that
global via the reference).

In D terminology, we say g() is "weakly pure" -- its purity is dependent
on the context in which it's called.  We say f() is "strongly pure",
because it is (externally) pure in the traditional functional sense that
its return value is cacheable and it has no (observable) side-effects.

(This similar to nothrow functions being allowed to throw exceptions internally -- as long as the exceptions are caught before the function returns.  Internally, the function does throw exceptions; but externally it's nothrow because the caller cannot ever see said exceptions.)

//

Why go through all of this complicated reasoning?  Because it expands the horizons of what a strongly-pure function can do, *while still remaining externally pure*.  A strongly pure function is now allowed to call a weakly-pure (i.e., @noglobal) function, rather than being restricted to only calling other strongly-pure functions.  Allowing more functions to be callable from inside a strongly-pure function means we can do much more inside a strongly-pure function, without breaking its (external) purity. So we can still enjoy the benefits of strong (classical) purity, like cacheability, etc., without having our hands tied in the implementation of the function.

It's true that the `pure` keyword can be misleading for newcomers, because unfortunately, due to historical reasons, `pure` is used for both "strongly pure" and "weakly pure" functions.  In fact, in the current compiler implementation, `pure` only means "weakly pure"; it's just that the compiler only applies optimizations afforded by strong purity when the function arguments are immutable or implicitly convertible to immutable.  If we could have redesigned the language, we could have called it by a less confusing name, like @noglobal, with `pure` reserved for strongly-pure functions.


T

-- 
An elephant: A mouse built to government specifications. -- Robert Heinlein

Forums