purity and memory allocations/pointers - D Programming Language Discussion Forum

August 03, 2013

purity and memory allocations/pointers

Posted by monarch_dodra

I'm a bit confused about where we draw the line with purity, and memory.

For example, take this "simple" function:

//----
char[] getMutableHello() pure
{
    return "hello".dup;
}
//----

This seems like it will *always* create the same result, yet at the same time, it calls the GC. "dup" could fail (among other things), which would seem to break purity.

Is this OK?

:::::::::::::::::::::::::::::::::::::::::

What does purity mean when you are talking about slices/pointers? Simply that they *point*/*reference* the same payload? I mean:

void main()
{
    auto a = getMutableHello();
    auto b = getMutableHello();
    assert(a == b); //OK
    assert(a is b); //Derp :/
}

Did getMutableHello just violate its purity promise?

:::::::::::::::::::::::::::::::::::::::::

Another thing that bothers me, again, when interfacing with the GC, is "capacity". It is currently marked as pure, but I believe this is wrong: The GC can in-place extend the "useable" memory of a dynamic array. This means all referencing slices will be affected, yet un-modified. This means that calling "s.capacity" depends on the global state of the GC:

//----
    auto s1 = new ubyte[](12_000);
    auto bck = s1;
    size_t cap = s1.capacity;
    s1.reserve(13_000);
    assert(s1 is bck); //Apparently, s1 is not changed.
    assert(cap == s1.capacity); //This fails, yet capacity is pure?
//----

Again, is this OK?

:::::::::::::::::::::::::::::::::::::::::

One last question: Pointers.

int get(int* p) pure
{
    return *p;
}

void main()
{
    int i = 0;
    auto p = &i;
    get(p);
}

Here, get, to me, is obviously not pure, since it depends on the state of the global "i". *Where* did "get" go wrong? Did I simply "abusively" mark get as pure? Is the "pure" keyword's guarantee simply "weak"?

:::::::::::::::::::::::::::::::::::::::::

To sum up the question, regardless of how the *keyword* pure currently works, my question is "*How* should it behave"? We're currently marking things in Phobos as pure, and I'm worried the only reason we are doing it is that the keyword is lenient, and said functions *aren't* actually pure...

August 03, 2013

Re: purity and memory allocations/pointers

Posted by Timon Gehr
in reply to monarch_dodra

On 08/03/2013 05:59 PM, monarch_dodra wrote:
> I'm a bit confused about where we draw the line with purity, and memory.
>
> For example, take this "simple" function:
>
> //----
> char[] getMutableHello() pure
> {
>      return "hello".dup;
> }
> //----
>
> This seems like it will *always* create the same result, yet at the same
> time, it calls the GC. "dup" could fail, amongst others, meaning there
> is failed purity.
>
> Is this OK?
>

Since a pure function call can overflow the stack, I don't really see this particular point as an issue. (D is Turing complete, unlike the machines it runs on.)

> :::::::::::::::::::::::::::::::::::::::::
>
> What does purity mean when you are talking about slices/pointers?
> Simply that they *point*/*reference* the same payload? I mean:
>
> void main()
> {
>      auto a = getMutableHello();
>      auto b = getMutableHello();
>      assert(a == b); //OK
>      assert(a is b); //Derp :/
> }
>
> Did getMutableHello just violate its purity promise?
> ...

Well, no. It is by design. All purity really means is that the part of the existing store not reachable by following references in the arguments will not be mutated.
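
A minimal sketch of that rule, for illustration only. The names `counter` and `bump` are hypothetical and not from the thread. The pure function may mutate what it reaches through its argument, but module-level mutable state is off limits:

```d
int counter; // module-level mutable (thread-local) state, hypothetical

// Weakly pure: may read and write whatever is reachable through `p`,
// but may not touch `counter` or any other mutable global.
void bump(int* p) pure
{
    ++*p;
    // ++counter; // uncommenting this should be rejected by the compiler
}

void main()
{
    int local = 0;
    bump(&local);
    assert(local == 1);   // mutated through the argument
    assert(counter == 0); // store not reachable from the argument is untouched
}
```

The caller decides what the function can reach by choosing what to pass in.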

What is unfortunate in a sense is that 'is' can be used on immutable references. This often precludes common subexpression elimination for pure function calls even though such an optimization would be valid otherwise.

> :::::::::::::::::::::::::::::::::::::::::
>
> Another thing that bothers me, again, when interfacing with the GC, is
> "capacity". It is currently marked as pure, but I believe this is wrong:
> The GC can in-place extend the "useable" memory of a dynamic array. This
> means all referencing slices will be affected, yet un-modified. This
> means that calling "s.capacity" depends on the global state of the GC:
>
> //----
>      auto s1 = new ubyte[](12_000);
>      auto bck = s1;
>      size_t cap = s1.capacity;
>      s1.reserve(13_000);
>      assert(s1 is bck); //Apparently, s1 is not changed.
>      assert(cap == s1.capacity); //This fails, yet capacity is pure?
> //----
>
> Again, is this OK?
> ...

Well, not really.

IMO the issue here is that if D had a formal semantics, it would likely be nondeterministic due to the GC interface. To make the execution of pure functions deterministic (which is not currently part of their guarantees), all operations that can witness nondeterminism would need to be banned in pure functions (e.g. ~=, ordering addresses, traversing hash tables, etc.). It is not really possible to get rid of the nondeterminism completely while keeping any kind of allocation, since allocations return nondeterministic memory addresses. (Unless the formal semantics fully specifies the GC implementation. :o) )

Another stance that can be taken is that it is not really global state, since it is referenced by the slice. Then the above would be ok. (But semantically ugly.) Under that point of view, an issue is that capacity violates the transitivity of immutable.

> :::::::::::::::::::::::::::::::::::::::::
>
> One last question: Pointers.
>
> int get(int* p) pure
> {
>      return *p;
> }
>
> void main()
> {
>      int i = 0;
>      auto p = &i;
>      get(p);
> }
>
> Here, get, to me, is obviously not pure, since it depends on the state
> of the global "i". *Where* did "get" go wrong? Did I simply "abusively"
> mark get as pure? Is the "pure" keyword's guarantee simply "weak"?
> ...

Yes, it's weak.

> :::::::::::::::::::::::::::::::::::::::::
>
> To sum up the question, regardless of how the *keyword* pure currently
> works, my question is "*How* should it behave"? We're currently marking
> things in Phobos as pure, and I'm worried the only reason we are doing
> it is that the keyword is lenient, and said functions *aren't* actually
> pure...

I guess the requirement that a pure function cannot witness its own nondeterminism has to be added to make the keyword legitimate.

August 03, 2013

Re: purity and memory allocations/pointers

Posted by anonymous
in reply to monarch_dodra

On Saturday, 3 August 2013 at 15:59:22 UTC, monarch_dodra wrote:
> I'm a bit confused about where we draw the line with purity, and memory.
>
> For example, take this "simple" function:
>
> //----
> char[] getMutableHello() pure
> {
>     return "hello".dup;
> }
> //----
>
> This seems like it will *always* create the same result, yet at the same time, it calls the GC. "dup" could fail, amongst others, meaning there is failed purity.
>
> Is this OK?

When `dup` fails, some Exception/Error is thrown and the function won't return. So, purity is not affected, I think. There's also `nothrow` when you want to rule out exceptions.

>
> :::::::::::::::::::::::::::::::::::::::::
>
> What does purity mean when you are talking about slices/pointers? Simply that they *point*/*reference* the same payload? I mean:
>
> void main()
> {
>     auto a = getMutableHello();
>     auto b = getMutableHello();
>     assert(a == b); //OK
>     assert(a is b); //Derp :/
> }
>
> Did getMutableHello just violate its purity promise?

I'd say `pure` doesn't promise equality of memory addresses. I imagine it would be rather annoying if you couldn't `new` in pure functions.
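
A small sketch of what that means in practice, reusing `getMutableHello` from the question. The `string s` line relies on my understanding that the result of a pure function without mutable-indirection parameters converts implicitly to immutable, so treat it as an assumption to check against your compiler:

```d
char[] getMutableHello() pure
{
    return "hello".dup; // fresh GC allocation on every call
}

void main()
{
    auto a = getMutableHello();
    auto b = getMutableHello();
    assert(a == b);   // same contents
    assert(a !is b);  // different memory blocks

    a[0] = 'j';       // each result is independent of the other
    assert(b == "hello");

    // Assumption: the freshly allocated result of a pure function can be
    // treated as unique, so it converts to immutable without a cast.
    string s = getMutableHello();
    assert(s == "hello");
}
```

Value equality is what stays stable across calls; address identity is not part of the promise.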

>
> :::::::::::::::::::::::::::::::::::::::::
>
> Another thing that bothers me, again, when interfacing with the GC, is "capacity". It is currently marked as pure, but I believe this is wrong: The GC can in-place extend the "useable" memory of a dynamic array. This means all referencing slices will be affected, yet un-modified. This means that calling "s.capacity" depends on the global state of the GC:
>
> //----
>     auto s1 = new ubyte[](12_000);
>     auto bck = s1;
>     size_t cap = s1.capacity;
>     s1.reserve(13_000);
>     assert(s1 is bck); //Apparently, s1 is not changed.
>     assert(cap == s1.capacity); //This fails, yet capacity is pure?
> //----
>
> Again, is this OK?

When you understand `capacity` as being referenced by the arrays, it's ok. Because then, of course `s1 is bck` even though its capacity changed. And `cap == s1.capacity` fails, because `cap` is a copy of an earlier state.

>
> :::::::::::::::::::::::::::::::::::::::::
>
> One last question: Pointers.
>
> int get(int* p) pure
> {
>     return *p;
> }
>
> void main()
> {
>     int i = 0;
>     auto p = &i;
>     get(p);
> }
>
> Here, get, to me, is obviously not pure, since it depends on the state of the global "i". *Where* did "get" go wrong? Did I simply "abusively" mark get as pure? Is the "pure" keyword's guarantee simply "weak"?

By passing a pointer to `i`, you explicitly add it to the list of things `get` may access (and alter).

Indeed, `pure` by itself only guarantees "weak purity". That means, the function may alter its arguments (and everything reachable from them). To get "strong purity", mark all arguments as const.
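
A rough sketch of the difference, with hypothetical function names (`getWeak`, `getConst`, `getStrong`). One caveat, as far as I understand the language rules: `const` stops the function from writing through the argument, but the caller can still change the pointee between calls, so it takes `immutable` (or value-only) parameters for a call's result to depend on nothing that can change:

```d
// Weakly pure: mutable pointer parameter, may read and write the pointee.
int getWeak(int* p) pure { return *p; }

// const parameter: cannot write through `p`, but the pointee may still
// be changed by someone else between two calls.
int getConst(const(int)* p) pure { return *p; }

// immutable parameter: the pointee can never change, so the result
// depends only on the argument.
int getStrong(immutable(int)* p) pure { return *p; }

void main()
{
    int i = 1;
    assert(getWeak(&i) == 1);
    assert(getConst(&i) == 1);
    i = 2;                      // caller mutates the pointee
    assert(getConst(&i) == 2);  // same pointer, different result

    immutable int k = 7;
    assert(getStrong(&k) == 7);
    // getStrong(&i); // should not compile: int* is not immutable(int)*
}
```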

>
> :::::::::::::::::::::::::::::::::::::::::
>
> To sum up the question, regardless of how the *keyword* pure currently works, my question is "*How* should it behave"? We're currently marking things in Phobos as pure, and I'm worried the only reason we are doing it is that the keyword is lenient, and said functions *aren't* actually pure...

I think all cases you showed work as expected. For `pure` to really shine (strong purity), you need to add some `const` into the mix.

See also: http://dlang.org/function.html#pure-functions

August 03, 2013

Re: purity and memory allocations/pointers

Posted by Meta
in reply to Timon Gehr

On Saturday, 3 August 2013 at 16:47:52 UTC, Timon Gehr wrote:
> On 08/03/2013 05:59 PM, monarch_dodra wrote:
>> One last question: Pointers.
>>
>> int get(int* p) pure
>> {
>>     return *p;
>> }
>>
>> void main()
>> {
>>     int i = 0;
>>     auto p = &i;
>>     get(p);
>> }
>>
>> Here, get, to me, is obviously not pure, since it depends on the state
>> of the global "i". *Where* did "get" go wrong? Did I simply "abusively"
>> mark get as pure? Is the "pure" keyword's guarantee simply "weak"?
>> ...
>
> Yes, it's weak.

It depends on whether you think a pointer dereference is pure or not (I don't know the answer). That aside, as long as get doesn't modify the value at *p or change what p points to, this is strongly pure (i.e., the academic definition of purity).

August 03, 2013

Re: purity and memory allocations/pointers

Posted by monarch_dodra
in reply to Meta

On Saturday, 3 August 2013 at 19:07:49 UTC, Meta wrote:
> On Saturday, 3 August 2013 at 16:47:52 UTC, Timon Gehr wrote:
>> On 08/03/2013 05:59 PM, monarch_dodra wrote:
>>> One last question: Pointers.
>>>
>>> int get(int* p) pure
>>> {
>>>    return *p;
>>> }
>>>
>>> void main()
>>> {
>>>    int i = 0;
>>>    auto p = &i;
>>>    get(p);
>>> }
>>>
>>> Here, get, to me, is obviously not pure, since it depends on the state
>>> of the global "i". *Where* did "get" go wrong? Did I simply "abusively"
>>> mark get as pure? Is the "pure" keyword's guarantee simply "weak"?
>>> ...
>>
>> Yes, it's weak.
>
> It depends on whether you think a pointer dereference is pure or not (I don't know the answer). That aside, as long as get doesn't modify the value at *p or change what p points to, this is strongly pure (i.e., the academic definition of purity).

Thank the 3 of you for your answers. I think I had a wrong preconception of what pure is. I think this cleared most of it up.

August 03, 2013

Re: purity and memory allocations/pointers

Posted by Lars T. Kyllingstad
in reply to monarch_dodra

On Saturday, 3 August 2013 at 21:19:35 UTC, monarch_dodra wrote:

> Thank the 3 of you for your answers. I think I had a wrong preconception of what pure is. I think this cleared most of it up.

You may also find this discussion interesting, if you haven't already seen it:

http://forum.dlang.org/thread/i7bp8o$6po$1@digitalmars.com

It basically led to our current definition of "pure".

August 03, 2013

Re: purity and memory allocations/pointers

Posted by Timon Gehr
in reply to Meta

On 08/03/2013 09:07 PM, Meta wrote:
> On Saturday, 3 August 2013 at 16:47:52 UTC, Timon Gehr wrote:
>> On 08/03/2013 05:59 PM, monarch_dodra wrote:
>>> One last question: Pointers.
>>>
>>> int get(int* p) pure
>>> {
>>>     return *p;
>>> }
>>>
>>> void main()
>>> {
>>>     int i = 0;
>>>     auto p = &i;
>>>     get(p);
>>> }
>>>
>>> Here, get, to me, is obviously not pure, since it depends on the state
>>> of the global "i". *Where* did "get" go wrong? Did I simply "abusively"
>>> mark get as pure? Is the "pure" keyword's guarantee simply "weak"?
>>> ...
>>
>> Yes, it's weak.
>
> It depends on whether you think a pointer dereference is pure or not  (I
> don't know the answer).

It is pure in D, but I guess you are not referring to that.
What's your understanding of purity in this context?

> That aside, as long as get doesn't modify the
> value at *p or change what p points to, this is strongly pure

Modification and dereference within a Haskell expression:

import Data.STRef
import Control.Monad.ST

x = runST $ do
  x <- newSTRef 0
  writeSTRef x 1
  v <- readSTRef x
  return v

main = print x

> (i.e., the academic definition of purity).

I wouldn't go that far.

August 03, 2013

Re: purity and memory allocations/pointers

Posted by John Colvin
in reply to monarch_dodra

On Saturday, 3 August 2013 at 21:19:35 UTC, monarch_dodra wrote:
> On Saturday, 3 August 2013 at 19:07:49 UTC, Meta wrote:
>> On Saturday, 3 August 2013 at 16:47:52 UTC, Timon Gehr wrote:
>>> On 08/03/2013 05:59 PM, monarch_dodra wrote:
>>>> One last question: Pointers.
>>>>
>>>> int get(int* p) pure
>>>> {
>>>>   return *p;
>>>> }
>>>>
>>>> void main()
>>>> {
>>>>   int i = 0;
>>>>   auto p = &i;
>>>>   get(p);
>>>> }
>>>>
>>>> Here, get, to me, is obviously not pure, since it depends on the state
>>>> of the global "i". *Where* did "get" go wrong? Did I simply "abusively"
>>>> mark get as pure? Is the "pure" keyword's guarantee simply "weak"?
>>>> ...
>>>
>>> Yes, it's weak.
>>
>> It depends on whether you think a pointer dereference is pure or not (I don't know the answer). That aside, as long as get doesn't modify the value at *p or change what p points to, this is strongly pure (i.e., the academic definition of purity).
>
> Thank the 3 of you for your answers. I think I had a wrong preconception of what pure is. I think this cleared most of it up.

Is there anywhere formal defining D's pure (weak vs strong etc.)? A page in the wiki perhaps?

Imagine someone new coming to D and being confused by what our purity system is. It would suck to only be able to give an ad-hoc answer or link them to a previous discussion.

I would offer but I don't really understand it myself.

August 04, 2013

Re: purity and memory allocations/pointers

Posted by Meta
in reply to Timon Gehr

On Saturday, 3 August 2013 at 23:04:14 UTC, Timon Gehr wrote:
>> It depends on whether you think a pointer dereference is pure or not  (I don't know the answer).
>
> It is pure in D, but I guess you are not referring to that.
> What's your understanding of purity in this context?

I'm thinking of when a pointer points to an invalid location (0x00000000 or similar) and is dereferenced. I'm not sure if that would be considered impure or not, i.e., causes side effects.

>> That aside, as long as get doesn't modify the
>> value at *p or change what p points to, this is strongly pure
>
> Modification and dereference within a Haskell expression:
>
> import Data.STRef
> import Control.Monad.ST
>
> x = runST $ do
>   x <- newSTRef 0
>   writeSTRef x 1
>   v <- readSTRef x
>   return v
>
> main = print x

I apologize, as I don't know how familiar you are with Haskell, so forgive me if I'm telling you something you already know. That code is 100% pure and side-effect free; no variables are being modified. Haskell only simulates side-effects with monads, and do-notation (what you use in this example), which is syntactic sugar.

>> (i.e., the academic definition of purity).
>
> I wouldn't go that far.

Perhaps that may go too far, as academics love to obfuscate topics with a bunch of extraneous cruft, but the fact remains that purity means:

1. No modification of local or global state (side-effects)
2. No dependence on global mutable state.

August 04, 2013

Re: purity and memory allocations/pointers

Posted by Timon Gehr
in reply to Meta

On 08/04/2013 02:50 AM, Meta wrote:
> On Saturday, 3 August 2013 at 23:04:14 UTC, Timon Gehr wrote:
>> ...
>>
>> Modification and dereference within a Haskell expression:
>>
>> import Data.STRef
>> import Control.Monad.ST
>>
>> x = runST $ do
>>   x <- newSTRef 0
>>   writeSTRef x 1
>>   v <- readSTRef x
>>   return v
>>
>> main = print x
>
> I apologize, as I don't know how familiar you are with Haskell,

With the language, quite intimately.

> so forgive me if I'm telling you something you already know. That code is
> 100% pure and side-effect free;

My point was basically that x is a pure expression of type Integer.

> no variables are being modified.

Which I didn't claim. A reference is dereferenced and the contents of the referenced slot are replaced. Then the reference is dereferenced again to read the modified value out.

> Haskell only simulates side-effects

What tells you that D does not 'simulate' mutable state?

> with monads,

What is done to main behind the scenes in order to execute the program is most definitely side-effecting.

> and do-notation

do notation is not central to my point.

> (what you
> use in this example), which is syntactic sugar.
>
>>> (i.e., the academic definition of purity).
>>
>> I wouldn't go that far.
>
> Perhaps that may go too far, as academics love to obfuscate topics with
> a bunch of extraneous cruft,

I take issue with this statement. I meant, I wouldn't assume 'the academic definition of purity' is a thing.

> but the fact remains that purity means:
>
> 1. No modification of local or global state (side-effects)
> 2. No dependence on global mutable state.

What does this statement quantify over? E.g., where is the abstraction boundary?

Copyright © 1999-2021 by the D Language Foundation