pure or not pure? - D Programming Language Discussion Forum

Forums

New users
- Learn
Community
- General
- Announce
Improvements
- DIP Ideas
- DIP Devel.
Ecosystem
- GDC
- LDC
- Debuggers
- IDEs
- DWT
Development
- Internals
- Issues
- Beta
- DMD
- Phobos
- Druntime
- Study
Turkish
- Genel
- Duyuru

Index » General » pure or not pure?

Thread overview

pure or not pure?
Apr 09, 2008 Steven Schveighoffer
Apr 09, 2008 Janice Caron
Apr 09, 2008 Steven Schveighoffer
Apr 09, 2008 Georg Wrede
Apr 10, 2008 Steven Schveighoffer
Apr 10, 2008 Janice Caron
Apr 10, 2008 Georg Wrede
Apr 09, 2008 Janice Caron
Apr 10, 2008 Steven Schveighoffer
Apr 10, 2008 Janice Caron
Apr 10, 2008 Steven Schveighoffer
Apr 10, 2008 Janice Caron
Apr 10, 2008 Steven Schveighoffer
Apr 10, 2008 Janice Caron
Apr 10, 2008 Steven Schveighoffer
Apr 10, 2008 Janice Caron
Apr 10, 2008 Steven Schveighoffer
Apr 10, 2008 Janice Caron
Apr 10, 2008 Steven Schveighoffer
Apr 10, 2008 Janice Caron
Apr 10, 2008 Koroskin Denis
Apr 10, 2008 Janice Caron
Apr 11, 2008 Koroskin Denis
Apr 14, 2008 Janice Caron
Apr 10, 2008 Janice Caron
Apr 10, 2008 Georg Wrede
Apr 10, 2008 Steven Schveighoffer
Apr 10, 2008 Georg Wrede
Apr 10, 2008 Georg Wrede
Apr 10, 2008 Leandro Lucarella
Apr 10, 2008 Georg Wrede
Apr 10, 2008 Russell Lewis
Apr 10, 2008 Georg Wrede
Apr 11, 2008 Koroskin Denis
Apr 11, 2008 Steven Schveighoffer
Apr 09, 2008 guslay
Apr 09, 2008 Georg Wrede
Apr 09, 2008 Steven Schveighoffer

April 09, 2008

pure or not pure?

Posted by Steven Schveighoffer

Steven Schveighoffer

I realize that I don't fully understand all the rules of what D pure functions will be.

I think all agree on these rules:

- pure functions can only call pure functions
- allow access to invariant data
- allow mutable access to simple stack variables (variables that are all on
the stack, no references)

In Andrei's accu-functional document, instead of the last rule, there is:

- allow local automatic mutable state

I'm not sure what this means :)  But here are some questions about what is pure and what is not pure:

1. Can pure functions use mutable heap data?  If so, what are the restrictions for this?

i.e.:
pure char[] f(int x, int y)
{
   char[] c = new char[y];
   c[] = (char)x;
}

pure char[] g(int x)
{
   char[] c = f(x, 5);
}

2. Can pure functions use stack references to objects that are partially invariant, partially mutable?

i.e.:
class C
{
   invariant int x;
   int y;
}

pure int f(C c) {return c.x} // allowed?

This of course, assumes that C can assign x through a constructor, not just through a static initializer (this is not implemented today).

as an example of a pure function that takes a pointer to mutable data, but doesn't use the data, i.e. this doesn't ever read or write heap data:

pure char *add(char * c, int n) { return c + n;}

3. If the answer to question 1 is 'yes', can you pass mutable heap data to pure functions?

pure char[] substr(char[] src, int beg, int end) { return src[beg..end];}

Would substr be allowed to be called from a pure function?
Would substr be allowed to be called from a non-pure function?
Will the compiler tag heap data somehow during execution of a pure function
to determine whether it is unique or not?

I'm just trying to get a handle on what is and isn't considered pure, as everyone seems to have a different opinion...

-Steve

April 09, 2008

Re: pure or not pure?

Posted by Janice Caron
in reply to Steven Schveighoffer

Janice Caron

Posted in reply to Steven Schveighoffer

On 09/04/2008, Steven Schveighoffer <schveiguy@yahoo.com> wrote:
> I realize that I don't fully understand all the rules of what D pure
>  functions will be.
>
>  I think all agree on these rules:
>
>  - pure functions can only call pure functions
>  - allow access to invariant data
>  - allow mutable access to simple stack variables (variables that are all on
>  the stack, no references)
>
>  In Andrei's accu-functional document, instead of the last rule, there is:
>
>  - allow local automatic mutable state

I thought that was pretty clear. It means local variables are allowed to be mutable. (I realise that "automatic" isn't a synonym for "local", but in the examples in accu-functional, Andrei does seem to use it that way).

>  But here are some questions about what is
>  pure and what is not pure:
>
>  1. Can pure functions use mutable heap data?

I don't see why not.

>  If so, what are the
>  restrictions for this?
>
>  i.e.:
>  pure char[] f(int x, int y)

I'm confident enough to say the return value will have to be invariant. However, you that doesn't stop you from returning heap data. You should be able to do

    pure string spaces(int x)
    {
        auto s = new char[x];
        s[] = ' ';
        return assumeUnique!(s);
    }

but you can't omit the assumeUnique!().

>  2. Can pure functions use stack references to objects that are partially
>  invariant, partially mutable?

My guess would be no. There's just no need for it.

>  i.e.:

By the way, that should be e.g., not i.e.

>  This of course, assumes that C can assign x through a constructor, not just
>  through a static initializer (this is not implemented today).

I don't think it will be possible to initialize invariant data in a non-invariant constructor, and I think an invariant constructor can only make a fully invariant class.

I think this is a /necessary/ restriction, because otherwise the following could not be typechecked:

    class C
    {
        invariant int * p;

        this(int * q)
        {
            p = q; // Big no no!
        }
    }

>  as an example of a pure function that takes a pointer to mutable data, but
>  doesn't use the data, i.e. this doesn't ever read or write heap data:
>
>  pure char *add(char * c, int n) { return c + n;}

Oooh - now that's an interesting one. That one's got me foxed. My instinct says it should be disallowed, but I can't quite put my finger on why. I'm gonna have to think about that one some more.

>  3. If the answer to question 1 is 'yes', can you pass mutable heap data to
>  pure functions?
>
>  pure char[] substr(char[] src, int beg, int end) { return src[beg..end];}

Same answer.

>  Would substr be allowed to be called from a pure function?

If and only if substr() is itself a pure function.

>  Would substr be allowed to be called from a non-pure function?

Yes

>  Will the compiler tag heap data somehow during execution of a pure function
>  to determine whether it is unique or not?

My guess would be no. (But that's just a guess). I reckon you'll have
to worry about that yourself and use the assumeUnique!() template to
return the data.

April 09, 2008

Re: pure or not pure?

Posted by Georg Wrede
in reply to Steven Schveighoffer

Georg Wrede

Posted in reply to Steven Schveighoffer

Steven Schveighoffer wrote:
> I realize that I don't fully understand all the rules of what D pure functions will be.
> 
> I think all agree on these rules:
> 
> - pure functions can only call pure functions
> - allow access to invariant data
> - allow mutable access to simple stack variables (variables that are all on the stack, no references)
> 
> In Andrei's accu-functional document, instead of the last rule, there is:
> 
> - allow local automatic mutable state

Still not pretending to know too much, I'd say that if the A/W goal is "pure functions as far as MP optimizations is needed" only, then:

A function might need to have local state. Take a function that counts something in a list. It'd need a local counter -- unless we want to force the programmer to write it as a recursive function. (Which I wholeheartedly hope not. I'd be happy with D being multiparadigm.)

Such a variable would be the "local automatic mutable state". Local, of course. Automatic, as destructed at end of scope. Mutable and state, of course. :-)

> I'm not sure what this means :)  But here are some questions about what is pure and what is not pure:
> 
> 1. Can pure functions use mutable heap data?  If so, what are the restrictions for this?
> 
> i.e.:
> pure char[] f(int x, int y)
> {
>    char[] c = new char[y];
>    c[] = (char)x;

return c; // I presume.

> }
> 
> pure char[] g(int x)
> {
>    char[] c = f(x, 5);
> }

f is pure. g is pure itself, and only uses f. And a new c is created each time you access the function, so it's free of other references to it (that could change it).

For example, function

    pure char[] concatenate(char[] a, char[] b)
    {
        // return new string consisting of a ~ b
    }

The return value is of course a reference to a new string. In a purely functional language returning a reference to a heap object might not be ok (I'm no pro on that.), but as D otherwise pretends strings to behave somewhat value like, I assume this simply has to be ok.

Of course, a and b would have to be immutable.

It's really a shame that A/W don't elaborate on these things. Now we are only getting increasingly unsure about stuff, and at the same time the combined brain-cell-hours we invest here might simply be a wild goose chase. And after all, most of this is just what they eventually *decide*. We could use some help and guidelines, or even opinions on these things.

And we could even be of help to them, since right now there seem to be a lot of very smart, motivated and able people here. Maybe actually more than ever before.

April 09, 2008

Re: pure or not pure?

Posted by Steven Schveighoffer
in reply to Janice Caron

Steven Schveighoffer

Posted in reply to Janice Caron

"Janice Caron" wrote
> On 09/04/2008, Steven Schveighoffer wrote:
>> I realize that I don't fully understand all the rules of what D pure
>>  functions will be.
>>
>>  I think all agree on these rules:
>>
>>  - pure functions can only call pure functions
>>  - allow access to invariant data
>>  - allow mutable access to simple stack variables (variables that are all
>> on
>>  the stack, no references)
>>
>>  In Andrei's accu-functional document, instead of the last rule, there
>> is:
>>
>>  - allow local automatic mutable state
>
> I thought that was pretty clear. It means local variables are allowed to be mutable. (I realise that "automatic" isn't a synonym for "local", but in the examples in accu-functional, Andrei does seem to use it that way).

So 'local automatic' means 'local local'?

>>  But here are some questions about what is
>>  pure and what is not pure:
>>
>>  1. Can pure functions use mutable heap data?
>
> I don't see why not.
>
>
>>  If so, what are the
>>  restrictions for this?
>>
>>  i.e.:
>>  pure char[] f(int x, int y)
>
> I'm confident enough to say the return value will have to be invariant. However, you that doesn't stop you from returning heap data.  You should be able to do
>
>    pure string spaces(int x)
>    {
>        auto s = new char[x];
>        s[] = ' ';
>        return assumeUnique!(s);
>    }
>
> but you can't omit the assumeUnique!().

So how is new char[] a pure function?  And why can't I 'wrap' new?  For instance, if I have a function:

int[] createIncreasingSequence(int s, int e, int step){...}

which creates an array of ints that start at s and end at e, increasing by step each time, why can't I make that pure?  i.e. I want to specify my own way to initialize an array.  If I do that in 10 different pure functions, you're saying I have to re-implement that code in each one?

>
>
>
>>  2. Can pure functions use stack references to objects that are partially
>>  invariant, partially mutable?
>
> My guess would be no. There's just no need for it.
>

Surely there is some possible need for having run-time initialized invariant variables...

>>  This of course, assumes that C can assign x through a constructor, not
>> just
>>  through a static initializer (this is not implemented today).
>
> I don't think it will be possible to initialize invariant data in a non-invariant constructor, and I think an invariant constructor can only make a fully invariant class.
>
> I think this is a /necessary/ restriction, because otherwise the following could not be typechecked:
>
>    class C
>    {
>        invariant int * p;
>
>        this(int * q)
>        {
>            p = q; // Big no no!
>        }
>    }

Why would that compile?  If it did, then so would this:

this(int *q) invariant
{
    p = q;
}

I think invariant members that are run-time initialized can be done, as during the constructor, nothing has access to the 'this' pointer.

>>  3. If the answer to question 1 is 'yes', can you pass mutable heap data
>> to
>>  pure functions?
>>
>>  pure char[] substr(char[] src, int beg, int end) { return
>> src[beg..end];}
>
> Same answer.
>
>
>
>>  Would substr be allowed to be called from a pure function?
>
> If and only if substr() is itself a pure function.
>
>>  Would substr be allowed to be called from a non-pure function?
>
> Yes

This was misleading on my part.  I should have asked instead of these two questions, if substr is allowed to be pure, can it be called from a non-pure function?  In the context of being called from a pure function, src is unique, because pure has to have unique data.  But in the context of calling from a non-pure function, it might not be...

-Steve

April 09, 2008

Re: pure or not pure?

Posted by Steven Schveighoffer
in reply to Georg Wrede

Steven Schveighoffer

Posted in reply to Georg Wrede

"Georg Wrede" wrote
> Steven Schveighoffer wrote:
>> pure char[] f(int x, int y)
>> {
>>    char[] c = new char[y];
>>    c[] = (char)x;
>
> return c; // I presume.
>
>> }

Yes, I forgot that statement, thanks :)  For some reason, Outlook Express didn't complain that this was an error :P

> It's really a shame that A/W don't elaborate on these things. Now we are only getting increasingly unsure about stuff, and at the same time the combined brain-cell-hours we invest here might simply be a wild goose chase. And after all, most of this is just what they eventually *decide*. We could use some help and guidelines, or even opinions on these things.
>
> And we could even be of help to them, since right now there seem to be a lot of very smart, motivated and able people here. Maybe actually more than ever before.

This is exactly why I asked these questions :)

-Steve

April 09, 2008

Re: pure or not pure?

Posted by Georg Wrede
in reply to Steven Schveighoffer

Georg Wrede

Posted in reply to Steven Schveighoffer

Steven Schveighoffer wrote:
> "Janice Caron" wrote
> 
> So how is new char[] a pure function?  And why can't I 'wrap' new?  For instance, if I have a function:
> 
> int[] createIncreasingSequence(int s, int e, int step){...}
> 
> which creates an array of ints that start at s and end at e, increasing by step each time, why can't I make that pure?  i.e. I want to specify my own way to initialize an array.  If I do that in 10 different pure functions, you're saying I have to re-implement that code in each one?

The return value must be immutable if you're going to use it several times.

And if you want to cache the return values outside of the function (as in only returning a new array when new parameters are used, and otherwise just a reference to a previously returned one), then I guess I don't have an opinion. :-?

> This was misleading on my part.  I should have asked instead of these two questions, if substr is allowed to be pure, can it be called from a non-pure function?  In the context of being called from a pure function, src is unique, because pure has to have unique data.  But in the context of calling from a non-pure function, it might not be...

Ultimately, any function is called from main(). Which is a non-pure function.

April 09, 2008

Re: pure or not pure?

Posted by guslay
in reply to Janice Caron

guslay

Posted in reply to Janice Caron

Janice Caron Wrote:

> On 09/04/2008, Steven Schveighoffer <schveiguy@yahoo.com> wrote:

> 
> >  as an example of a pure function that takes a pointer to mutable data, but
> >  doesn't use the data, i.e. this doesn't ever read or write heap data:
> >
> >  pure char *add(char * c, int n) { return c + n;}
> 
> Oooh - now that's an interesting one. That one's got me foxed. My instinct says it should be disallowed, but I can't quite put my finger on why. I'm gonna have to think about that one some more.
> 

As long as you stick to pointer arithmetic and don't dereference c, it is pure.

Will it be allowed, that's pure speculation ;)

April 09, 2008

Re: pure or not pure?

Posted by Janice Caron
in reply to Steven Schveighoffer

Janice Caron

Posted in reply to Steven Schveighoffer

On 09/04/2008, Steven Schveighoffer <schveiguy@yahoo.com> wrote:
> So how is new char[] a pure function?

new is special.

>  And why can't I 'wrap' new?  For
>  instance, if I have a function:
>
>  int[] createIncreasingSequence(int s, int e, int step){...}
>
>  which creates an array of ints that start at s and end at e, increasing by
>  step each time, why can't I make that pure?

I reckon you can, but you got the return type wrong.

    invariant(int)[] createIncreasingSequence(int s, int e, int step){...}
    {
        auto array = new int[(e-s)/step];
        foreach(ref a;array)
        {
            a = e;
            e += step;
        }
        return assumeUnique!(array);
    }

> i.e. I want to specify my own
>  way to initialize an array.  If I do that in 10 different pure functions,
>  you're saying I have to re-implement that code in each one?

To be honest, I don't understand the question.

The rule is : same input => same output

What happens in the middle, so long as it is side-effect free, doesn't matter.

> Surely there is some possible need for having run-time initialized invariant
>  variables...

You can do that. Walter confirmed that in response to a previous example I gave.

   import std.date;

   void main()
   {
       invariant s = toString(getUTCtime());

       pure string f()
       {
           return s;
       }

       writefln(s);
   }

Walter says f is pure. Ironically, the program will give a different output each time it's run, but aparently purity only has to be global, not universal.

>  > I think this is a /necessary/ restriction, because otherwise the
>  > following could not be typechecked:
>  >
>  >    class C
>  >    {
>  >        invariant int * p;
>  >
>  >        this(int * q)
>  >        {
>  >            p = q; // Big no no!
>  >        }
>  >    }
>
>
> Why would that compile?

Thankfully, it doesn't. But you were suggesting it should be possible to initialize member variables which were declared invariant inside a constructor, so I was demonstrating why this would not be a good idea.

>  If it did, then so would this:
>
>  this(int *q) invariant
>  {
>     p = q;
>  }

Not so. Invariant constructors have different typechecking rules.

Member variable p initially has type __raw(int *), which means it can
only be assigned with an invariant(int *), not with an int *.

>  I think invariant members that are run-time initialized can be done, as
>  during the constructor, nothing has access to the 'this' pointer.

"this" is certainly available during a constructor.

>  This was misleading on my part.  I should have asked instead of these two
>  questions, if substr is allowed to be pure,

which is a big "if", and I'm still hedging my bets on the answer being no. But let's hypothetically assume it's yes for sake of argument.

> can it be called from a non-pure
> function?

There is nothing to stop you calling a pure function from a non-pure function. So that would be a yes.

> In the context of being called from a pure function, src is unique, because pure has to have unique data.

No it doesn't. It has to have invariant data, not unique data. The whole point about invariant data is it's OK to share it.

April 10, 2008

Re: pure or not pure?

Posted by Steven Schveighoffer
in reply to Georg Wrede

Steven Schveighoffer

Posted in reply to Georg Wrede

"Georg Wrede" wrote
> Steven Schveighoffer wrote:
>> "Janice Caron" wrote
>>
>> So how is new char[] a pure function?  And why can't I 'wrap' new?  For instance, if I have a function:
>>
>> int[] createIncreasingSequence(int s, int e, int step){...}
>>
>> which creates an array of ints that start at s and end at e, increasing by step each time, why can't I make that pure?  i.e. I want to specify my own way to initialize an array.  If I do that in 10 different pure functions, you're saying I have to re-implement that code in each one?
>
> The return value must be immutable if you're going to use it several times.

Agreed.  But I don't want to use it several times, I want to use it once :)

I want it to act like new does.  For example, if I have

char[] c = new char[5];

in two different pure functions, new better not return the same exact memory space for both!  It should be called every time.  But I can't do this with a custom function because pure functions can only call other pure functions. Basically, I'm asking if there will be a way to move a common piece of a function that allocates and initializes data out of a pure function and into another.

It would be cool if the compiler did not perform special optimizations (such as memoization) on pure functions that return mutable heap data, but still enforced the rules of purity.

> And if you want to cache the return values outside of the function (as in only returning a new array when new parameters are used, and otherwise just a reference to a previously returned one), then I guess I don't have an opinion. :-?
>
>> This was misleading on my part.  I should have asked instead of these two questions, if substr is allowed to be pure, can it be called from a non-pure function?  In the context of being called from a pure function, src is unique, because pure has to have unique data.  But in the context of calling from a non-pure function, it might not be...
>
> Ultimately, any function is called from main(). Which is a non-pure function.

This is a special case of pure function.  Let's say I have a pure function:

pure invariant(char)[] f()
{
    char[] c = new char[15];
    g(c);
    return assumeUnique!(c);
}

Now, if g is a pure function that takes a char[] argument and accesses the data referenced by the argument, it could be OK to call it from another pure function because we could assume that the pure function can only use heap data that is 'local' to the function.  But it might be invalid to call such a pure function from a non-pure function because data held by a non-pure function is not always considered 'local'.  On the other hand, if a pure function can hold pointers to heap data that is shared, but just can't use it, then a pure function that takes mutable heap data might not be able to use it, so a function like g() isn't possible.

If pure functions only allow invariant heap parameters and invariant heap returns, then this whole discussion becomes moot, but if they accept non-invariant heap parameters, or allow non-invariant heap returns, I want to know what the rules for those are.  It seems to me that only allowing invariant heap data is a severe limitation that will leave D functional programming crippled in the face of an actual FP language.

-Steve

April 10, 2008

Re: pure or not pure?

Posted by Janice Caron
in reply to Steven Schveighoffer

Janice Caron

Posted in reply to Steven Schveighoffer

On 10/04/2008, Steven Schveighoffer <schveiguy@yahoo.com> wrote:
>  I want it to act like new does.  For example, if I have
>
>  char[] c = new char[5];
>
>  in two different pure functions, new better not return the same exact memory
>  space for both!

You're overcomplicating this. The rule is

    same input => same output

And I would put good money on the notion that "same" means "compares as equal with the == operator".

(And that almost certainly means that mutable char[]s won't be allowed as parameters to pure functions - because strings and char[]s with the same content compare as equal)

Top | Forum index | About this forum

Copyright © 1999-2021 by the D Language Foundation