View mode: basic / threaded / horizontal-split · Log in · Help
April 09, 2008
pure or not pure?
I realize that I don't fully understand all the rules of what D pure 
functions will be.

I think all agree on these rules:

- pure functions can only call pure functions
- allow access to invariant data
- allow mutable access to simple stack variables (variables that are all on 
the stack, no references)

In Andrei's accu-functional document, instead of the last rule, there is:

- allow local automatic mutable state

I'm not sure what this means :)  But here are some questions about what is 
pure and what is not pure:

1. Can pure functions use mutable heap data?  If so, what are the 
restrictions for this?

i.e.:
pure char[] f(int x, int y)
{
  char[] c = new char[y];
  c[] = (char)x;
}

pure char[] g(int x)
{
  char[] c = f(x, 5);
}

2. Can pure functions use stack references to objects that are partially 
invariant, partially mutable?

i.e.:
class C
{
  invariant int x;
  int y;
}

pure int f(C c) {return c.x} // allowed?

This of course, assumes that C can assign x through a constructor, not just 
through a static initializer (this is not implemented today).

as an example of a pure function that takes a pointer to mutable data, but 
doesn't use the data, i.e. this doesn't ever read or write heap data:

pure char *add(char * c, int n) { return c + n;}

3. If the answer to question 1 is 'yes', can you pass mutable heap data to 
pure functions?

pure char[] substr(char[] src, int beg, int end) { return src[beg..end];}

Would substr be allowed to be called from a pure function?
Would substr be allowed to be called from a non-pure function?
Will the compiler tag heap data somehow during execution of a pure function 
to determine whether it is unique or not?

I'm just trying to get a handle on what is and isn't considered pure, as 
everyone seems to have a different opinion...

-Steve
April 09, 2008
Re: pure or not pure?
On 09/04/2008, Steven Schveighoffer <schveiguy@yahoo.com> wrote:
> I realize that I don't fully understand all the rules of what D pure
>  functions will be.
>
>  I think all agree on these rules:
>
>  - pure functions can only call pure functions
>  - allow access to invariant data
>  - allow mutable access to simple stack variables (variables that are all on
>  the stack, no references)
>
>  In Andrei's accu-functional document, instead of the last rule, there is:
>
>  - allow local automatic mutable state

I thought that was pretty clear. It means local variables are allowed
to be mutable. (I realise that "automatic" isn't a synonym for
"local", but in the examples in accu-functional, Andrei does seem to
use it that way).



>  But here are some questions about what is
>  pure and what is not pure:
>
>  1. Can pure functions use mutable heap data?

I don't see why not.


>  If so, what are the
>  restrictions for this?
>
>  i.e.:
>  pure char[] f(int x, int y)

I'm confident enough to say the return value will have to be
invariant. However, you that doesn't stop you from returning heap
data. You should be able to do

   pure string spaces(int x)
   {
       auto s = new char[x];
       s[] = ' ';
       return assumeUnique!(s);
   }

but you can't omit the assumeUnique!().



>  2. Can pure functions use stack references to objects that are partially
>  invariant, partially mutable?

My guess would be no. There's just no need for it.


>  i.e.:

By the way, that should be e.g., not i.e.


>  This of course, assumes that C can assign x through a constructor, not just
>  through a static initializer (this is not implemented today).

I don't think it will be possible to initialize invariant data in a
non-invariant constructor, and I think an invariant constructor can
only make a fully invariant class.

I think this is a /necessary/ restriction, because otherwise the
following could not be typechecked:

   class C
   {
       invariant int * p;

       this(int * q)
       {
           p = q; // Big no no!
       }
   }




>  as an example of a pure function that takes a pointer to mutable data, but
>  doesn't use the data, i.e. this doesn't ever read or write heap data:
>
>  pure char *add(char * c, int n) { return c + n;}

Oooh - now that's an interesting one. That one's got me foxed. My
instinct says it should be disallowed, but I can't quite put my finger
on why. I'm gonna have to think about that one some more.


>  3. If the answer to question 1 is 'yes', can you pass mutable heap data to
>  pure functions?
>
>  pure char[] substr(char[] src, int beg, int end) { return src[beg..end];}

Same answer.



>  Would substr be allowed to be called from a pure function?

If and only if substr() is itself a pure function.

>  Would substr be allowed to be called from a non-pure function?

Yes


>  Will the compiler tag heap data somehow during execution of a pure function
>  to determine whether it is unique or not?

My guess would be no. (But that's just a guess). I reckon you'll have
to worry about that yourself and use the assumeUnique!() template to
return the data.
April 09, 2008
Re: pure or not pure?
Steven Schveighoffer wrote:
> I realize that I don't fully understand all the rules of what D pure 
> functions will be.
> 
> I think all agree on these rules:
> 
> - pure functions can only call pure functions
> - allow access to invariant data
> - allow mutable access to simple stack variables (variables that are all on 
> the stack, no references)
> 
> In Andrei's accu-functional document, instead of the last rule, there is:
> 
> - allow local automatic mutable state

Still not pretending to know too much, I'd say that if the A/W goal is 
"pure functions as far as MP optimizations is needed" only, then:

A function might need to have local state. Take a function that counts 
something in a list. It'd need a local counter -- unless we want to 
force the programmer to write it as a recursive function. (Which I 
wholeheartedly hope not. I'd be happy with D being multiparadigm.)

Such a variable would be the "local automatic mutable state". Local, of 
course. Automatic, as destructed at end of scope. Mutable and state, of 
course. :-)

> I'm not sure what this means :)  But here are some questions about what is 
> pure and what is not pure:
> 
> 1. Can pure functions use mutable heap data?  If so, what are the 
> restrictions for this?
> 
> i.e.:
> pure char[] f(int x, int y)
> {
>    char[] c = new char[y];
>    c[] = (char)x;

return c; // I presume.

> }
> 
> pure char[] g(int x)
> {
>    char[] c = f(x, 5);
> }

f is pure. g is pure itself, and only uses f. And a new c is created 
each time you access the function, so it's free of other references to 
it (that could change it).

For example, function

    pure char[] concatenate(char[] a, char[] b)
    {
        // return new string consisting of a ~ b
    }

The return value is of course a reference to a new string. In a purely 
functional language returning a reference to a heap object might not be 
ok (I'm no pro on that.), but as D otherwise pretends strings to behave 
somewhat value like, I assume this simply has to be ok.

Of course, a and b would have to be immutable.

It's really a shame that A/W don't elaborate on these things. Now we are 
only getting increasingly unsure about stuff, and at the same time the 
combined brain-cell-hours we invest here might simply be a wild goose 
chase. And after all, most of this is just what they eventually 
*decide*. We could use some help and guidelines, or even opinions on 
these things.

And we could even be of help to them, since right now there seem to be a 
lot of very smart, motivated and able people here. Maybe actually more 
than ever before.
April 09, 2008
Re: pure or not pure?
"Janice Caron" wrote
> On 09/04/2008, Steven Schveighoffer wrote:
>> I realize that I don't fully understand all the rules of what D pure
>>  functions will be.
>>
>>  I think all agree on these rules:
>>
>>  - pure functions can only call pure functions
>>  - allow access to invariant data
>>  - allow mutable access to simple stack variables (variables that are all 
>> on
>>  the stack, no references)
>>
>>  In Andrei's accu-functional document, instead of the last rule, there 
>> is:
>>
>>  - allow local automatic mutable state
>
> I thought that was pretty clear. It means local variables are allowed
> to be mutable. (I realise that "automatic" isn't a synonym for
> "local", but in the examples in accu-functional, Andrei does seem to
> use it that way).

So 'local automatic' means 'local local'?

>>  But here are some questions about what is
>>  pure and what is not pure:
>>
>>  1. Can pure functions use mutable heap data?
>
> I don't see why not.
>
>
>>  If so, what are the
>>  restrictions for this?
>>
>>  i.e.:
>>  pure char[] f(int x, int y)
>
> I'm confident enough to say the return value will have to be
> invariant. However, you that doesn't stop you from returning heap
> data.  You should be able to do
>
>    pure string spaces(int x)
>    {
>        auto s = new char[x];
>        s[] = ' ';
>        return assumeUnique!(s);
>    }
>
> but you can't omit the assumeUnique!().

So how is new char[] a pure function?  And why can't I 'wrap' new?  For 
instance, if I have a function:

int[] createIncreasingSequence(int s, int e, int step){...}

which creates an array of ints that start at s and end at e, increasing by 
step each time, why can't I make that pure?  i.e. I want to specify my own 
way to initialize an array.  If I do that in 10 different pure functions, 
you're saying I have to re-implement that code in each one?

>
>
>
>>  2. Can pure functions use stack references to objects that are partially
>>  invariant, partially mutable?
>
> My guess would be no. There's just no need for it.
>

Surely there is some possible need for having run-time initialized invariant 
variables...

>>  This of course, assumes that C can assign x through a constructor, not 
>> just
>>  through a static initializer (this is not implemented today).
>
> I don't think it will be possible to initialize invariant data in a
> non-invariant constructor, and I think an invariant constructor can
> only make a fully invariant class.
>
> I think this is a /necessary/ restriction, because otherwise the
> following could not be typechecked:
>
>    class C
>    {
>        invariant int * p;
>
>        this(int * q)
>        {
>            p = q; // Big no no!
>        }
>    }

Why would that compile?  If it did, then so would this:

this(int *q) invariant
{
   p = q;
}

I think invariant members that are run-time initialized can be done, as 
during the constructor, nothing has access to the 'this' pointer.

>>  3. If the answer to question 1 is 'yes', can you pass mutable heap data 
>> to
>>  pure functions?
>>
>>  pure char[] substr(char[] src, int beg, int end) { return 
>> src[beg..end];}
>
> Same answer.
>
>
>
>>  Would substr be allowed to be called from a pure function?
>
> If and only if substr() is itself a pure function.
>
>>  Would substr be allowed to be called from a non-pure function?
>
> Yes

This was misleading on my part.  I should have asked instead of these two 
questions, if substr is allowed to be pure, can it be called from a non-pure 
function?  In the context of being called from a pure function, src is 
unique, because pure has to have unique data.  But in the context of calling 
from a non-pure function, it might not be...

-Steve
April 09, 2008
Re: pure or not pure?
"Georg Wrede" wrote
> Steven Schveighoffer wrote:
>> pure char[] f(int x, int y)
>> {
>>    char[] c = new char[y];
>>    c[] = (char)x;
>
> return c; // I presume.
>
>> }

Yes, I forgot that statement, thanks :)  For some reason, Outlook Express 
didn't complain that this was an error :P

> It's really a shame that A/W don't elaborate on these things. Now we are 
> only getting increasingly unsure about stuff, and at the same time the 
> combined brain-cell-hours we invest here might simply be a wild goose 
> chase. And after all, most of this is just what they eventually *decide*. 
> We could use some help and guidelines, or even opinions on these things.
>
> And we could even be of help to them, since right now there seem to be a 
> lot of very smart, motivated and able people here. Maybe actually more 
> than ever before.

This is exactly why I asked these questions :)

-Steve
April 09, 2008
Re: pure or not pure?
Steven Schveighoffer wrote:
> "Janice Caron" wrote
> 
> So how is new char[] a pure function?  And why can't I 'wrap' new?  For 
> instance, if I have a function:
> 
> int[] createIncreasingSequence(int s, int e, int step){...}
> 
> which creates an array of ints that start at s and end at e, increasing by 
> step each time, why can't I make that pure?  i.e. I want to specify my own 
> way to initialize an array.  If I do that in 10 different pure functions, 
> you're saying I have to re-implement that code in each one?

The return value must be immutable if you're going to use it several times.

And if you want to cache the return values outside of the function (as 
in only returning a new array when new parameters are used, and 
otherwise just a reference to a previously returned one), then I guess I 
don't have an opinion. :-?

> This was misleading on my part.  I should have asked instead of these two 
> questions, if substr is allowed to be pure, can it be called from a non-pure 
> function?  In the context of being called from a pure function, src is 
> unique, because pure has to have unique data.  But in the context of calling 
> from a non-pure function, it might not be...

Ultimately, any function is called from main(). Which is a non-pure 
function.
April 09, 2008
Re: pure or not pure?
Janice Caron Wrote:

> On 09/04/2008, Steven Schveighoffer <schveiguy@yahoo.com> wrote:

> 
> >  as an example of a pure function that takes a pointer to mutable data, but
> >  doesn't use the data, i.e. this doesn't ever read or write heap data:
> >
> >  pure char *add(char * c, int n) { return c + n;}
> 
> Oooh - now that's an interesting one. That one's got me foxed. My
> instinct says it should be disallowed, but I can't quite put my finger
> on why. I'm gonna have to think about that one some more.
> 

As long as you stick to pointer arithmetic and don't dereference c, it is pure.

Will it be allowed, that's pure speculation ;)
April 09, 2008
Re: pure or not pure?
On 09/04/2008, Steven Schveighoffer <schveiguy@yahoo.com> wrote:
> So how is new char[] a pure function?

new is special.

>  And why can't I 'wrap' new?  For
>  instance, if I have a function:
>
>  int[] createIncreasingSequence(int s, int e, int step){...}
>
>  which creates an array of ints that start at s and end at e, increasing by
>  step each time, why can't I make that pure?

I reckon you can, but you got the return type wrong.

   invariant(int)[] createIncreasingSequence(int s, int e, int step){...}
   {
       auto array = new int[(e-s)/step];
       foreach(ref a;array)
       {
           a = e;
           e += step;
       }
       return assumeUnique!(array);
   }


> i.e. I want to specify my own
>  way to initialize an array.  If I do that in 10 different pure functions,
>  you're saying I have to re-implement that code in each one?

To be honest, I don't understand the question.

The rule is : same input => same output

What happens in the middle, so long as it is side-effect free, doesn't matter.


> Surely there is some possible need for having run-time initialized invariant
>  variables...

You can do that. Walter confirmed that in response to a previous example I gave.

  import std.date;

  void main()
  {
      invariant s = toString(getUTCtime());

      pure string f()
      {
          return s;
      }

      writefln(s);
  }

Walter says f is pure. Ironically, the program will give a different
output each time it's run, but aparently purity only has to be global,
not universal.



>  > I think this is a /necessary/ restriction, because otherwise the
>  > following could not be typechecked:
>  >
>  >    class C
>  >    {
>  >        invariant int * p;
>  >
>  >        this(int * q)
>  >        {
>  >            p = q; // Big no no!
>  >        }
>  >    }
>
>
> Why would that compile?

Thankfully, it doesn't. But you were suggesting it should be possible
to initialize member variables which were declared invariant inside a
constructor, so I was demonstrating why this would not be a good idea.


>  If it did, then so would this:
>
>  this(int *q) invariant
>  {
>     p = q;
>  }

Not so. Invariant constructors have different typechecking rules.

Member variable p initially has type __raw(int *), which means it can
only be assigned with an invariant(int *), not with an int *.



>  I think invariant members that are run-time initialized can be done, as
>  during the constructor, nothing has access to the 'this' pointer.

"this" is certainly available during a constructor.


>  This was misleading on my part.  I should have asked instead of these two
>  questions, if substr is allowed to be pure,

which is a big "if", and I'm still hedging my bets on the answer being
no. But let's hypothetically assume it's yes for sake of argument.


> can it be called from a non-pure
> function?

There is nothing to stop you calling a pure function from a non-pure
function. So that would be a yes.


> In the context of being called from a pure function, src is
> unique, because pure has to have unique data.

No it doesn't. It has to have invariant data, not unique data. The
whole point about invariant data is it's OK to share it.
April 10, 2008
Re: pure or not pure?
"Georg Wrede" wrote
> Steven Schveighoffer wrote:
>> "Janice Caron" wrote
>>
>> So how is new char[] a pure function?  And why can't I 'wrap' new?  For 
>> instance, if I have a function:
>>
>> int[] createIncreasingSequence(int s, int e, int step){...}
>>
>> which creates an array of ints that start at s and end at e, increasing 
>> by step each time, why can't I make that pure?  i.e. I want to specify my 
>> own way to initialize an array.  If I do that in 10 different pure 
>> functions, you're saying I have to re-implement that code in each one?
>
> The return value must be immutable if you're going to use it several 
> times.

Agreed.  But I don't want to use it several times, I want to use it once :)

I want it to act like new does.  For example, if I have

char[] c = new char[5];

in two different pure functions, new better not return the same exact memory 
space for both!  It should be called every time.  But I can't do this with a 
custom function because pure functions can only call other pure functions. 
Basically, I'm asking if there will be a way to move a common piece of a 
function that allocates and initializes data out of a pure function and into 
another.

It would be cool if the compiler did not perform special optimizations (such 
as memoization) on pure functions that return mutable heap data, but still 
enforced the rules of purity.

> And if you want to cache the return values outside of the function (as in 
> only returning a new array when new parameters are used, and otherwise 
> just a reference to a previously returned one), then I guess I don't have 
> an opinion. :-?
>
>> This was misleading on my part.  I should have asked instead of these two 
>> questions, if substr is allowed to be pure, can it be called from a 
>> non-pure function?  In the context of being called from a pure function, 
>> src is unique, because pure has to have unique data.  But in the context 
>> of calling from a non-pure function, it might not be...
>
> Ultimately, any function is called from main(). Which is a non-pure 
> function.

This is a special case of pure function.  Let's say I have a pure function:

pure invariant(char)[] f()
{
   char[] c = new char[15];
   g(c);
   return assumeUnique!(c);
}

Now, if g is a pure function that takes a char[] argument and accesses the 
data referenced by the argument, it could be OK to call it from another pure 
function because we could assume that the pure function can only use heap 
data that is 'local' to the function.  But it might be invalid to call such 
a pure function from a non-pure function because data held by a non-pure 
function is not always considered 'local'.  On the other hand, if a pure 
function can hold pointers to heap data that is shared, but just can't use 
it, then a pure function that takes mutable heap data might not be able to 
use it, so a function like g() isn't possible.

If pure functions only allow invariant heap parameters and invariant heap 
returns, then this whole discussion becomes moot, but if they accept 
non-invariant heap parameters, or allow non-invariant heap returns, I want 
to know what the rules for those are.  It seems to me that only allowing 
invariant heap data is a severe limitation that will leave D functional 
programming crippled in the face of an actual FP language.

-Steve
April 10, 2008
Re: pure or not pure?
On 10/04/2008, Steven Schveighoffer <schveiguy@yahoo.com> wrote:
>  I want it to act like new does.  For example, if I have
>
>  char[] c = new char[5];
>
>  in two different pure functions, new better not return the same exact memory
>  space for both!

You're overcomplicating this. The rule is

   same input => same output

And I would put good money on the notion that "same" means "compares
as equal with the == operator".

(And that almost certainly means that mutable char[]s won't be allowed
as parameters to pure functions - because strings and char[]s with the
same content compare as equal)
« First   ‹ Prev
1 2 3 4
Top | Discussion index | About this forum | D home