April 10, 2008
"Janice Caron" wrote
> On 09/04/2008, Steven Schveighoffer wrote:
>> So how is new char[] a pure function?
>
> new is special.
>
>>  And why can't I 'wrap' new?  For
>>  instance, if I have a function:
>>
>>  int[] createIncreasingSequence(int s, int e, int step){...}
>>
>>  which creates an array of ints that start at s and end at e, increasing
>> by
>>  step each time, why can't I make that pure?
>
> I reckon you can, but you got the return type wrong.
>
>    invariant(int)[] createIncreasingSequence(int s, int e, int step){...}
>    {
>        auto array = new int[(e-s)/step];
>        foreach(ref a;array)
>        {
>            a = e;
>            e += step;
>        }
>        return assumeUnique!(array);
>    }

This doesn't work for what I'm asking.  I'll add more detail to the example. You have two functions f, and g:

pure int f()
{
    int[] x = new int[5];
    for(int i = 0; i < x.length; i++)
       x[i] = 1 + i;

    /* do stuff */
    return result;
}

pure int g()
{
    int[] y = new int[15];
    for(int i = 0; i < y.length; y++)
        y[i] = 10 + (i * 3);

    /* do stuff */
    return result;
}

Now, I realize I have very similar code that initializes the int arrays, so I want to put that into a function that creates and initializes an array with the given parameters:

int[] createIncreasingSequence(int length, int start, int step)
{
    int[] result = new int[length];
    for(int i = 0; i < result.length; i++)
        result[i] = start + (i * step);
    return result;
}

Now, I think we all agree that createIncreasingSequence is as pure as calling new, right?  That is, it doesn't affect any global state (except for allocating heap data) and will return an equivalent array every time the same arguments are given.  But if I can't declare it as 'pure', then I can't call it from f and g:

pure int f()
{
   int[] x = createIncreasingSequence(5, 1, 1);

   /* do stuff */
   return result;
}

pure int g()
{
   int[] y = createIncreasingSequence(15, 10, 3);

   /* do stuff */
   return result;
}

So should there be a way to define createIncreasingSequence such that I can call it from a pure function?  Or do I have to re-implement it in every pure function I use?

Not having the ability to pass around mutable heap data to and from pure functions is going to limit severely the usefulness of pure functions.

>> Surely there is some possible need for having run-time initialized
>> invariant
>>  variables...
>
> You can do that. Walter confirmed that in response to a previous example I gave.
>
>   import std.date;
>
>   void main()
>   {
>       invariant s = toString(getUTCtime());
>
>       pure string f()
>       {
>           return s;
>       }
>
>       writefln(s);
>   }
>
> Walter says f is pure. Ironically, the program will give a different output each time it's run, but aparently purity only has to be global, not universal.

This is exactly what I want to do, But I want to do it inside a class scope, not in a function scope.  I think someone in another thread already confirmed that this is possible in DMD 2.012, so my question is already answered.

>>  > I think this is a /necessary/ restriction, because otherwise the
>>  > following could not be typechecked:
>>  >
>>  >    class C
>>  >    {
>>  >        invariant int * p;
>>  >
>>  >        this(int * q)
>>  >        {
>>  >            p = q; // Big no no!
>>  >        }
>>  >    }
>>
>>
>> Why would that compile?
>
> Thankfully, it doesn't. But you were suggesting it should be possible to initialize member variables which were declared invariant inside a constructor, so I was demonstrating why this would not be a good idea.

Like I said, this is already possible.

>>  If it did, then so would this:
>>
>>  this(int *q) invariant
>>  {
>>     p = q;
>>  }
>
> Not so. Invariant constructors have different typechecking rules.
>
> Member variable p initially has type __raw(int *), which means it can
> only be assigned with an invariant(int *), not with an int *.

I would guess that this is a requirement for invariant class members as well.

>>  I think invariant members that are run-time initialized can be done, as
>>  during the constructor, nothing has access to the 'this' pointer.
>
> "this" is certainly available during a constructor.

I would guess that you cannot pass 'this' to an external entity before initializing the invariant variables (or before initializing mutable variables in an invariant constructor).  This should be statically verifiable.

-Steve


April 10, 2008
On 10/04/2008, Steven Schveighoffer <schveiguy@yahoo.com> wrote:
>  But if I can't declare it as 'pure', then I can't
>  call it from f and g:

No, you can't.


>  So should there be a way to define createIncreasingSequence such that I can
>  call it from a pure function?

I would say no.


>  Or do I have to re-implement it in every pure
>  function I use?

You can make a pure function to return an immutable array, and then duplicate it within f and g.


>  Not having the ability to pass around mutable heap data to and from pure
>  functions is going to limit severely the usefulness of pure functions.

I don't see why. Functional programming languages such as Haskell seem not to mind. In many functional programming languages, there is no mutable data at all.
April 10, 2008
"Janice Caron" wrote
> On 10/04/2008, Steven Schveighoffer wrote:
>>  Or do I have to re-implement it in every pure
>>  function I use?
>
> You can make a pure function to return an immutable array, and then duplicate it within f and g.

So your solution is, create a duplicate array that is never used again, thereby increasing the runtime (and memory allocation) of the pure function, and lowering the performance of the application.  I can't wait to use functional programming, it sounds awesome :)

>>  Not having the ability to pass around mutable heap data to and from pure
>>  functions is going to limit severely the usefulness of pure functions.
>
> I don't see why. Functional programming languages such as Haskell seem not to mind. In many functional programming languages, there is no mutable data at all.

Of course functional programming languages have mutable data.  They just don't have SHARED data.  We need immutable data in D because D is not a functional language and can share data, so that data needs to be immutable. But data that isn't shared shouldn't need to be immutable.

So let me restate: not having the ability to pass around UNIQUE mutable heap data to and from pure functions is going to limit severely the usefulness of pure functions.

-Steve


April 10, 2008
On 10/04/2008, Steven Schveighoffer <schveiguy@yahoo.com> wrote:
> So your solution is, create a duplicate array that is never used again,

If you need to modify it, then it needs to be mutable, so a heap allocation is unavoidable in any case.

>  thereby increasing the runtime

No, /decreasing/ the runtime, because the construction of the invariant array only has to happen ONCE. (Conceptually, it happens every time you call the function, but that's the beauty of functional programming. The compiler is able to cache things and reuse them).


> Of course functional programming languages have mutable data.

Haskell doesn't.

>  They just
>  don't have SHARED data.

Haskell has no mutable data at all, only constants. In Haskell, you cannot do

    x = x + 1

because x, once assigned, keeps its value until it goes out of scope. Now, I'm not saying that D needs to be that extreme. I'm only saying that to get the benefits of FP, you have to hand over a certain amount of control to the compiler, to the language. You no longer get to choose the order of evaluation. You cannot be certain what will and what won't be caller. You are no longer in /any/ position to estimate execution time on the basis of what will or will not be called, or when, or in what order.

Assuming the compiler is smart enough to make all the optimizations that FP permits it to make, this is a plus, not a minus.


>  We need immutable data in D because

No one is arguing against that position.


>  But data that isn't shared shouldn't need to be immutable.

Or that one.


>  So let me restate: not having the ability to pass around UNIQUE mutable heap
> data to and from pure functions is going to limit severely the usefulness of
>  pure functions.

I disagree - partly, because the compiler cannot statically verify uniqueness, so you would immediately be compromising the ability of FP to strongly typecheck everything, thereby robbing the compiler of the guarantees that it needs to ensure fast and crash-free multiprocessing. But mostly because FP requires a different way of thinking about things, and trying to force it to accomodate non-FP paradigms because they sit well with the way you think about problems just isn't playing the game. You have to trust that the compiler a little more. It can optimise better - you just have to keep your fingers crossed that it actually will!
April 10, 2008
Steven Schveighoffer, el 10 de abril a las 10:42 me escribiste:
> So let me restate: not having the ability to pass around UNIQUE mutable heap data to and from pure functions is going to limit severely the usefulness of pure functions.

This is the exact same problem of the mutable methods of a class used in a stack allocated temporary of a pure function, talked in another thread.

class A {
	int x;
	void f() { x = 1; }
}

pure void g()
{
	scope a = new A;
	a.f();
}

g() is clearly pure (it has no side effects, besides allocating memory in
the stack, which shouldn't be considered a side effect, I guess allocating
in the heap shouldn't be considered a side effect either, since new is
thread safe). Even when A.f() is not pure, nor const.

There is an implementation problem, about how hard is to detect that by the compiler, but in theory there is no reason for g() not being pure.

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
----------------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------------
Parece McGuevara's o CheDonald's
April 10, 2008
Janice Caron wrote:
>>  Not having the ability to pass around mutable heap data to and from pure
>>  functions is going to limit severely the usefulness of pure functions.
> 
> I don't see why. Functional programming languages such as Haskell seem
> not to mind. In many functional programming languages, there is no
> mutable data at all.

That's not quite true, and I think that the distinction is important. Haskell and the rest are constantly doing memory allocation, and that memory allocation has the potential to fail when you run out of memory.  It's simply that they hide it all behind their syntax.

Thus, I would argue that it is perfectly reasonable for a "pure" D function to allocate memory.  The question arises, then, what happens when you run out?  A pure function probably shouldn't be allowed to throw an exception (that would violate the pureness guarantees), so what do you do?  Or do you create a "PureFunctionFailedException" which is the only thing that a pure function might throw???
April 10, 2008
"Janice Caron" wrote
> On 10/04/2008, Steven Schveighoffer wrote:
>> So your solution is, create a duplicate array that is never used again,
>
> If you need to modify it, then it needs to be mutable, so a heap allocation is unavoidable in any case.

Yes, but this is 2x heap allocation.  For no reason.

>
>>  thereby increasing the runtime
>
> No, /decreasing/ the runtime, because the construction of the invariant array only has to happen ONCE. (Conceptually, it happens every time you call the function, but that's the beauty of functional programming. The compiler is able to cache things and reuse them).

Please re-read the example.  If I only ever call createIncreasingSequence with a set of arguments once, then it's memoizing on data that's never used again.  My supposition is that I have several functions that have similar code that I parameterize into a pure function.

Yes, f, and g can be cached, but that doesn't mean createIncreasingSequence needs to be cached.  What I'm trying to do (which is a common programming practice) is take common pieces of functions and use a separate function to do that piece.  My common piece happens to allocate data.

In other words, if I have pure functions:

f(x)
{
   statement1;
   statement2;
}

g(x)
{
   statement1;
   statement2;
}

Then I should be able to do

h(x)
{
    statement1;
    statement2;
}

f(x)
{
   h(x);
}

g(x)
{
   h(x);
}

No matter what statement 1 or 2 is.  If it happens to include memory allocation, and that is allowed in a pure function, then I should be able to wrap that memory allocation in another function.  If the compiler can't optimize on it inside the pure function, then it can't optimize it in the sub function, but who cares at that point?

>
>
>> Of course functional programming languages have mutable data.
>
> Haskell doesn't.
>
>>  They just
>>  don't have SHARED data.
>
> Haskell has no mutable data at all, only constants. In Haskell, you cannot do
>
>    x = x + 1
>
> because x, once assigned, keeps its value until it goes out of scope. Now, I'm not saying that D needs to be that extreme. I'm only saying that to get the benefits of FP, you have to hand over a certain amount of control to the compiler, to the language. You no longer get to choose the order of evaluation. You cannot be certain what will and what won't be caller. You are no longer in /any/ position to estimate execution time on the basis of what will or will not be called, or when, or in what order.

I've been reading a bit about FP, and I see now that you are correct that mutable data isn't allowed.  But let's not forget that D is not a functional language, so a loop like:

for(int i = 0; i < n; i++)
   statement;

Is legal inside a D pure function, whereas the same code would be achieved through tail-recursion in a functional language.  Having mutable return values and arguments may not prevent the compiler from considering the function pure (then again, it may, that is my question).  But certainly, comparing D to Haskell doesn't prove either case.

I don't think the intent of bolting functional programming optimization opportunities on top of D is going to mean that during pure functions we will have to use a fully FP style.

>>  So let me restate: not having the ability to pass around UNIQUE mutable
>> heap
>> data to and from pure functions is going to limit severely the usefulness
>> of
>>  pure functions.
>
> I disagree - partly, because the compiler cannot statically verify uniqueness, so you would immediately be compromising the ability of FP to strongly typecheck everything, thereby robbing the compiler of the guarantees that it needs to ensure fast and crash-free multiprocessing. But mostly because FP requires a different way of thinking about things, and trying to force it to accomodate non-FP paradigms because they sit well with the way you think about problems just isn't playing the game. You have to trust that the compiler a little more. It can optimise better - you just have to keep your fingers crossed that it actually will!

OK, sure, so that means you believe:

char[] c = new char[5];

is invalid inside a pure function because the compiler cannot verify uniqueness of mutable heap data, right?  But the data returned IS unique. Couldn't there be a way to specify that the data is unique, or at least local?

I consider having to dup data just so I can use it to be wasteful, no matter what paradigm I'm using.

-Steve


April 10, 2008
On 10/04/2008, Steven Schveighoffer <schveiguy@yahoo.com> wrote:
> OK, sure, so that means you believe:

Please don't make claims concerning what I may or may not believe. Only I get to make those claims.

For the record, I do /not/ believe that which you claim I believe.

Feel free to rephrase as "assertion X leads to conclusion Y" or some such, but please don't get personal, or I'm outa here.
April 10, 2008
On 10/04/2008, Steven Schveighoffer <schveiguy@yahoo.com> wrote:
>  Then I should be able to do
>
>  h(x)
>  {
>     statement1;
>     statement2;
>  }
>
>  f(x)
>  {
>    h(x);
>  }
>
>  g(x)
>  {
>    h(x);
>  }

What you're asking for is what someone else called "amoral" (slightly pure). You're suggesting that h be pure-/ish/. It doesn't have to be completely pure per se, but it must be pure /enough/ such that when embedded within f or g, f and g remain pure.

I have no idea whether or not D will be able to accomodate this concept. Strictly functional programming doesn't have it. D might or might not. If we do, I suspect it will require another keyword.



>  char[] c = new char[5];
>
> is invalid inside a pure function because the compiler cannot verify
>  uniqueness of mutable heap data, right?

No. As I said, new is special. It's integral to the language.
April 10, 2008
"Janice Caron" wrote
> On 10/04/2008, Steven Schveighoffer wrote:
>> OK, sure, so that means you believe:
>
> Please don't make claims concerning what I may or may not believe. Only I get to make those claims.
>
> For the record, I do /not/ believe that which you claim I believe.
>
> Feel free to rephrase as "assertion X leads to conclusion Y" or some such, but please don't get personal, or I'm outa here.

If you read the post, you would see that it was a question, and assuming that it was true, a further question about why you think that.

"so that means you believe: .... right?"

First, off, this is not an attack.  Second, even if it was an attack, it's not about anything personal.  It's about your beliefs on a language design. Please don't get so defensive, nobody is attacking you personally.  We're all trying to figure this pure function thing out, and I happen to think your statements are inconsistent.  This has nothing to do with what I think of you personally.

-Steve