September 23, 2009
On Tue, 22 Sep 2009 21:21:13 -0400, Jarrett Billingsley <jarrett.billingsley@gmail.com> wrote:

> On Tue, Sep 22, 2009 at 8:48 PM, Steven Schveighoffer
> <schveiguy@yahoo.com> wrote:
>
>> Why have pure functions at all?  Seriously, all pure function reorderings
>> and reuse can be rewritten by human optimization.  If we aren't going to
>> look for places that pure functions can help optimize, why add them to the
>> language, it seems more trouble than its worth?
>>
>> If all it takes to optimize dynamic casts is to put pure on the function
>> signature, have we wasted that much time?
>
> But dynamic downcasting *isn't* pure, unless you can prove that the
> reference that you're downcasting is unique.
>
> class Base {}
> class Derived : Base {}
>
> struct S
> {
> 	Object o;
>
> 	Derived get()
> 	{
> 		return cast(Derived)o;
> 	}
> }
>
> void main()
> {
> 	S s;
> 	s.o = new Base();
> 	writeln(s.get());
> 	s.o = new Derived();
> 	writeln(s.get());
> }
>
> Dynamic downcasts are not pure. Simply. That's why they're *dynamic*.
> Without some kind of uniqueness typing, you cannot prove anything
> about the validity of such casts until runtime.

You cannot prove what they will return until runtime, but that doesn't mean they are not pure.

In your example, you are calling dynamic cast with two different references, so you'll get two different results.  The compiler has to be able to prove that the dynamic cast call is to the same object, which it can't by looking at the code for S.

Imagine a pure function foo, and then a normal function bar:

pure int foo(n) {...}

bar(int n)
{
   int x = foo(n);
}

Can any optimizations be made here?  No, unless you count memoization.  But we've already established that memoization isn't useful for dynamic cast, as it isn't useful for any function that takes a mutable reference.  It's also difficult to say that memoization is worth the effort without profiling.  But if you have the code:

bar(int n)
{
  int x = foo(n);
  x += foo(n);
}

Then you can damn sure optimize out one of those foos.  n didn't change.  Same goes for dynamic cast.

Part of dynamic cast is easy to be proven pure.  If you imagine dynamic cast to be the function:

T dynamicCast(T)(U u)
{
  int offset = getOffset!T(u.classinfo);
  if(offset == CANNOT_CAST)
     return null;
  return cast(T)((cast(void *)u) + offset);
}

The function getOffset returns a constant CANNOT_CAST if the object cannot be cast from a U to a T, and returns the offset to add to the reference to convert the U to a T otherwise.

Then the function getOffset can be pure, since classinfo's are unique and immutable.  That's the expensive part anyways, the rest can be inlined, and you then then have a pure dynamic cast function, because you are calling dynamicCast with the same parameter over and over.

-Steve
September 23, 2009
Jarrett Billingsley wrote:
> Dynamic downcasts are not pure.

Actually they are.  Given an object 'o', 'cast(T)(<reference to o>)'
always returns the same value (either a reference to 'o' or 'null',
depending on class of 'o' and the value of 'T').

If the reference to 'o' is changed to point to a different object, then the dynamic cast can return a different value.  This does not make the dynamic cast impure, since you are passing in a different argument.

If the object 'o' is destroyed and another object is created at the same
physical memory location, 'cast(T)(<reference to o>)' can return a
different value.  This does not make the dynamic cast impure, since you
are passing in a (logically) different argument (which happens to use
the same physical representation).


-- 
Rainer Deyke - rainerd@eldwood.com
September 23, 2009
On Tue, 22 Sep 2009 21:24:10 -0400, Daniel Keep <daniel.keep.lists@gmail.com> wrote:

>
>
> Steven Schveighoffer wrote:
>> On Tue, 22 Sep 2009 05:24:11 -0400, Daniel Keep
>> <daniel.keep.lists@gmail.com> wrote:
>>
>>>
>>>
>>> Jason House wrote:
>>>> Dynamic casts are pure. They don't use global state, and have the
>>>> same output for the same reference as input. Interestingly, dynamic
>>>> cast results are independent of intervening mutable calls... So
>>>> there's even greater opportunity for optimization.
>>>
>>> What if the GC just happens to re-use that address for a different
>>> object?
>>
>> That's only if memoization is used.
>
> Well, if you're not looking at memoization, why is this discussion even
> happening?

There are other pure optimizations that can occur without memoization.  In fact I don't even know why memoization is brought up so much with pure functions, it is an impossible-to-prove optimization.  How do you know memoization helps when you don't know how often the same arguments will be used?

>
>> I think what Jason said is correct -- you can view dynamic cast as
>> taking 2 arguments, one is the reference which is simply echoed as the
>> return value, and one is the classinfo, which is an immutable argument
>> that causes a decision to be made.
>
> The reference *isn't* echoed as a return value.  If it was, then casting
> wouldn't do anything.

Well, either the reference itself, an offset reference, or null is returned.  Sorry I wasn't explicit.

>
>> In fact, you could use memoization on a dynamic cast subfunction that
>> memoizes on the target type and the classinfo of the source, regardless
>> of the reference value, and returns an offset to add to the return value.
>
> Yes, that would work, provided both references are immutable.

The classinfo is immutable, it's always the same.  Why does the data reference have to be immutable?  It's not even dereferenced.  Think of it as an integer.

>
>> It's pretty easy for the compiler to prove that o has not be reassigned,
>> so even without memoization, the compiler can logically assume that the
>> result from the first dynamic cast can be reused.  I think this is the
>> major optimization for pure functions anyways, not memoization.
>
> Pretty easy?  So you're ignoring threads, then?
>
> The ONLY way I can see object casting being treated as a pure function
> is if all references passed in are immutable and even then only if the
> compiler can prove the reference cannot be changed (i.e. the reference
> is always kept on the stack AND you assume you never manually delete it).

How is a thread going to come along and change a local variable in another thread?

>
>> -Steve
>
>   -- Daniel

       ---  Steve
September 23, 2009

Rainer Deyke wrote:
> Jarrett Billingsley wrote:
>> Dynamic downcasts are not pure.
> 
> Actually they are.  Given an object 'o', 'cast(T)(<reference to o>)'
> always returns the same value (either a reference to 'o' or 'null',
> depending on class of 'o' and the value of 'T').
> 
> If the reference to 'o' is changed to point to a different object, then the dynamic cast can return a different value.  This does not make the dynamic cast impure, since you are passing in a different argument.

On the basis of how I understand pure to be implemented in D at the moment, it is impure.  Purity only considers the bits passed on the stack.  If the reference points to the same location in memory, it's considered the same argument.

The only way to do what you expect would be to compare not just the arguments, but all memory pointed to by them.  If you've got cyclic references, you're just plain screwed.

> If the object 'o' is destroyed and another object is created at the same
> physical memory location, 'cast(T)(<reference to o>)' can return a
> different value.  This does not make the dynamic cast impure, since you
> are passing in a (logically) different argument (which happens to use
> the same physical representation).

Logically yes, but in terms of how it's actually implemented, no.
September 23, 2009

Steven Schveighoffer wrote:
> On Tue, 22 Sep 2009 21:24:10 -0400, Daniel Keep <daniel.keep.lists@gmail.com> wrote:
>> Well, if you're not looking at memoization, why is this discussion even happening?
> 
> There are other pure optimizations that can occur without memoization. In fact I don't even know why memoization is brought up so much with pure functions, it is an impossible-to-prove optimization.  How do you know memoization helps when you don't know how often the same arguments will be used?

What other optimisations?  The only other one I can think of off the top of my head is common subexpression elimination.  Except that's effectively caller-side memoisation.

>> The ONLY way I can see object casting being treated as a pure function is if all references passed in are immutable and even then only if the compiler can prove the reference cannot be changed (i.e. the reference is always kept on the stack AND you assume you never manually delete it).
> 
> How is a thread going to come along and change a local variable in another thread?

I don't know if you've noticed, but objects tend to be allocated on the heap, not the stack. :P

There's nothing to stop another thread from killing the object you're pointing to and then substituting in a new one.  It's pathological, yes, but this is the sort of thing that, if it happens, you don't have a hope of debugging.  This is why threads completely and utterly suck and I frequently wish they'd never been invented.

It's also incidentally why I've personally come to regard reference semantics, in general, as a really, REALLY stupid idea.

>>
>>> -Steve
>>
>>   -- Daniel
> 
>        ---  Steve

                            --------- Daniel
September 23, 2009
Daniel Keep wrote:
> On the basis of how I understand pure to be implemented in D at the moment, it is impure.  Purity only considers the bits passed on the stack.  If the reference points to the same location in memory, it's considered the same argument.

As I understand it, D doesn't attempt to do general memoization anyway (and indeed, shouldn't).  Given a reference to an object, if the reference itself is not modified from one dynamic cast to another, then the result of the former cast can be reused for the latter.  The existence of the reference prevents the object from being garbage-collected, and manual deletion results in undefined behavior.



-- 
Rainer Deyke - rainerd@eldwood.com
September 23, 2009
On Tue, 22 Sep 2009 23:25:26 -0400, Daniel Keep <daniel.keep.lists@gmail.com> wrote:

>
>
> Steven Schveighoffer wrote:
>> On Tue, 22 Sep 2009 21:24:10 -0400, Daniel Keep
>> <daniel.keep.lists@gmail.com> wrote:
>>> Well, if you're not looking at memoization, why is this discussion even
>>> happening?
>>
>> There are other pure optimizations that can occur without memoization.
>> In fact I don't even know why memoization is brought up so much with
>> pure functions, it is an impossible-to-prove optimization.  How do you
>> know memoization helps when you don't know how often the same arguments
>> will be used?
>
> What other optimisations?  The only other one I can think of off the top
> of my head is common subexpression elimination.  Except that's
> effectively caller-side memoisation.

Yes, those optimizations :)  They are easy to do and very different than parameter memoization because you only optimize where you know it makes a difference (and you don't need to use the heap to do it).

>
>>> The ONLY way I can see object casting being treated as a pure function
>>> is if all references passed in are immutable and even then only if the
>>> compiler can prove the reference cannot be changed (i.e. the reference
>>> is always kept on the stack AND you assume you never manually delete it).
>>
>> How is a thread going to come along and change a local variable in
>> another thread?
>
> I don't know if you've noticed, but objects tend to be allocated on the
> heap, not the stack. :P

The pointer *value* (the actual reference, which is stored on the stack) is not being changed.  That's all dynamic cast cares about.  The referenced data is irrelevant (except for the classinfo, whose access can be factored out of the function, and which isn't possible to change without resorting to undefined behavior).

> There's nothing to stop another thread from killing the object you're
> pointing to and then substituting in a new one.  It's pathological, yes,
> but this is the sort of thing that, if it happens, you don't have a hope
> of debugging.  This is why threads completely and utterly suck and I
> frequently wish they'd never been invented.

It's not only pathological, but it's undefined.  We don't care about undefined behavior.  A thread can come through and trounce on any memory at any time, this is also not a case we need to concern ourselves with.

-Steve
September 23, 2009
Daniel Keep wrote:
> 
> Steven Schveighoffer wrote:
>> On Tue, 22 Sep 2009 21:24:10 -0400, Daniel Keep
>> <daniel.keep.lists@gmail.com> wrote:
>>> Well, if you're not looking at memoization, why is this discussion even
>>> happening?
>> There are other pure optimizations that can occur without memoization. In fact I don't even know why memoization is brought up so much with
>> pure functions, it is an impossible-to-prove optimization.  How do you
>> know memoization helps when you don't know how often the same arguments
>> will be used?
> 
> What other optimisations?  The only other one I can think of off the top
> of my head is common subexpression elimination.  Except that's
> effectively caller-side memoisation.
> 
>>> The ONLY way I can see object casting being treated as a pure function
>>> is if all references passed in are immutable and even then only if the
>>> compiler can prove the reference cannot be changed (i.e. the reference
>>> is always kept on the stack AND you assume you never manually delete it).
>> How is a thread going to come along and change a local variable in
>> another thread?
> 
> I don't know if you've noticed, but objects tend to be allocated on the
> heap, not the stack. :P
> 
> There's nothing to stop another thread from killing the object you're
> pointing to and then substituting in a new one.  It's pathological, yes,
> but this is the sort of thing that, if it happens, you don't have a hope
> of debugging.  This is why threads completely and utterly suck and I
> frequently wish they'd never been invented.
> 
> It's also incidentally why I've personally come to regard reference
> semantics, in general, as a really, REALLY stupid idea.

I myself couldn't live without threads or references, but I agree that mixing the two can lead to interesting scenarios to debug. Just try and code a responsive GUI without using threads, the application will "hang" everytime you're not going back to the main loop every few milliseconds. Or doing network I/O without bringing your program to a crawl, of course you could use async I/O but where is that I/O gonna run without a background thread to support it :)

Besides, the upcoming shared qualifier is there to address those issues.

You might want to look at this article, I came across it a few months ago and found it rather insightful, it just could change your view on references:
http://blogs.msdn.com/ericlippert/archive/2009/02/17/references-are-not-addresses.aspx
September 23, 2009

Rainer Deyke wrote:
> Daniel Keep wrote:
>> On the basis of how I understand pure to be implemented in D at the moment, it is impure.  Purity only considers the bits passed on the stack.  If the reference points to the same location in memory, it's considered the same argument.
> 
> As I understand it, D doesn't attempt to do general memoization anyway (and indeed, shouldn't).  Given a reference to an object, if the reference itself is not modified from one dynamic cast to another, then the result of the former cast can be reused for the latter.

Which is memoisation, more or less.

> The
> existence of the reference prevents the object from being
> garbage-collected, and manual deletion results in undefined behavior.

http://digitalmars.com/d/2.0/expression.html#DeleteExpression

Doesn't actually specify whether this case is kosher or not.  I suppose that makes it undefined behaviour, although possibly not in quite the same way.  :)
September 23, 2009
Daniel Keep wrote:
> Rainer Deyke wrote:
>> As I understand it, D doesn't attempt to do general memoization anyway (and indeed, shouldn't).  Given a reference to an object, if the reference itself is not modified from one dynamic cast to another, then the result of the former cast can be reused for the latter.
> 
> Which is memoisation, more or less.

Memoization that can safely is safe for dynamic casts.

> http://digitalmars.com/d/2.0/expression.html#DeleteExpression
> 
> Doesn't actually specify whether this case is kosher or not.  I suppose that makes it undefined behaviour, although possibly not in quite the same way.  :)

So you're deleting one object, leaving a dangling reference.  You're creating a new object that by pure coincidence happens to be stored at the same memory address.  Then you're dereferencing the original dangling reference, which by pure coincidence now points to the new object.  How can this possibly be defined behavior?  You're still dereferencing a dangling reference.


-- 
Rainer Deyke - rainerd@eldwood.com