September 22, 2009
Jarrett Billingsley:

> Or you could - I dunno - cache the result of the dynamic cast in a local, since doing multiple dynamic casts is terrible style anyway. Just saying. ;)

Yes, most of the optimizations done by the compiler can instead be done by the programmer (but some stupid code can be generated by macros, mixins, etc).

Bye,
bearophile
September 22, 2009
On Tue, 22 Sep 2009 14:03:28 -0400, Jarrett Billingsley <jarrett.billingsley@gmail.com> wrote:

>
> Or you could - I dunno - cache the result of the dynamic cast in a
> local, since doing multiple dynamic casts is terrible style anyway.
>
> Just saying. ;)

Yes, this is true of many optimizations :P

Are you saying it's bad style because of the expense of dynamic casting or for some other reason?  If dynamic cast were pure, that takes away that argument.

-Steve
September 22, 2009
bearophile wrote:
> Jarrett Billingsley:
> 
>> Or you could - I dunno - cache the result of the dynamic cast in a
>> local, since doing multiple dynamic casts is terrible style anyway.
>> Just saying. ;)
> 
> Yes, most of the optimizations done by the compiler can instead be done by the programmer (but some stupid code can be generated by macros, mixins, etc).
> 
> Bye,
> bearophile

But the compiler has no means to determine that the two calls to _d_dynamic_cast will return the same pointer.
September 22, 2009
On Tue, Sep 22, 2009 at 3:35 PM, Steven Schveighoffer <schveiguy@yahoo.com> wrote:
> On Tue, 22 Sep 2009 14:03:28 -0400, Jarrett Billingsley <jarrett.billingsley@gmail.com> wrote:
>
>>
>> Or you could - I dunno - cache the result of the dynamic cast in a local, since doing multiple dynamic casts is terrible style anyway.
>>
>> Just saying. ;)
>
> Yes, this is true of many optimizations :P
>
> Are you saying it's bad style because of the expense of dynamic casting or for some other reason?  If dynamic cast were pure, that takes away that argument.

Dynamic downcasts do usually indicate a weakness in the design. But an even more fundamental matter of style is to not repeat yourself. You don't use "x + y" if you need the sum of x and y in ten places; you do "sum = x + y" and then use "sum." The same applies here. You're not just working around a deficiency in the compiler, you're saving yourself work later if you need to change all those values, and you're giving it some kind of semantic attachment by putting it in a named location.

Besides, you're probably going to be doing:

if(auto d = cast(Derived)baseRef)
{
    // ..
}

so you've already bound the result of the dynamic cast to a variable. *shrug*
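
To make the caching point concrete, here is a minimal sketch. The class names, the `value` field, and the two helper functions are made up for illustration and are not from the thread. Both helpers compute the same thing, but the second performs the dynamic cast once and reuses the named result.

```d
import std.stdio;

class Base {}
class Derived : Base
{
    int value = 42; // hypothetical field, only for the example
}

// Repeats the dynamic cast at every use (the style being criticized).
int repeated(Base baseRef)
{
    if (cast(Derived) baseRef !is null)
        return (cast(Derived) baseRef).value + (cast(Derived) baseRef).value;
    return -1;
}

// Casts once; the declaration in the condition doubles as the type test,
// because the cast yields null when baseRef is not a Derived.
int cached(Base baseRef)
{
    if (auto d = cast(Derived) baseRef)
        return d.value + d.value;
    return -1;
}

void main()
{
    writeln(repeated(new Derived), " ", cached(new Derived));
    writeln(repeated(new Base), " ", cached(new Base));
}
```

The cached form also leaves a single place to edit if the target type ever changes.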
September 22, 2009
On Tue, 22 Sep 2009 16:00:29 -0400, Jarrett Billingsley <jarrett.billingsley@gmail.com> wrote:

> On Tue, Sep 22, 2009 at 3:35 PM, Steven Schveighoffer
> <schveiguy@yahoo.com> wrote:
>> On Tue, 22 Sep 2009 14:03:28 -0400, Jarrett Billingsley
>> <jarrett.billingsley@gmail.com> wrote:
>>
>>>
>>> Or you could - I dunno - cache the result of the dynamic cast in a
>>> local, since doing multiple dynamic casts is terrible style anyway.
>>>
>>> Just saying. ;)
>>
>> Yes, this is true of many optimizations :P
>>
>> Are you saying it's bad style because of the expense of dynamic casting or
>> for some other reason?  If dynamic cast were pure, that takes away that
>> argument.
>
> Dynamic downcasts do usually indicate a weakness in the design. But an
> even more fundamental matter of style is to not repeat yourself. You
> don't use "x + y" if you need the sum of x and y in ten places; you do
> "sum = x + y" and then use "sum." The same applies here. You're not
> just working around a deficiency in the compiler, you're saving
> yourself work later if you need to change all those values, and you're
> giving it some kind of semantic attachment by putting it in a named
> location.

What if x and y are possibly changing in between calls to x + y?

I'm thinking of a situation where o *might* be changing, but also might not. The compiler could do some optimization to avoid the dynamic-cast calls in the cases where it doesn't change.  These would be hard to code manually.  My understanding of pure function benefits (and it's not that great) is that you can express yourself how you want and the compiler fixes your code to be more optimized knowing it can avoid calls.

> Besides, you're probably going to be doing:
>
> if(auto d = cast(Derived)baseRef)
> {
>     // ..
> }
>
> so you've already bound the result of the dynamic cast to a variable. *shrug*

Probably.  But even then, if this statement is in a loop that might not change baseRef, the compiler can avoid dynamic casting again.

-Steve
September 22, 2009
On Tue, Sep 22, 2009 at 4:13 PM, Steven Schveighoffer <schveiguy@yahoo.com> wrote:
>> Dynamic downcasts do usually indicate a weakness in the design. But an even more fundamental matter of style is to not repeat yourself. You don't use "x + y" if you need the sum of x and y in ten places; you do "sum = x + y" and then use "sum." The same applies here. You're not just working around a deficiency in the compiler, you're saving yourself work later if you need to change all those values, and you're giving it some kind of semantic attachment by putting it in a named location.
>
> What if x and y are possibly changing in between calls to x + y?
>
> I'm thinking of a situation where o *might* be changing, but also might not. The compiler could do some optimization to avoid the dynamic-cast calls in the cases where it doesn't change.  These would be hard to code manually.  My understanding of pure function benefits (and it's not that great) is that you can express yourself how you want and the compiler fixes your code to be more optimized knowing it can avoid calls.

Realistically, how often is this going to come up? Why the hell are we looking at what amounts to CSE on a rarely-used construct when there are far more important performance issues? I understand wanting to solve this problem for pedagogical reasons, but practically, I don't see the benefit.

>> Besides, you're probably going to be doing:
>>
>> if(auto d = cast(Derived)baseRef)
>> {
>>    // ..
>> }
>>
>> so you've already bound the result of the dynamic cast to a variable. *shrug*
>
> Probably.  But even then, if this statement is in a loop that might not change baseRef, the compiler can avoid dynamic casting again.

Then you'd hoist the conditional out of the loop. Again, not specific to dynamic downcasts.
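
A rough sketch of that hoisting, assuming the loop body never reassigns the reference. The `weight` field and the `total` function are invented for the example.

```d
import std.stdio;

class Base {}
class Derived : Base
{
    int weight = 3; // hypothetical field, only for the example
}

// Sums weight n times. baseRef is not reassigned inside the loop,
// so the dynamic cast is performed once, before the loop starts.
int total(Base baseRef, int n)
{
    int sum = 0;
    if (auto d = cast(Derived) baseRef)
    {
        foreach (i; 0 .. n)
            sum += d.weight;
    }
    return sum;
}

void main()
{
    writeln(total(new Derived, 4));
    writeln(total(new Base, 4));
}
```

If the loop could reassign baseRef, this rewrite would no longer be valid, which is the case Steve is asking about.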
September 22, 2009
Jeremie Pelletier wrote:
> You can clearly see all there is to a dynamic cast is simply adjusting the pointer to the object's virtual table. So a compiler knowing the source and destination offset of the vtable can easily inline the code for such a cast.

That may be correct, but you're describing it in a way that confuses me, probably in part because it assumes a fair bit of knowledge on the part of the listener.

Objects are laid out like this:

class vtbl pointer
object.Object fields
superclass fields
superclass interface1 vtbl pointer
superclass interface2 vtbl pointer
...
class fields
class interface1 vtbl pointer
class interface2 vtbl pointer
...

(The relative order of fields and interface vtbl pointers doesn't matter.)

You need a special vtbl for interfaces because a virtual function call works like:
branch (object.vtbl + offset)

The offset must be known at compile time. And it has to be the same offset for all possible objects implementing this interface. One implemented interface might require an entirely different vtbl layout than another. The solution is to use an entirely different vtbl.
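
One way to observe the separate interface vtbl pointer is to compare addresses, as in the sketch below. It assumes the layout described above, as implemented by DMD. `Named`, `Widget`, and the two helper functions are hypothetical names.

```d
import std.stdio;

interface Named
{
    string name();
}

class Widget : Named
{
    int id;
    string name() { return "widget"; }
}

// True when the interface reference points somewhere other than the start
// of the object, i.e. at the interface's own vtbl pointer slot.
bool interfaceRefIsOffset(Widget w)
{
    Named n = w; // implicit conversion adjusts the pointer
    return cast(void*) w !is cast(void*) n;
}

// True when casting the interface reference back recovers the same object.
bool roundTrips(Widget w)
{
    Named n = w;
    Object back = cast(Object) n; // dynamic cast undoes the adjustment
    return back is w;
}

void main()
{
    auto w = new Widget;
    writeln(interfaceRefIsOffset(w));
    writeln(roundTrips(w));
}
```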
September 23, 2009
On Tue, 22 Sep 2009 17:56:46 -0400, Jarrett Billingsley <jarrett.billingsley@gmail.com> wrote:

> On Tue, Sep 22, 2009 at 4:13 PM, Steven Schveighoffer
> <schveiguy@yahoo.com> wrote:
>>> Dynamic downcasts do usually indicate a weakness in the design. But an
>>> even more fundamental matter of style is to not repeat yourself. You
>>> don't use "x + y" if you need the sum of x and y in ten places; you do
>>> "sum = x + y" and then use "sum." The same applies here. You're not
>>> just working around a deficiency in the compiler, you're saving
>>> yourself work later if you need to change all those values, and you're
>>> giving it some kind of semantic attachment by putting it in a named
>>> location.
>>
>> What if x and y are possibly changing in between calls to x + y?
>>
>> I'm thinking of a situation where o *might* be changing, but also might not.
>> The compiler could do some optimization to avoid the dynamic-cast calls in
>> the cases where it doesn't change.  These would be hard to code manually.
>>  My understanding of pure function benefits (and it's not that great) is
>> that you can express yourself how you want and the compiler fixes your code
>> to be more optimized knowing it can avoid calls.
>
> Realistically, how often is this going to come up? Why the hell are we
> looking at what amounts to CSE on a rarely-used construct when there
> are far more important performance issues? I understand wanting to
> solve this problem for pedagogical reasons, but practically, I don't
> see the benefit.

Why have pure functions at all?  Seriously, all pure function reorderings and reuse can be rewritten by human optimization.  If we aren't going to look for places that pure functions can help optimize, why add them to the language?  It seems more trouble than it's worth.

If all it takes to optimize dynamic casts is to put pure on the function signature, have we wasted that much time?

-Steve
September 23, 2009
On Tue, Sep 22, 2009 at 8:48 PM, Steven Schveighoffer <schveiguy@yahoo.com> wrote:

> Why have pure functions at all?  Seriously, all pure function reorderings and reuse can be rewritten by human optimization.  If we aren't going to look for places that pure functions can help optimize, why add them to the language?  It seems more trouble than it's worth.
>
> If all it takes to optimize dynamic casts is to put pure on the function signature, have we wasted that much time?

But dynamic downcasting *isn't* pure, unless you can prove that the reference that you're downcasting is unique.

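import std.stdio; // needed for writeln

class Base {}
class Derived : Base {}

struct S
{
	Object o;

	Derived get()
	{
		// The result depends on the runtime type of o, not on its static type.
		return cast(Derived)o;
	}
}

void main()
{
	S s;
	s.o = new Base();
	writeln(s.get()); // cast yields null: a Base is not a Derived
	s.o = new Derived();
	writeln(s.get()); // cast succeeds: o now refers to a Derived
}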

Dynamic downcasts are not pure. Simply. That's why they're *dynamic*. Without some kind of uniqueness typing, you cannot prove anything about the validity of such casts until runtime.
September 23, 2009

Steven Schveighoffer wrote:
> On Tue, 22 Sep 2009 05:24:11 -0400, Daniel Keep <daniel.keep.lists@gmail.com> wrote:
> 
>>
>>
>> Jason House wrote:
>>> Dynamic casts are pure. They don't use global state, and have the same output for the same reference as input. Interestingly, dynamic cast results are independent of intervening mutable calls... So there's even greater opportunity for optimization.
>>
>> What if the GC just happens to re-use that address for a different object?
> 
> That's only if memoization is used.

Well, if you're not looking at memoization, why is this discussion even happening?

> I think what Jason said is correct -- you can view dynamic cast as taking 2 arguments, one is the reference which is simply echoed as the return value, and one is the classinfo, which is an immutable argument that causes a decision to be made.

The reference *isn't* echoed as a return value.  If it was, then casting wouldn't do anything.

> In fact, you could use memoization on a dynamic cast subfunction that memoizes on the target type and the classinfo of the source, regardless of the reference value, and returns an offset to add to the return value.

Yes, that would work, provided both references are immutable.

> It's pretty easy for the compiler to prove that o has not been reassigned, so even without memoization, the compiler can logically assume that the result from the first dynamic cast can be reused.  I think this is the major optimization for pure functions anyways, not memoization.

Pretty easy?  So you're ignoring threads, then?

The ONLY way I can see object casting being treated as a pure function is if all references passed in are immutable and even then only if the compiler can prove the reference cannot be changed (i.e. the reference is always kept on the stack AND you assume you never manually delete it).

> -Steve

  -- Daniel