September 26, 2009
language_fan wrote:
> Sat, 26 Sep 2009 12:25:23 -0400, Jeremie Pelletier thusly wrote:
> 
>> Justin Johansson wrote:
>>> language_fan Wrote:
>>>
>>>> Sat, 26 Sep 2009 09:32:55 -0400, Justin Johansson thusly wrote:
>>>>
>>>>> I've had a good poke around the forums and couldn't find anything on
>>>>> this so ...
>>>>>
>>>>> What's the recommended method for dispatching code off the runtime
>>>>> type of a variant variable (Phobos D2 std.variant)?
>>>>>
>>>>> Does one use a bunch of
>>>>>
>>>>> if ( var.peek!(type1)) { ... }
>>>>> else if ( var.peek!(type2)  { ... }
>>>>>
>>>>> for all N possible types, or is there a better & faster way with a
>>>>> switch or jump table of sorts?
>>>> If the type count gets large, how fast it is depends on the backend
>>>> optimizations of the compiler. In the worst case it is a O(n) time
>>>> linear search. A jump table or almost any other way of dispatching
>>>> would be faster. If the variant had an integral tag field, it could be
>>>> used in a switch; that way the compiler could easily optimize it
>>>> further with the currently available constructs.
>>>>
>>>> This problem is solved in higher level languages by providing pattern
>>>> matching constructs. The compiler is free to optimize the code the way
>>>> it likes:
>>>>
>>>>   case var of
>>>>     type1 => ...
>>>>     type2 => ...
>>>>     ...
>>>>
>>>> But since no C-like language has ever implemented pattern matching, it
>>>> might be too radical to add it to D.
>>> Thanks both for replies.
>>>
>>> I've got about 2 dozen types in the variant so the O(n) really hurts.
>>> The variant thing seemed like a really cool idea at the time but now
>>> ... Without something like suggested above or a computed goto on typeid
>>> or Andrei's visitator, it almost pushes me to backout from using
>>> variants and having to redesign around some common base class or
>>> interface and using virtual function dispatch. :-(
>>>
>>>
>> I see this sort of design in C all the time with event handling,
>> although its with unions rather than discriminated unions, the same
>> logic applies.
>>
>> enum EventType {
>> 	Mouse,
>> 	Key,
>> 	Move,
>> 	...
>> }
>> struct Event {
>> 	EventType type;
>> 	union {
>> 		MouseEvent mouse;
>> 		KeyEvent key;
>> 		MoveEvent move;
>> 		...
>> 	}
>> }
>>
>> void dispatchEvent(const(Event)* event) {
>> 	...
>> 	with(EventType) final switch(event.type) { case Mouse: ...
>> 	case Key: ...
>> 	case Move: ...
>> 	...
>> 	}
>> 	...
>> }
>>
>> That's the logic with standard unions, you should be able to get
>> something similar with variants. It's more code to setup, but you end up
>> with a simple jump table from the switch, and final in D makes it easy
>> to change EventType without forgetting to update dispatchEvent.
> 
> Exactly, this is what I mentioned previously. Isn't it ugly compared to
> 
>   type Event = Mouse | Key | Move;

This can be confusing, for example the first thing that comes to mind for me is that Event is the bitwise OR result of 3 constants, not an enumerated type.

Besides, how is it any different than:

enum { Mouse, Key, Move };

>   void dispatchEvent(Event event) {
>     match(event) {
>       Mouse m => m.squeek();
>       Key k => ...
>       ...
>     }
>   }

match(event) is no different than switch(event), except that pattern matching often implies runtime semantics and is often slower than a straight jump table generated from a switch.

> What's with all the language proposals here? Why hasn't anyone proposed this before? This is in fact very helpful - that's why several modern languages have adopted it.
September 26, 2009
On Sat, Sep 26, 2009 at 4:18 PM, Jeremie Pelletier <jeremiep@gmail.com> wrote:

>>  type Event = Mouse | Key | Move;
>
> This can be confusing, for example the first thing that comes to mind for me is that Event is the bitwise OR result of 3 constants, not an enumerated type.
>
> Besides, how is it any different than:
>
> enum { Mouse, Key, Move };

It's not an enumerated constant. Mouse, Key, and Move are all types, and Event is a discriminated union of the three. See more below.

> match(event) is no different than switch(event), except that pattern
> matching often implies runtime semantics and is often slower than a straight
> jump table generated from a switch.

Not in this case. See, when you would do "type Event = Mouse | Key | Move", it's actually more like doing:

struct Event
{
    enum Type { Mouse, Key, Move }
    Type type;
    union
    {
        Mouse m;
        Key k;
        Move v;
    }
}

Then, when you do "match(e) { Mouse m => ... }" it's actually being turned into:

switch(e.type)
{
    case Event.Type.Mouse: alias e.m m; ...
    case Event.Type.Key: alias e.k k; ...
}

Basically discriminated unions get rid of this annoying boilerplate that you have to use every time you want a tagged union.
September 26, 2009
Jarrett Billingsley wrote:
> On Sat, Sep 26, 2009 at 4:18 PM, Jeremie Pelletier <jeremiep@gmail.com> wrote:
> 
>>>  type Event = Mouse | Key | Move;
>> This can be confusing, for example the first thing that comes to mind for me
>> is that Event is the bitwise OR result of 3 constants, not an enumerated
>> type.
>>
>> Besides, how is it any different than:
>>
>> enum { Mouse, Key, Move };
> 
> It's not an enumerated constant. Mouse, Key, and Move are all types,
> and Event is a discriminated union of the three. See more below.
> 
>> match(event) is no different than switch(event), except that pattern
>> matching often implies runtime semantics and is often slower than a straight
>> jump table generated from a switch.
> 
> Not in this case. See, when you would do "type Event = Mouse | Key |
> Move", it's actually more like doing:
> 
> struct Event
> {
>     enum Type { Mouse, Key, Move }
>     Type type;
>     union
>     {
>         Mouse m;
>         Key k;
>         Move v;
>     }
> }
> 
> Then, when you do "match(e) { Mouse m => ... }" it's actually being turned into:
> 
> switch(e.type)
> {
>     case Event.Type.Mouse: alias e.m m; ...
>     case Event.Type.Key: alias e.k k; ...
> }
> 
> Basically discriminated unions get rid of this annoying boilerplate
> that you have to use every time you want a tagged union.

Oh, that makes sense, but I don't see why you need language support for that, a variant type should be able to get most of it using type tuples, maybe just add support to switch on type tuples along with an opSwitch() method:

alias Algebraic!(Mouse, Key, Move) Event;

final switch(Event) {
case Mouse:
case Key:
case Move:
}

Jeremie
September 26, 2009
On Sat, Sep 26, 2009 at 4:50 PM, Jeremie Pelletier <jeremiep@gmail.com> wrote:

> Oh, that makes sense, but I don't see why you need language support for that, a variant type should be able to get most of it using type tuples, maybe just add support to switch on type tuples along with an opSwitch() method:

I'm pretty sure D2's std.typecons has something that almost does just that. I don't know how it handles dispatching though.
September 26, 2009
Sat, 26 Sep 2009 16:50:36 -0400, Jeremie Pelletier thusly wrote:

> Jarrett Billingsley wrote:
>> On Sat, Sep 26, 2009 at 4:18 PM, Jeremie Pelletier <jeremiep@gmail.com> wrote:
>> 
>>>>  type Event = Mouse | Key | Move;
>>> This can be confusing, for example the first thing that comes to mind for me is that Event is the bitwise OR result of 3 constants, not an enumerated type.
>>>
>>> Besides, how is it any different than:
>>>
>>> enum { Mouse, Key, Move };
>> 
>> It's not an enumerated constant. Mouse, Key, and Move are all types, and Event is a discriminated union of the three. See more below.
>> 
>>> match(event) is no different than switch(event), except that pattern
>>> matching often implies runtime semantics and is often slower than a
>>> straight jump table generated from a switch.
>> 
>> Not in this case. See, when you would do "type Event = Mouse | Key | Move", it's actually more like doing:
>> 
>> struct Event
>> {
>>     enum Type { Mouse, Key, Move }
>>     Type type;
>>     union
>>     {
>>         Mouse m;
>>         Key k;
>>         Move v;
>>     }
>> }
>> 
>> Then, when you do "match(e) { Mouse m => ... }" it's actually being
>> turned into:
>> 
>> switch(e.type)
>> {
>>     case Event.Type.Mouse: alias e.m m; ... case Event.Type.Key: alias
>>     e.k k; ...
>> }
>> 
>> Basically discriminated unions get rid of this annoying boilerplate that you have to use every time you want a tagged union.
> 
> Oh, that makes sense, but I don't see why you need language support for that, a variant type should be able to get most of it using type tuples, maybe just add support to switch on type tuples along with an opSwitch() method:
> 
> alias Algebraic!(Mouse, Key, Move) Event;
> 
> final switch(Event) {
> case Mouse:
> case Key:
> case Move:
> }

It does not need language support, but is considered fundamental in some languages, thus offered as built-in construct. Yep, my code was a bit hastily written. Of course I meant the enumerated entities also support subtype relation in an OOP language. For an example, see case classes in Scala. Even in Java enums are more powerful than in D - the only advantage D has is the conversion rules with primitive types.
September 26, 2009
Jarrett Billingsley Wrote:

> On Sat, Sep 26, 2009 at 10:36 AM, Justin Johansson <procode@adam-dott-com.au> wrote:
> 
> > I've got about 2 dozen types in the variant so the O(n) really hurts.
> > The variant thing seemed like a really cool idea at the time but now ...
> > Without something like suggested above or a computed goto on typeid or Andrei's visitator,
> > it almost pushes me to backout from using variants and having to redesign around some common base class or interface and using virtual function dispatch. :-(
> 
> Ouch. 2 dozen types? No offense, but that sounds like "you're doing it wrong." Revising your strategy is probably for the best.
> 
> What are you doing, anyway?


Sorry I can't go too much into the confidential details of this particular project.  However if you remember when I joined this group, what is it, 3 weeks ago I gave a short bio of myself by way of introduction.

So knowing my background in electrical engineering, maths & physics and being a C++ hacker of 20 or so years, you might well guess that my specialist domain is in industrial control and scientific data analysis.  You might say its about real-time, high-speed gaming (discrete and analog sensor inputs instead of joy sticks) but without any GUI :-)

There is currently a desire to fit a scripting layer over a highly
tuned core application (written in C/C++) that works with
strongly-typed, mostly numeric value-based data.  Typing in this
sense largely means range-checked. So, picking a simple example,
an integer in the range [0..999] is a different data type to another
in the range say [-4000..4000].  Most of the datatypes (size-wise)
fit conveniently into a union of (short, int, float, double), but,
it's the value space of the data that determines its type.

The primitive numeric types are typedef'ed in C++ code. So you might have, eg, typedef int16 sensor1_t; representing the range checked value [0..999] from sensor type 1.  All in all there's about 18 different primitive data types. (btw. this explains the reason for asking the "is typedef an alien in D" question on this forum a few days ago).

More complex datatypes in this system are composed as tuples of the "primitive" datatypes in the system-wide data dictionary.  Accordingly there's a lot of variable-width data and this is demanding on heap allocation.

So that's the background in vanilla terms.  The scripting layer is essentially a DSL.  It's been decided to use a strongly-typed FP paradigm for this exercise and with a leaning towards immutable data.

Hopefully this design decision will make for easier system T&A (test
and acceptance).

Now looking into different language options for constructing this DSL, no question about it; the implementing language must support GC.

So far I've considered Java and Scala but there are performance- crippling JNI issues.  Also given C language interoperability plus GC, right now D is looking like a much better bet.

Besides, aside some compiler bug issues, D looks like a lot more fun :-)


Getting back to core business, both memory and speed criteria are important for this project.  I'm hoping to use primitive data types, rather than wrapping low-level data in an object, to get a performance benefit.

Up until now, Andrei's discriminated union (aka variant) was looking like a good idea, using probably about half the memory(***) compared with the everything-is-an-object approach in Java.  For my application, I'd be using the general Algebraic form rather than the standard Variant to model my 18 primitive data types.

*** btw. Here's some miscellaneous links wrt Java object overhead:

http://www.javamex.com/tutorials/memory/object_memory_usage.shtml

http://devblog.streamy.com/2009/07/24/determine-size-of-java-object-class/

http://www.javaspecialists.eu/archive/Issue142.html


At the end of the day, if I can get this thing to fly using D, it will, at least to me, prove that D is a killer language.


Thanks for listening.

-- Justin Johansson


September 26, 2009
Justin Johansson wrote:
> At the end of the day, if I can get this thing to fly using D, it will,
> at least to me, prove that D is a killer language.

One way or another you will, and if you have difficulties let us hear of them. Practical experience with Algebraic and friends is very valuable.

Andrei
September 27, 2009
Justin Johansson wrote:
> I've had a good poke around the forums and couldn't find anything on this so ...
> 
> What's the recommended method for dispatching code off the runtime type of a variant variable
> (Phobos D2 std.variant)?
> 
> Does one use a bunch of
> 
> if ( var.peek!(type1)) { ... }
> else if ( var.peek!(type2)  { ... }
> 
> for all N possible types, or is there a better & faster way with a switch or jump table of sorts?

Variant should have an accessible typeinfo property. (When I say should, I mean that it would be appropriate for it to have it. I am not saying that I believe it has it.) It should be faster to compare that (or switch on typeinfo.name) than to peek for each type.
September 27, 2009
Andrei Alexandrescu Wrote:

> Justin Johansson wrote:
> > At the end of the day, if I can get this thing to fly using D, it will, at least to me, prove that D is a killer language.
> 
> One way or another you will, and if you have difficulties let us hear of them. Practical experience with Algebraic and friends is very valuable.
> 
> Andrei

Thanks, Andrei, for your very encouraging and supportive comments.

My feeling is that I should stick with the new design using Algebraic even if the current regime means I need to use a long series of if else if else's.

By the time I get towards finishing this project, matters may well have improved in std.variant and then I can easily upgrade my code at no cost.

OTOH, to go backtracking at this time, to one of the old schools of
(1) non-discriminated unions or (2) everything-is-an-object
just to work around a current deficency in std.variant would almost certainly
be an extremely bad software engineering design decision, and having serious
impact upon both runtime performance and code maintainability.

It would also do absolutely nothing to improve the enjoyment and quality of my life :-)

The beauty of my current design is that it appropriately addresses the 80/20 rule of where time and money should be spent in a project of this nature.

Further, and worth mentioning given another raging thread on this forum at the moment,
it turns out the ensuring type-safety of my design means that NPE's are a thing of the
past (for me at least).  This is due to strong static type checking together with runtime type
validation all for a pretty reasonable cost.

So it's pretty obvious which side of the fence I'm on in the null ref redux debate.

Cheers
Justin

September 27, 2009
Christopher Wright Wrote:

> Justin Johansson wrote:
> > I've had a good poke around the forums and couldn't find anything on this so ...
> > 
> > What's the recommended method for dispatching code off the runtime type of a variant variable (Phobos D2 std.variant)?
> > 
> > Does one use a bunch of
> > 
> > if ( var.peek!(type1)) { ... }
> > else if ( var.peek!(type2)  { ... }
> > 
> > for all N possible types, or is there a better & faster way with a switch or jump table of sorts?
> 
> Variant should have an accessible typeinfo property. (When I say should, I mean that it would be appropriate for it to have it. I am not saying that I believe it has it.) It should be faster to compare that (or switch on typeinfo.name) than to peek for each type.

Thanks, there is and I was considering that.

    TypeInfo type()
    {
        TypeInfo result;
        fptr(OpID.getTypeInfo, null, &result);
        return result;
    }

Would you need a full char-by-char string compare or could you cook it on the basis of a string memory address == comparison?
(Sorry, I'm a veteran C++'er not a fully-fledged D'er yet).