June 14, 2009
Hello grauzone,

> BCS wrote:
> 
>>> That's why I'd still require types to be marked as serializeable by
>>> the programmer.
>>> 
>> How would you do that aside from mixins?
>> 
> Make the user implement a marker interface, 

That would work (for classes), but IMHO the side effects are more invasive, and it still ends up mucking with internal state from the outside.

> or let him provide a
> (single) special class member which fulfills the same function,

What's the difference between one and two?

> or
> introduce annotations into the language.

NO, not an option.

> As far as marking goes, a
> mixin would be OK too, but as I said, I don't like adding arbitrary
> members into the user's scope without the user knowing.
> 

How about if what the mixins add is documented?
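
To make the two options concrete, here is a minimal D2 sketch. Every name in it (Serializable, SerializableMixin, dump) is made up for illustration and is not taken from the library under discussion. It only shows the trade-off: the marker interface adds nothing to the type but is limited to classes, while the mixin works for structs too but injects a member that then has to be documented.

```d
import std.conv : to;

// All names here (Serializable, SerializableMixin, dump) are hypothetical.

// Option A: a marker interface. It adds no members, but only classes
// can implement it.
interface Serializable {}

// Option B: a mixin template. Works for structs and classes alike, but
// it injects members into the user's type, so they need documenting.
mixin template SerializableMixin()
{
    /// The only member this mixin adds: a textual dump of the direct fields.
    string dump()
    {
        string s = typeof(this).stringof ~ "(";
        foreach (i, field; this.tupleof)
        {
            static if (i > 0) s ~= ", ";
            s ~= to!string(field);
        }
        return s ~ ")";
    }
}

struct Point
{
    int x, y;
    mixin SerializableMixin;
}

class Tagged : Serializable
{
    int id = 42;
}

void main()
{
    // The marker can be checked at compile time; a struct can never carry it.
    static assert(is(Tagged : Serializable));
    static assert(!is(Point : Serializable));

    assert(Point(3, 4).dump() == "Point(3, 4)");
}
```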

>>> At least for all
>>> those structs, it's truly annoying and unneeded. Having to use
>>> different mixins for structs/classes sucks even more (wouldn't be
>>> required if you wanted).
>> It isn't required, the difference is not struct vs. class but "do you
>> care about repeated references to instances of this type?"
>> 
> IMHO a relatively useless optimization, that will only confuse the
> user. It will introduce subtle bugs if objects accidentally are
> "repeated" instead of serialized only once.
> 

Well, I can switch the default but, in my experience, most of the time repetition doesn't matter. I also disagree on the "relatively useless optimization" bit: tracking repeated references adds some not exactly trivial overhead in about 3 or 4 different places.

>>> By the way, how are you going to solve the other problems when
>>> serializing D types? Here are some troublesome types:
>>> - pointers?
>> class references are pointers, for struct and what not, I'll just use
>> a similar approach
>> 
> That's not really the problem here. You have no idea where a pointer
> points to. Is it a pointer to a new'ed memory block? Does it point
> into an object? Does it point into an array? Into the data segment?
> 
> The GC provides no API to find out. You may be able to handle some
> cases above, but not all.
> 

What I can't handle will be documented as unsupported.

>>> - function pointers?
>>> - delegates?
>> I won't
>> 
> Forces the user to use interfaces instead of delegates.

Interfaces are not supported either.

>>> - referential integrity across arrays and slices?
>>> 
>> I won't
>> 
> I wonder if anyone (I mean, especially user of a serialization
> library) would disagree with that choice. Sure, there are valid D
> programs that would break with this, but is relying on this really
> clean?
> 

That's what I'm hoping for. D is too low level for setting up serialization to be completely automatic. The programmer will have to be careful how things are set up. My goal is to make that as easy as I can.

>> First off there is not enough information to correctly generate
>> sterilizers for types. So the user has to do some thinking for most
>> types. How do I make the user think about how and for what types
>> code is being generated?
>> 
>> Option 1) have the template run a pragma(msg, T.stringof) for each
>> type
>> and hope the user doesn't just ignore the list.
>> Option 2) do nothing and be sure the user won't see anything.
>> Option 3) Make the user write per type code and error when they
>> don't.
>
> Oh, you mean if there are types in the object graphs the serializer
> can't deal with it?

Almost, I think. I'm looking at the case where a new type is /added/ to the object graph at some point removed from the use of the sterilizer. Someone needs to be informed that new types are being serialized and that they need to make sure it's working correctly.

> But how would option 2) work?

It doesn't.


June 14, 2009
BCS wrote:
>> introduce annotations into the language.
> 
> NO, not an option.

What, why? Sure, this is not a realistic option.

> Well, I can switch the default but, in my experience, most of the time repetition doesn't matter. I also disagree on the "relatively useless

Oh really?

> optimization" bit, it adds some not exactly trivial overhead in about 3 or 4 different places.

Maybe it costs a hash table lookup, but apart from that, you're saving space and time for marshaling additional instances. Of course, this is different with structs. But structs are value types.
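
For reference, the hash table lookup being weighed here can be sketched as below. This is only an illustration under assumptions: Node and Writer are invented names, the output format is made up, and the table is keyed on the object's address, which relies on D's garbage collector not moving objects. Each distinct object is written once, and any later encounter becomes a back-reference.

```d
import std.conv : to;

class Node
{
    int value;
    Node next;
    this(int value) { this.value = value; }
}

// Hypothetical writer: each distinct object is emitted once; later
// encounters become a back-reference to the id assigned the first time.
struct Writer
{
    size_t[void*] seen;   // object address -> assigned id
    string output;

    void put(Node n)
    {
        if (n is null) { output ~= "null"; return; }

        auto key = cast(void*) n;
        if (auto known = key in seen)   // the hash table lookup
        {
            output ~= "ref#" ~ to!string(*known);
            return;
        }

        immutable id = seen.length;
        seen[key] = id;
        output ~= "obj#" ~ to!string(id) ~ "(" ~ to!string(n.value) ~ ") -> ";
        put(n.next);
    }
}

void main()
{
    auto a = new Node(1);
    auto b = new Node(2);
    a.next = b;
    b.next = a;   // cycle: without the table, put() would recurse forever

    Writer w;
    w.put(a);
    assert(w.output == "obj#0(1) -> obj#1(2) -> ref#0");
}
```

The cost is one lookup per object reference plus the table itself; the benefit shows up only when the same instance is reachable more than once.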

>>>> - function pointers?
>>>> - delegates?
>>> I won't
>>>
>> Forces the user to use interfaces instead of delegates.
> 
> interfaces are not supported either.

But supporting interfaces would be very simple.
June 14, 2009
Hello grauzone,

> BCS wrote:
> 
>>> introduce annotations into the language.
>>> 
>> NO, not an option.
>> 
> What, why? Sure, this is not a realistic option.
> 

D1 is fixed, and D2 will be in the next few months. I'm not going to even think of targeting D3 at this point. I'm writing this to be used, not as a theoretical construct.

>> Well, I can switch the default but, in my experience, most of the
>> time repetition doesn't matter. I also disagree on the "relatively
>> useless
>> 
> Oh really?
> 

I haven't used a graph data structure in some time. Most of them have been trees. And in the cases I can think of, the repeated reference bit has been central to the structure, so the chances of getting it wrong (or of missing it under test) are about nil.

>> optimization" bit, it adds some not exactly trivial overhead in about
>> 3 or 4 different places.
>> 
> Maybe it costs a hash table lookup, but apart from that, you're saving
> space and time for marshaling additional instances. Of course, this is
> different with structs. But structs are value types.
> 

Which side are you arguing there?

OTOH pointers to struct are not value types...

>> interfaces are not supported either.
>> 
> But supporting interfaces would be very simple.
> 

It wouldn't be hard in the current form (you would add a mixin to the interface as well) but the non-mixin, outside in approach would have all sorts of interesting issues like how to get the correct sterilizer function.


June 16, 2009
BCS wrote:
>>> Well, I can switch the default but, in my experience, most of the
>>> time repetition doesn't matter. I also disagree on the "relatively
>>> useless
>>>
>> Oh really?
>>
> 
> I haven't used a graph data structure in some time. Most of them have been trees. And in the cases I can think of, the repeated reference bit has been central to the structure, so the chances of getting it wrong (or of missing it under test) are about nil.

IMO, most tree-like structures are still full graphs in memory, because they often contain "parent" pointers, or point back to a parent indirectly (e.g. even if a generic tree data structure is implemented without parent pointers, the data element itself might contain such pointers).

> OTOH pointers to struct are not value types...

Pointers are a whole different thing. A pointer can still point to a "value" type, because that struct might be embedded within an object (a class member that's a struct).

>>> interfaces are not supported either.
>>>
>> But supporting interfaces would be very simple.
>>
> 
> It wouldn't be hard in the current form (you would add a mixin to the interface as well) but the non-mixin, outside in approach would have all sorts of interesting issues like how to get the correct sterilizer function.

Huh? You can simply cast the interface to an object, and then cast the object to the serializable type. You need to be able to do that anyway, because object references can be of the type "Object", and there's no way you'd add your serialize mixin to Object.
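
A minimal sketch of the two casts, assuming an ordinary D interface (not COM or extern(C++)); Shape and Square are made-up names. Note that the second cast has to name its target type in the source.

```d
interface Shape
{
    double area();
}

class Square : Shape
{
    double side;
    this(double side) { this.side = side; }
    double area() { return side * side; }
}

void main()
{
    Shape s = new Square(2.0);

    // Interface -> Object: works for ordinary D interfaces.
    Object o = cast(Object) s;
    assert(o !is null);

    // Object -> concrete class: a checked downcast that yields null on
    // a mismatch. The target type (Square) is fixed at compile time.
    auto sq = cast(Square) o;
    assert(sq !is null && sq.side == 2.0);

    // A mismatched downcast does not throw, it just returns null.
    assert(cast(Exception) o is null);
}
```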

Also, is your writing "sterilizer" a typo or not?
June 16, 2009
Hello grauzone,

> BCS wrote:
> 
>>>> Well, I can switch the default but, in my experience, most of the
>>>> time repetition doesn't matter. I also disagree on the "relatively
>>>> useless
>>>> 
>>> Oh really?
>>> 
>> I haven't used a graph data structure in some time. Most of them have
>> been trees. And the cases I can think of, the repeated reference bit
>> has been central to the structure so the chances of getting it wrong
>> (or of missing it under test) are about nil.
>> 
> IMO, most tree-like structures are still full graphs in memory,
> because they often contain "parent" pointers, or point back to a
> parent indirectly (e.g. even if a generic tree data structure is
> implemented without parent pointers, the data element itself might
> contain such pointers).

I'm referring to data structures that I could add serialization to, i.e. ones where I would know if they have parent references. I still stand by my assertion.

> 
>> OTOH pointers to struct are not value types...
>> 
> Pointers are a whole different thing. A pointer can still point to a
> "value" type, because that struct might be embedded within an object
> (a class member that's a struct).
> 

Pointers to members won't be supported any time soon.

>>>> interfaces are not supported either.
>>>> 
>>> But supporting interfaces would be very simple.
>>> 
>> It wouldn't be hard in the current form (you would add a mixin to the
>> interface as well) but the non-mixin, outside in approach would have
>> all sorts of interesting issues like how to get the correct
>> sterilizer function.
>> 
> Huh? You can simply cast the interface to an object.

That is not safe. Not all interface instances are D objects.

> And then cast the
> object to the serializeable type.

Cast only works if you know /at compile time/ what type to cast to so I don't think that's going to work. 

> You need to be able to do that
> anyway, because object references can be of the type "Object", and
> there's no way you'd add your serialize mixin to Object.

And that just brought up another issue: how do you serialize a class that only ever shows up as a base class reference? The lib has no way to /find/ the type at compile time so it has no way to generate code to deal with it.

> 
> Also, is you writing "sterilizer" a typo or not?

Typo (is it in the lib or just this thread?). I'd be even worse without a spellchecker :(


June 16, 2009
>> Huh? You can simply cast the interface to an object.
> 
> That is not safe. Not all interface instances are D objects.

There are people who care for COM and C++ interfaces? COM is Windows specific, and C++ vtables are... uh, I don't know, platform/architecture/compiler vendor specific?

In any case, serializable objects shouldn't contain references to such interfaces in the first place.

> And that just brought up another issue: how do you serialize a class that only ever shows up as a base class reference? The lib has no way to /find/ the type at compile time so it has no way to generate code to deal with it.

But you already handle this. One of your mixins contains a static this ctor (which, btw., makes it impossible to use serializable types in cyclic dependent modules). It seems right now this ctor is only for registering the demarshaller function, but the same can be done with the marshaller function.
June 17, 2009
Hello grauzone,

>>> Huh? You can simply cast the interface to an object.
>>> 
>> That is not safe. Not all interface instances are D objects.
>> 
> There are people who care for COM and C++ interfaces? COM is Windows
> specific, and C++ vtables are... uh, I don't know,
> platform/architecture/compiler vendor specific?
> 
> In any case, serializable objects shouldn't contain references to such
> interfaces in the first place.
> 

I think there are ways to have a D interface implemented by a non D object.

>> And that just brought up another issue: how do you serialize a class
>> that only ever shows up as a base class reference? The lib has no way
>> to /find/ the type at compile time so it has no way to generate code
>> to deal with it.
>> 
> But you already handle this. One of your mixins contains a static this
> ctor (which, btw., makes it impossible to use serializable types in
> cyclic dependent modules). It seems right now this ctor is only for
> registering the demarshaller function, but the same can be done with
> the marshaller function.
> 

The demarshaller function is indexed via a string derived from the original object. What would the marshaller function key on? The best I can think of right now is the typeinfo and, as of now, that's broken under DLLs.
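
One way the registration idea could look in D2, sketched under assumptions: the registry is keyed on the class name string from typeid(obj).name rather than on the TypeInfo instance itself, and all names (marshallers, Registered, marshalThis, marshal) are invented for the example rather than taken from the library. The static constructor in the mixin registers a per-type function, so an object seen only through a base class reference can still be dispatched at runtime.

```d
import std.conv : to;

alias Marshaller = string function(Object);

// Hypothetical registry: dynamic class name -> marshalling function.
Marshaller[string] marshallers;

mixin template Registered()
{
    // Runs before main(); registers this type under its class name.
    static this()
    {
        marshallers[typeid(typeof(this)).name] = &marshalThis;
    }

    // Per-type function: dumps the fields declared directly in this class.
    static string marshalThis(Object o)
    {
        auto self = cast(typeof(this)) o;
        string s = typeof(this).stringof ~ "(";
        foreach (i, field; self.tupleof)
        {
            static if (i > 0) s ~= ", ";
            s ~= to!string(field);
        }
        return s ~ ")";
    }
}

class Base
{
    int id = 1;
    mixin Registered;
}

class Derived : Base
{
    int extra = 7;
    mixin Registered;
}

// Dispatch on the dynamic type's name, not on the static reference type.
string marshal(Object o)
{
    auto fn = typeid(o).name in marshallers;
    return fn ? (*fn)(o) : "<unregistered>";
}

void main()
{
    Base b = new Derived;               // only ever seen as a Base reference
    assert(marshal(b) == "Derived(7)");
    assert(marshal(new Base) == "Base(1)");
    assert(marshal(new Object) == "<unregistered>");
}
```

Because the key is a string, it does not depend on two TypeInfo instances being identical, though it does assume both sides agree on the fully qualified class names.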

---

The solution I have now works. Unless someone can show an intractable problem with it I'm keeping it. The only significant issue so far voiced is the concern about what all that mixin is adding and I think I can kill that one with some minor refactoring and documentation.

P.S. Could you please not delete the citation line from the quote? It makes it easier (at least with good NG clients) for people to find replies to their posts.


June 17, 2009
On Tue, Jun 16, 2009 at 8:50 PM, BCS<none@anon.com> wrote:
> Hello grauzone,
>
>>>> Huh? You can simply cast the interface to an object.
>>>>
>>> That is not safe. Not all interface instances are D objects.
>>>
>> There are people who care for COM and C++ interfaces? COM is Windows specific, and C++ vtables are... uh, I don't know, platform/architecture/compiler vendor specific?
>>
>> In any case, serializable objects shouldn't contain references to such interfaces in the first place.
>>
>
> I think there are ways to have a D interface implemented by a non D object.

He listed the only two possibilities: COM objects and extern(C++)
interfaces in D2.

typeid(Interface).classinfo.flags & 1 tells you whether an interface is COM, at least.
I don't know if there is any runtime info to indicate whether an
interface is C++ or not.
June 18, 2009
BCS wrote:

> The demarshaller function is indexed via a string derived from the original object. What would the marshaller function key on? The best I can think of right now is the typeinfo and as of now, that's broken under DLLs

DLLs are broken in general. There are many more problems associated with them, and you won't be happy with them. I write all my code with the assumption in mind, that TypeInfos/ClassInfos for the same type always are the same instance.
June 18, 2009
Reply to grauzone,

> BCS wrote:
> 
>> The demarshaller function is indexed via a string derived from the
>> original object. What would the marshaller function key on? The best
>> I can think of right now is the typeinfo and as of now, that's broken
>> under DLLs
>> 
> DLLs are broken in general. There are many more problems associated
> with them, and you won't be happy with them. I write all my code with
> the assumption in mind, that TypeInfos/ClassInfos for the same type
> always are the same instance.
> 

On second pass, even putting DLLs aside, I can't count on typeinfo being the same on both sides, because I can't even count on them being in the same process or exe, or even under the same compiler, OS or CPU.

Can you get the mangled name of an object instance at runtime via typeinfo?