January 29, 2003
Advanced features (for future)
Hello.

It would be very good to be able to save classes to disk in a safe 
manner, so that (maybe only public?) fields can be saved and then read 
in, even if a class has been subclassed or expanded (not too hard, with 
current memory model), or even if the underlying machine is different 
(hard). But even saving would probably become much harder if powerful 
data reordering for arrays of classes is implemented.

For this i think unions are a special problem. A smart union type would 
have to be introduced (switch?), which would keep track of the active 
field and thus provide debugging capabilities. BTW, a parsing library and 
many other uses would profit from such a "switch", since it is shorter to 
write and easier to maintain than a plain union.
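
A minimal sketch of what such a "switch" type might look like, written in
C++ for illustration. Nothing here is an existing D or library feature: the
Token type, its field names, and the describe function are all made up. The
idea is only that a tag stored next to the union records the active field,
so a wrong access can be caught instead of silently reinterpreting bytes.

```cpp
#include <cassert>
#include <string>

// Hypothetical "switching union": a plain union plus a tag that records
// which field is currently active. All names are illustrative.
struct Token {
    enum class Kind { Integer, Real, Symbol };

    static Token integer(long v) { Token t; t.kind_ = Kind::Integer; t.u_.i = v; return t; }
    static Token real(double v)  { Token t; t.kind_ = Kind::Real;    t.u_.r = v; return t; }
    static Token symbol(char c)  { Token t; t.kind_ = Kind::Symbol;  t.u_.c = c; return t; }

    Kind kind() const { return kind_; }

    // Accessors check the tag, so reading an inactive field trips an
    // assertion in debug builds.
    long   asInteger() const { assert(kind_ == Kind::Integer); return u_.i; }
    double asReal()    const { assert(kind_ == Kind::Real);    return u_.r; }
    char   asSymbol()  const { assert(kind_ == Kind::Symbol);  return u_.c; }

private:
    Token() : kind_(Kind::Integer) { u_.i = 0; }
    Kind kind_;
    union { long i; double r; char c; } u_;
};

// A parser-style consumer switches on the active field.
std::string describe(const Token& t) {
    switch (t.kind()) {
        case Token::Kind::Integer: return "integer " + std::to_string(t.asInteger());
        case Token::Kind::Real:    return "real";
        case Token::Kind::Symbol:  return std::string("symbol ") + t.asSymbol();
    }
    return "unknown";
}
```

Because the tag says which field is live, a serialiser for such a type only
needs to write the tag followed by the active field.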

Another useful thing is ML-style pattern matching, which i have already 
wished for. I was thinking about a possible implementation, but then i got 
busy with other things. Yesterday i stumbled over a document describing 
*exactly this* - a C++ extension for this feature. I have only looked 
briefly at the document. Maybe their syntax is overdone, but it might be 
worth a look anyway.

http://citeseer.nj.nec.com/leung96cbased.html

-i.
January 29, 2003
Re: Advanced features (for future)
Hi, Ilya.

Ilya Minkov wrote:
> Hello.
> 
> It would be very good to be able to save classes to disk in a safe 
> manner, so that (maybe only public?) fields can be saved and then read 
> in, even if a class has been subclassed or expanded (not too hard, with 
> current memory model), or even if the underlying machine is different 
> (hard). But even saving would probably become much harder if powerful 
> data reordering for arrays of classes is implemented.
> 
> For this i think unions are a special problem. A smart union type would 
> have to be introduced (switch?), which would keep track of the active 
> field and thus provide debugging capabilities. BTW, a parsing library and 
> many other uses would profit from such a "switch", since it is shorter to 
> write and easier to maintain than a plain union.

Some of the code generators we use at work automatically create binary 
load and save functions.  In the early 90's we used them at QuickLogic, 
but we ran into difficulties maintaining binary backwards compatibility 
with our simple binary dumps.  We also found that a simple memory image 
of binary data structures typically takes up more space than a carefully 
designed ASCII format (which takes up more than a carefully designed 
binary format).

As a result, no one has used the binary load/save feature in a decade. 
It sounds cool. I even wrote code in one of the generators to do it.  It 
just hasn't been as useful as I thought it would be.

Instead of building functions like binary load/save into the language, 
I'd recommend providing the hooks for users to do it with code 
generators.  Even if there's no direct generation capability in the 
language, there are a few things that could make D work better than C++ 
does with code generators.  In particular:

- Having a way to split up class definitions into multiple parts.

For example, an 'extend' keyword in front of a class could mean we're 
adding to an existing class.  This isn't inheritance.  We'd be modifying 
a class directly rather than creating a new one.

- Do the same thing for modules, functions, variables, and class methods.

It's kind of nice for a code generator to be able to put a few fields 
here, add a few statements there, and add a couple functions to an 
existing module.  For example, the auto-generated recursive destructors 
we use were hell to write for C.  Every kind of class relationship 
supported had to be considered in big switch statements to generate all 
the different parts of the function.  Really ugly.  When targeting a 
language that supports these after-the-fact extensions, the complexity 
of the code generator was reduced tremendously.  The same code that adds 
fields to the parent and child classes also adds a few statements to the 
recursive destructor.  It's much nicer.

Extensions like these allow code generators like ClassWizard to simply 
add files to your project, and not need to modify your hand written 
files.  No more parsing the whole language to do a simple generator.  No 
more ugly /* !!! Do not edit this !!! */ machine generated crud in my files.

If you were to go the whole 9 yards, you might also allow a similar 
feature:  not just extensions... replacement!  You could use something 
like a replace keyword in front of your module or class or function or 
method or variable.

With this syntax, you can add little edit files to your projects that 
fix problems in a library you've been handed.

For example, if you run into a performance problem with a third party 
library (like that never happens ;-)) and track it down to the use of a 
singly linked list instead of doubly linked, you type a few lines of 
code in a patch file, and problem solved!  For the next ten years that 
it takes your library vendor to get around to fixing the problem, you 
have a workaround that usually works with their new releases.

What do you think?

Bill Cox
January 29, 2003
Re: Advanced features (for future)
Ilya Minkov wrote:
> It would be very good to be able to save classes to disk in a safe 
> manner, so that (maybe only public?) fields can be saved and then read 
> in, even if a class has been subclassed or expanded (not too hard, with 
> current memory model), or even if the underlying machine is different 
> (hard). But even saving would probably become much harder if powerful 
> data reordering for arrays of classes is implemented.

This is in DLI under the pickle.d module.  It transfers a class field 
image, so new and reordered fields don't matter, and handles single 
transference of pointers, references, and arrays.  The only 
non-portable part is a dependency on IEEE.

> For this i think unions are a special problem. A smart union type would 
> have to be introduced (switch?), which would keep track of the active 
> field and thus provide debugging capabilities. BTW, a parsing library and 
> many other uses would profit from such a "switch", since it is shorter to 
> write and easier to maintain than a plain union.

Unions don't get serialisation.  If you want to save a union, save the 
active state.
January 29, 2003
Re: Advanced features (for future)
Burton Radons wrote:
> Ilya Minkov wrote:
> 
>> It would be very good to be able to save classes to disk in a safe 
>> manner, so that (maybe only public?) fields can be saved and then read 
>> in, even if a class has been subclassed or expanded (not too hard, with 
>> current memory model), or even if the underlying machine is different 
>> (hard). But even saving would probably become much harder if powerful 
>> data reordering for arrays of classes is implemented.
> 
> 
> This is in DLI under the pickle.d module.  It transfers a class field 
> image, so new and reordered fields don't matter, and handles single 
> transference of pointers, references, and arrays.  The only 
> non-portable part is a dependency on IEEE.
> 

Cool. Thanks.
So it handles endianness.

> 
> Unions don't get serialisation.  If you want to save a union, save the 
> active state.
> 

OK...
But do you doubt the usefulness of a switching union?


Thanks a lot.

-i.
March 05, 2003
Re: Advanced features (for future)
Hello. Sorry it took me that long to become aware of this post. :)

Comments embedded.

-i.

Bill Cox wrote:
> Hi, Ilya.
> 
> Ilya Minkov wrote:
> 
>> Hello.
>>
>> It would be very good to be able to save classes to disk in a safe 
>> manner, so that (maybe only public?) fields can be saved and then read 
>> in, even if a class has been subclassed or expanded (not too hard, with 
>> current memory model), or even if the underlying machine is different 
>> (hard). But even saving would probably become much harder if powerful 
>> data reordering for arrays of classes is implemented.
>>
>> For this i think unions are a special problem. A smart union type would 
>> have to be introduced (switch?), which would keep track of the active 
>> field and thus provide debugging capabilities. BTW, a parsing library 
>> and many other uses would profit from such a "switch", since it is 
>> shorter to write and easier to maintain than a plain union.
> 
> 
> Some of the code generators we use at work automatically create binary 
> load and save functions.  In the early 90's we used them at QuickLogic, 
> but we ran into difficulties maintaining binary backwards compatibility 
> with our simple binary dumps.  We also found that a simple memory image 
> of binary data structures typically takes up more space than a carefully 
> designed ASCII format (which takes up more than a carefully designed 
> binary format).

Hm. You have mentioned dynamic properties a while ago. With them, you 
probably wouldn't have such difficulties.
There also has to be some framework which would allow extending the 
format, even if the serialisation code is written manually. Basic 
support for it would mean that the base class has a (stub) method for 
converting an object into a stream of data (.Serialize ?, analogous to the 
current ToHash and ToString). In the simplest case you would then 
implement this method with statements like "serstream ~ 
thisproperty.Serialize". This would also imply that .Serialize is 
implemented for the basic types. The same applies to reading.
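
A rough sketch of that idea in C++, for illustration only. The names
Serializable, serializeValue, Point and toText are invented here, and the
whitespace-separated text output is just a placeholder format, not a
proposal. Free overloads stand in for ".Serialize on the basic types", and a
class implements its own method by serialising each property in turn.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Stand-ins for ".Serialize" on the basic types (illustrative format:
// values separated by spaces, strings prefixed with their length).
void serializeValue(std::ostream& out, std::int32_t v) { out << v << ' '; }
void serializeValue(std::ostream& out, bool v) { out << (v ? 1 : 0) << ' '; }
void serializeValue(std::ostream& out, const std::string& s) {
    out << s.size() << ':' << s << ' ';
}

// Stub method on a common base class, analogous to ToString/ToHash.
struct Serializable {
    virtual ~Serializable() = default;
    virtual void serialize(std::ostream& out) const = 0;
};

// A user class serialises itself field by field, in a fixed order.
struct Point : Serializable {
    std::int32_t x = 0;
    std::int32_t y = 0;
    bool visible = true;
    std::string label;

    void serialize(std::ostream& out) const override {
        serializeValue(out, x);
        serializeValue(out, y);
        serializeValue(out, visible);
        serializeValue(out, label);
    }
};

// Convenience helper: collect the serialised form into a string.
std::string toText(const Serializable& obj) {
    std::ostringstream out;
    obj.serialize(out);
    return out.str();
}
```

Reading would mirror this with one overload per basic type that consumes a
value from the stream in the same order.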

Languages whose object methods are all dynamic seem to have one problem 
fewer here. However, an implicit serialisation sequence would also make it 
possible to interpret some data which can no longer be represented in the 
object directly due to changes.

As to the framework, XML is one example of it. Though i consider it 
appropriate for such things, i would also prefer to have an equivalent 
binary format (with conversion utilities back and forth), since it would 
work faster and take up less space.

BTW, i could make such an XML-like framework... make a function like 
ToXMLData, which would be overloaded for basic types. A user can 
overload it for his own types. And for classes, it should call the 
corresponding method of the class. It should be doable with interfaces. 
Then there would be a way to compose one XMLData out of many and to save 
it all in binary, or convert it into real XML.
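
Sketched in C++, the overloading part might look like the following. The
XmlData struct, the toXmlData overloads, the Employee type and toXmlText are
all hypothetical names chosen for this example. The sketch does no escaping
of special characters and leaves out the binary form entirely.

```cpp
#include <string>
#include <vector>

// Hypothetical in-memory tree that could later be written as XML or binary.
struct XmlData {
    std::string tag;
    std::string text;
    std::vector<XmlData> children;
};

// Overloads for basic types.
XmlData toXmlData(const std::string& tag, int v) {
    return XmlData{tag, std::to_string(v), {}};
}
XmlData toXmlData(const std::string& tag, const std::string& v) {
    return XmlData{tag, v, {}};
}

// A user type adds its own overload and composes the basic ones.
struct Employee {
    std::string name;
    int age;
};
XmlData toXmlData(const std::string& tag, const Employee& e) {
    XmlData node{tag, "", {}};
    node.children.push_back(toXmlData("name", e.name));
    node.children.push_back(toXmlData("age", e.age));
    return node;
}

// Convert the tree into XML text (no escaping; illustration only).
std::string toXmlText(const XmlData& d) {
    std::string out = "<" + d.tag + ">" + d.text;
    for (const XmlData& child : d.children) out += toXmlText(child);
    return out + "</" + d.tag + ">";
}
```

A binary writer could walk the same XmlData tree, which is what would make
conversion back and forth between the two forms straightforward.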

And i have to consider the Pizza contest. Don't expect much though since 
i'm not the major brain here and i'm only 20, i just started to study 
CS. And since i *never* eat at Pizza Hut, but rather in Restaurant 
Italy, Asado Steak, and some others. I still have over 100 restaurants 
to explore. :)

> As a result, no one has used the binary load/save feature in a decade. 
> It sounds cool. I even wrote code in one of the generators to do it.  It 
> just hasn't been as useful as I thought it would be.

For static languages binary dumps are much less useful than for dynamic ones.

> Instead of building functions like binary load/save into the language, 
> I'd recommend providing the hooks for users to do it with code 
> generators.  Even if there's no direct generation capability in the 
> language, there are a few things that could make D work better than C++ 
> does with code generators.  In particular:
> 
> - Having a way to split up class definitions into multiple parts.
> 
> For example, an 'extend' keyword in front of a class could mean we're 
> adding to an existing class.  This isn't inheritance.  We'd be modifying 
> a class directly rather than creating a new one.
> 
> - Do the same thing for modules, functions, variables, and class methods.
> 
> It's kind of nice for a code generator to be able to put a few fields 
> here, add a few statements there, and add a couple functions to an 
> existing module.  For example, the auto-generated recursive destructors 
> we use were hell to write for C.  Every kind of class relationship 
> supported had to be considered in big switch statements to generate all 
> the different parts of the function.  Really ugly.  When targeting a 
> language that supports these after-the-fact extensions, the complexity 
> of the code generator was reduced tremendously.  The same code that adds 
> fields to the parent and child classes also adds a few statements to the 
> recursive destructor.  It's much nicer.

These are all good ideas. Also consider that one could possibly have 
very few classes in the application, but very many methods to add to 
them. Then it would make sense to split up the class across multiple 
files for easy navigation and editing. This means however, that all 
these units have to be compiled simultaneously. Dependencies can be 
awful to track.

> Extensions like these allow code generators like ClassWizard to simply 
> add files to your project, and not need to modify your hand written 
> files.  No more parsing the whole language to do a simple generator.  No 
> more ugly /* !!! Do not edit this !!! */ machine generated crud in my files.
> 
> If you were to go the whole 9 yards, you might also allow a similar 
> feature:  not just extensions... replacement!  You could use something 
> like a replace keyword in front of your module or class or function or 
> method or variable.

Ouch.

> With this syntax, you can add little edit files to your projects that 
> fix problems in a library you've been handed.
> 
> For example, if you run into a performance problem with a third party 
> library (like that never happens ;-)) and track it down to the use of a 
> singly linked list instead of doubly linked, you type a few lines of 
> code in a patch file, and problem solved!  For the next ten years that 
> it takes your library vendor to get around to fixing the problem, you 
> have a workaround that usually works with their new releases.

Cool :)

> What do you think?
> 
> Bill Cox
>
March 06, 2003
Re: Advanced features (for future)
> And i have to consider the Pizza contest. Don't expect much though since 
> i'm not the major brain here and i'm only 20, i just started to study 
> CS. And since i *never* eat at Pizza Hut, but rather in Restaurant 
> Italy, Asado Steak, and some others. I still have over 100 restaurants 
> to explore. :)

You've got a lot of knowledge about computer languages for being only 
20.  Pretty impressive.  I'm 39, just old enough to have actually had a 
job programming in Fortran on a PDP-11/45.

-- Bill
March 06, 2003
Re: Advanced features (for future)
>> We also found that a simple memory image 
>> of binary data structures typically takes up more space than a carefully 
>> designed ASCII format (which takes up more than a carefully designed 
>> binary format).
>
>Hm. You have mentioned dynamic properties a while ago. With them, you 
>probably wouldn't have such difficulties.

I thought a simple example might illustrate the trouble I had with binary 
save formats.  Suppose we're saving a directed graph to disk.  Its classes 
look like:

class Node {
    LinkedList<Edge> inEdges, outEdges;
    bool visited, marked;
    char *name;
}

class Edge {
    Node fromNode, toNode;
}

Now, let's assume I have a graph that in a text file would be represented 
as:

A B C
B C E
C A D
D A B C
E B D E

The first column is node names, and the remaining symbols are 
destinations of edges.  This takes 34 bytes.

If we stream binary to the disk, I assume all Edges and Nodes wind up 
there.  Assume the LinkedList class has a head pointer, a name, and two 
Booleans that I could pack into 1 byte.  Each Node would take 7 bytes.  
Each Edge has two Node pointers and two next pointers.  They would take 
16 bytes.

On disk, the simple binary dump takes 5*7 + 12*16 = 227 bytes.  That's a 
whole lot worse than 34 bytes.
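
The arithmetic can be checked with a short C++ sketch. The 7-byte node and
16-byte edge sizes are taken from the estimate above; the function names are
made up for illustration, and the text size assumes single-space separators
and one newline per row.

```cpp
#include <cstddef>
#include <string>
#include <vector>

using Rows = std::vector<std::vector<std::string>>;

// Each row: a node name followed by the destinations of its outgoing edges.
const Rows kGraph = {
    {"A", "B", "C"},
    {"B", "C", "E"},
    {"C", "A", "D"},
    {"D", "A", "B", "C"},
    {"E", "B", "D", "E"},
};

// Size of the text form: all symbols, single spaces between them,
// and one newline at the end of every row.
std::size_t textBytes(const Rows& g) {
    std::size_t n = 0;
    for (const auto& row : g) {
        if (row.empty()) continue;
        for (const auto& sym : row) n += sym.size();
        n += row.size();  // (row.size() - 1) spaces plus 1 newline
    }
    return n;
}

// Every symbol after the first in a row is one edge.
std::size_t edgeCount(const Rows& g) {
    std::size_t n = 0;
    for (const auto& row : g) {
        if (!row.empty()) n += row.size() - 1;
    }
    return n;
}

// Size of the naive binary dump, using the per-object estimates from the post.
std::size_t dumpBytes(const Rows& g, std::size_t nodeBytes = 7,
                      std::size_t edgeBytes = 16) {
    return g.size() * nodeBytes + edgeCount(g) * edgeBytes;
}
```

For the five-node graph this gives 34 bytes of text against 5*7 + 12*16 = 227
bytes of binary dump, matching the figures above.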

As for compatibility, suppose we later on convert our LinkedList 
relationships to DoublyLinkedList.  First, the binary size gets worse, while 
the text file doesn't.  Second, we now have to write converters to be able to 
load the old binary files.  We could gain some backward compatibility by 
using an even larger binary format that tags all the fields, but what's the 
point?  Are we trying to be efficient, or just trying to avoid writing a parser?
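
For reference, a "larger binary format that tags all the fields" could look
roughly like the C++ sketch below. The one-byte tag, one-byte length layout
and the names putField and readFields are assumptions made for this example,
not a description of any real file format. A reader looks up only the tags
it knows, so fields added later do not break it, at the cost of two extra
bytes per field.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Append one field as: 1-byte tag, 1-byte length, then the payload.
// Payloads longer than 255 bytes are rejected in this sketch.
void putField(Bytes& out, std::uint8_t tag, const std::string& payload) {
    if (payload.size() > 255) throw std::length_error("payload too long");
    out.push_back(tag);
    out.push_back(static_cast<std::uint8_t>(payload.size()));
    out.insert(out.end(), payload.begin(), payload.end());
}

// Read all complete fields into a tag -> payload map. Parsing stops at a
// truncated field. Callers simply ignore tags they do not recognise.
std::map<std::uint8_t, std::string> readFields(const Bytes& in) {
    std::map<std::uint8_t, std::string> fields;
    std::size_t pos = 0;
    while (pos + 2 <= in.size()) {
        const std::uint8_t tag = in[pos];
        const std::size_t len = in[pos + 1];
        pos += 2;
        if (pos + len > in.size()) break;  // truncated field
        fields[tag] = std::string(in.begin() + pos, in.begin() + pos + len);
        pos += len;
    }
    return fields;
}
```

A single-character node name costs three bytes in this layout instead of
one, which is the size penalty the paragraph above is pointing at.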

File size isn't important for most apps.  Look at how large MS Word files 
are.  No one cares.  I work with design files representing .13u chips.  A 
small file for us might be 100 meg.  Not only does the text version reduce 
the size, but our users demand text so they can hack our data structures 
with Perl scripts.

Bill
March 09, 2003
Re: Advanced features (for future)
"Bill Cox" <bill@viasic.com> wrote in message
news:3E6734FB.5060406@viasic.com...
> I'm 39, just old enough to have actually had a
> job programming in Fortran on a PDP-11/45.

Been there, done that <g>.
March 09, 2003
Re: Advanced features (for future)
"Bill Cox" <Bill_member@pathlink.com> wrote in message
news:b47fib$5io$1@digitaldaemon.com...
> File size isn't important for most apps.  Look at how large MS Word files
> are.  No one cares.  I work with design files representing .13u chips.  A
> small file for us might be 100 meg.  Not only does the text version reduce
> the size, but our users demand text so they can hack our data structures
> with Perl scripts.

You hit on a big advantage with text files - they can be checked visually
for correctness, and can be edited with ordinary text editors. Binary files
require a custom dumper/editor to be written.

One reason I don't use .doc files is because I need a specific version of
the word processor installed to read them. 20 years from now, who will have
that? (Yes, I have 20 year old files I still use.) With ascii text format,
I'm covered.