Thread overview
What is the preferred method to write a class to a file?
Jul 23, 2006
Charles D Hixson
Jul 23, 2006
Stewart Gordon
Jul 23, 2006
Charles D Hixson
Jul 23, 2006
Chad J
Jul 23, 2006
Charles D Hixson
Jul 24, 2006
Charles D Hixson
Jul 24, 2006
Regan Heath
Jul 24, 2006
Charles D Hixson
Jul 24, 2006
Regan Heath
Jul 25, 2006
Regan Heath
Jul 24, 2006
xs0
Jul 24, 2006
Charles D Hixson
Jul 24, 2006
Derek
Jul 25, 2006
Charles D Hixson
Jul 25, 2006
BCS
Jul 26, 2006
Charles D Hixson
Jul 26, 2006
kris
Jul 26, 2006
Charles D Hixson
July 23, 2006
I'm thinking that I should build two methods, possibly called write and print where write would export a "binary" version of the class that is not intended to be human readable and print would export something with, e.g., numbers translated into strings.

It appears as if the methods should be instance methods of
the class.  It looks as if write should be implemented
something like:
void write(Stream stream)
{  stream.writeExact(&this, this.sizeof);	}

Though probably in this version I'd want to include delimiters, a type id, and a length to facilitate writing the corresponding read routine (which would need to be a class method).

Is this the best approach?  Also are there any suggestions as to a reasonable way to specify the type id, so that it would be the same from run to run?  Should I keep track of the ids myself (manually)?  If not, how would I know on an attempt to read which class the type id referred to?
July 23, 2006
Charles D Hixson wrote:
> I'm thinking that I should build two methods, possibly
> called write and print where write would export a "binary"
> version of the class that is not intended to be human
> readable and print would export something with, e.g.,
> numbers translated into strings.
> 
> It appears as if the methods should be instance methods of
> the class.  It looks as if write should be implemented
> something like:
> void write(Stream stream)
> {  stream.writeExact(&this, this.sizeof);	}

That won't work at all.  Because classes have reference semantics, all it'll do is write out the memory address of the object.

Even so, the arrangement of members within a class isn't guaranteed:

http://www.digitalmars.com/d/class.html
"The D compiler is free to rearrange the order of fields in a class to optimally pack them in an implementation-defined manner. Consider the fields much like the local variables in a function - the compiler assigns some to registers and shuffles others around all to get the optimal stack frame layout. This frees the code designer to organize the fields in a manner that makes the code more readable rather than being forced to organize it according to machine optimization rules. Explicit control of field layout is provided by struct/union types, not classes."

Moreover, every object includes a pointer to the vtable, which will screw things up when the program is run again and the data is read back in.  Add to that any pointers, dynamic arrays or object references that your class may contain....

> Though probably in this version I'd want to include
> delimiters, a type id, and a length to facilitate writing
> the corresponding read routine (which would need to be a
> class method).

You need to define a data format that includes all the information that is needed in order to reconstruct the object.  Start with a type ID of your own devising (if there's any chance that more than one type fits the context), and write out each member.  For primitive types, static arrays and structs that have no reference-semantics members, this is trivial.  For dynamic arrays, write out the length followed by the contents.  For object references within the class, follow this principle recursively.  Be careful of any potential circularity.
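
A minimal sketch of such a format, assuming present-day D (D2) and Phobos's std.bitmanip rather than the 2006-era std.stream used elsewhere in this thread. The names Vec and vecTypeId are hypothetical, the byte order is fixed as little-endian by assumption, and nested object references and circularity are not handled here:

```d
import std.bitmanip : littleEndianToNative, nativeToLittleEndian;

// Hypothetical hand-assigned type ID; it must stay the same from run to run.
enum uint vecTypeId = 1;

class Vec
{
    float[] data;

    this(float[] data) { this.data = data; }

    // Append the type ID, the element count, then each element to buf.
    void write(ref ubyte[] buf)
    {
        ubyte[4] id = nativeToLittleEndian(vecTypeId);
        ubyte[8] len = nativeToLittleEndian(cast(ulong) data.length);
        buf ~= id[];
        buf ~= len[];
        foreach (f; data)
        {
            ubyte[4] raw = nativeToLittleEndian(f);
            buf ~= raw[];
        }
    }

    // Rebuild an object from the front of buf (a static, i.e. class-level, method).
    static Vec read(const(ubyte)[] buf)
    {
        ubyte[4] id = buf[0 .. 4];
        assert(littleEndianToNative!uint(id) == vecTypeId, "unexpected type ID");
        ubyte[8] len = buf[4 .. 12];
        auto n = cast(size_t) littleEndianToNative!ulong(len);
        auto result = new float[n];
        foreach (i; 0 .. n)
        {
            ubyte[4] raw = buf[12 + 4 * i .. 16 + 4 * i];
            result[i] = littleEndianToNative!float(raw);
        }
        return new Vec(result);
    }
}

unittest
{
    ubyte[] buf;
    auto v = new Vec([1.5f, -2.0f]);
    v.write(buf);
    assert(Vec.read(buf).data == [1.5f, -2.0f]);
}
```

With more than one record type, the reader would presumably look at the type ID first and dispatch to the matching read routine.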

> Is this the best approach?   Also are there any suggestions
> as to a reasonable way to specify the type id, so that it
> would be the same from run to run? 

Only the one you suggest next:

> Should I keep track of the ids myself (manually)?
<snip>

Yes.

Stewart.

-- 
-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GCS/M d- s:-@ C++@ a->--- UB@ P+ L E@ W++@ N+++ o K-@ w++@ O? M V? PS- PE- Y? PGP- t- 5? X? R b DI? D G e++++ h-- r-- !y
------END GEEK CODE BLOCK------

My e-mail is valid but not my primary mailbox.  Please keep replies on the 'group where everyone may benefit.
July 23, 2006
Stewart Gordon wrote:
> Charles D Hixson wrote:
>> I'm thinking that I should build two methods, possibly called write and print where write would export a "binary" version of the class that is not intended to be human readable and print would export something with, e.g., numbers translated into strings.
>>
>> It appears as if the methods should be instance methods of
>> the class.  It looks as if write should be implemented
>> something like:
>> void write(Stream stream)
>> {  stream.writeExact(&this, this.sizeof);    }
> 
> That won't work at all.  Because classes have reference semantics, all it'll do is write out the memory address of the object.
> 
> Even so, the arrangement of members within a class isn't guaranteed:
> 
> http://www.digitalmars.com/d/class.html
> "The D compiler is free to rearrange the order of fields in a class to
> optimally pack them in an implementation-defined manner. Consider the
> fields much like the local variables in a function - the compiler
> assigns some to registers and shuffles others around all to get the
> optimal stack frame layout. This frees the code designer to organize the
> fields in a manner that makes the code more readable rather than being
> forced to organize it according to machine optimization rules. Explicit
> control of field layout is provided by struct/union types, not classes."
> 
> Moreover, every object includes a pointer to the vtable, which will screw things up when the program is run again and the data is read back in.  Add to that any pointers, dynamic arrays or object references that your class may contain....
> 
>> Though probably in this version I'd want to include delimiters, a type id, and a length to facilitate writing the corresponding read routine (which would need to be a class method).
> 
> You need to define a data format that includes all the information that is needed in order to reconstruct the object.  Start with a type ID of your own devising (if there's any chance that more than one type fits the context), and write out each member.  For primitive types, static arrays and structs that have no reference-semantics members, this is trivial.  For dynamic arrays, write out the length followed by the contents.  For object references within the class, follow this principle recursively.  Be careful of any potential circularity.
> 
>> Is this the best approach?   Also are there any suggestions as to a reasonable way to specify the type id, so that it would be the same from run to run?
> 
> Only the one you suggest next:
> 
>> Should I keep track of the ids myself (manually)?
> <snip>
> 
> Yes.
> 
> Stewart.
> 
Thanks. (Sigh...I guess this is all tied in with introspection.)
July 23, 2006
Charles D Hixson wrote:
> I'm thinking that I should build two methods, possibly
> called write and print where write would export a "binary"
> version of the class that is not intended to be human
> readable and print would export something with, e.g.,
> numbers translated into strings.
> 
> It appears as if the methods should be instance methods of
> the class.  It looks as if write should be implemented
> something like:
> void write(Stream stream)
> {  stream.writeExact(&this, this.sizeof);	}
> 
> Though probably in this version I'd want to include
> delimiters, a type id, and a length to facilitate writing
> the corresponding read routine (which would need to be a
> class method).
> 
> Is this the best approach?  Also are there any suggestions
> as to a reasonable way to specify the type id, so that it
> would be the same from run to run?  Should I keep track of
> the ids myself (manually)?  If not, how would I know on an
> attempt to read which class the type id referred to?

There was a discussion about this a while ago that had some suggestions, at least for the binary approach:
http://www.digitalmars.com/d/archives/digitalmars/D/37739.html

Hope it helps.
July 23, 2006
Chad J wrote:
> Charles D Hixson wrote:
>>...class method).
>>
>> Is this the best approach?  Also are there any suggestions as to a reasonable way to specify the type id, so that it would be the same from run to run?  Should I keep track of the ids myself (manually)?  If not, how would I know on an attempt to read which class the type id referred to?
> 
> There was a discussion about this a while ago that had some suggestions, at least for the binary approach: http://www.digitalmars.com/d/archives/digitalmars/D/37739.html
> 
> Hope it helps.
Well, before I read it I had written (slightly trimmed):
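import std.stream;   // D1 Phobos stream module

const uint eor = 0x19191919;   // end-of-record marker

class nnv
{
    protected float[] vec;
    const char[4] sig = "nvec";   // type signature

    // Writes one record and returns the position at which it starts.
    ulong write(Stream s)
    in
    {
        assert(s.isOpen());
        assert(s.seekable);
    }
    body
    {
        ulong oStart = s.position;
        s.write(cast(ulong) 0);   // placeholder for the record length
        s.write(sig[0]);  s.write(sig[1]);
        s.write(sig[2]);  s.write(sig[3]);
        s.write(cast(ulong) this.vec.length);
        foreach (float f; this.vec)  s.write(f);
        s.write(eor);
        ulong oEnd = s.position;
        s.position = oStart;   // go back and patch in the real length
        s.write(cast(ulong) (oEnd - oStart));
        s.position = oEnd;
        return oStart;
    }
}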

So I guess that we're heading in the same direction (except that for this class a separate struct didn't make much sense).  I should probably calculate the overhead so that I don't need to go back and forth to write the length.  I return the position to make it easy to create an index, and include length, type, and eor to make it feasible to rebuild an index if the current one becomes corrupt.  (Theoretically I shouldn't need the eor...but I grew up with tape parity errors, and anyway any compression method would reduce that.)
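
Calculating the record length up front might look like the sketch below. It assumes the layout used in the write method above (8-byte length, 4-byte signature, 8-byte element count, 4 bytes per float, 4-byte end-of-record marker); recordSize is a hypothetical helper, not something from the library:

```d
// Hypothetical helper: total bytes in one record for the layout
// length (ulong) + signature (4 chars) + count (ulong) + floats + eor (uint).
ulong recordSize(size_t elementCount)
{
    return ulong.sizeof                  // record length field
         + 4 * char.sizeof               // "nvec" signature
         + ulong.sizeof                  // element count
         + elementCount * float.sizeof   // the vector itself
         + uint.sizeof;                  // end-of-record marker
}

unittest
{
    assert(recordSize(0) == 24);   // fixed overhead only
    assert(recordSize(3) == 36);   // overhead plus three floats
}
```

With the size known before the first write, the length field can be written once and the stream never needs to seek backwards.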

The problems with this solution come in scaling.  Consider what would happen if one were dealing with many different kinds of record...and some were composite.  Do-able doesn't mean elegant.

This basic design is limited in the number of kinds of record it can handle...but long before that limit is reached it would be paralyzed by the clumsiness.

P.S.:  Is there a standard library routine for converting between strings of length 4 and uint-s?  If so I wasn't able to find it.  If not, I wasn't able to determine that it didn't exist.  (That would have made writing the sig more efficient.)
July 23, 2006
"Charles D Hixson" <charleshixsn@earthlink.net> wrote in message news:ea0rll$5h9$1@digitaldaemon.com...

> P.S.:  Is there a standard library routine for converting between strings of length 4 and uint-s?  If so I wasn't able to find it.  If not, I wasn't able to determine that it didn't exist.  (That would have made writing the sig more efficient.)

You don't even need a routine to do it.

char[] sig = "help";
uint s = *cast(uint*)sig.ptr;

And the other way..

uint s = 0xAABBCCDD;
char[] sig = new char[4];
*cast(uint*)sig.ptr = s;


July 24, 2006
Jarrett Billingsley wrote:
> "Charles D Hixson" <charleshixsn@earthlink.net> wrote in message news:ea0rll$5h9$1@digitaldaemon.com...
> 
>> P.S.:  Is there a standard library routine for converting between strings of length 4 and uint-s?  If so I wasn't able to find it.  If not, I wasn't able to determine that it didn't exist.  (That would have made writing the sig more efficient.)
> 
> You don't even need a routine to do it.
> 
> char[] sig = "help";
> uint s = *cast(uint*)sig.ptr;
> 
> And the other way..
> 
> uint s = 0xAABBCCDD;
> char[] sig = new char[4];
> *cast(uint*)sig.ptr = s;
> 
> 
O, dear.  Yes, I see it.
But one of the things that cause me to prefer D over C is
the ability to avoid pointer manipulation, which to me seems
extremely hazardous and, when one gets beyond simple cases,
quite confusing.  (And unmaintainable.)
July 24, 2006
On Sun, 23 Jul 2006 17:26:57 -0700, Charles D Hixson <charleshixsn@earthlink.net> wrote:
> Jarrett Billingsley wrote:
>> "Charles D Hixson" <charleshixsn@earthlink.net> wrote in message
>> news:ea0rll$5h9$1@digitaldaemon.com...
>>
>>> P.S.:  Is there a standard library routine for converting
>>> between strings of length 4 and uint-s?  If so I wasn't able
>>> to find it.  If not, I wasn't able to determine that it
>>> didn't exist.  (That would have made writing the sig more
>>> efficient.)
>>
>> You don't even need a routine to do it.
>>
>> char[] sig = "help";
>> uint s = *cast(uint*)sig.ptr;
>>
>> And the other way..
>>
>> uint s = 0xAABBCCDD;
>> char[] sig = new char[4];
>> *cast(uint*)sig.ptr = s;
>>
>>
> O, dear.  Yes, I see it.
> But one of the things that cause me to prefer D over C is
> the ability to avoid pointer manipulation, which to me seems
> extremely hazardous and, when one gets beyond simple cases,
> quite confusing.  (And unmaintainable.)

If it works, then I say put it in a function and ignore 'how' it works. More than likely it will be inlined and you'll never need to worry about it again.

Regan
July 24, 2006
On Sun, 23 Jul 2006 17:26:57 -0700, Charles D Hixson <charleshixsn@earthlink.net> wrote:
> Jarrett Billingsley wrote:
>> "Charles D Hixson" <charleshixsn@earthlink.net> wrote in message
>> news:ea0rll$5h9$1@digitaldaemon.com...
>>
>>> P.S.:  Is there a standard library routine for converting
>>> between strings of length 4 and uint-s?  If so I wasn't able
>>> to find it.  If not, I wasn't able to determine that it
>>> didn't exist.  (That would have made writing the sig more
>>> efficient.)
>>
>> You don't even need a routine to do it.
>>
>> char[] sig = "help";
>> uint s = *cast(uint*)sig.ptr;
>>
>> And the other way..
>>
>> uint s = 0xAABBCCDD;
>> char[] sig = new char[4];
>> *cast(uint*)sig.ptr = s;
>>
>>
> O, dear.  Yes, I see it.
> But one of the things that cause me to prefer D over C is
> the ability to avoid pointer manipulation, which to me seems
> extremely hazardous and, when one gets beyond simple cases,
> quite confusing.  (And unmaintainable.)

Some alternatives to consider...

import std.stdio;

uint char_to_uint(char[] str)
{
	return *cast(uint*)str.ptr;
}

uint char_to_uint_a(char[] str)
{
	uint i = 0;
	for(int j = 3; j >= 0; j--) {
		i <<= 8;
		i |= str[j];
	}
	return i;
}

/*
Option #1
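Note: the .dup here is redundant because 'str' is already a fresh heap array; it is only required when slicing the local 'i' directly, which becomes invalid after the function returns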
char[] uint_to_char(uint i)
{
	char[] str = new char[4];
	*cast(uint*)str.ptr = i;
	return str.dup;
}

Option #2
Note: no dup required, we're copying the data to the new array.
char[] uint_to_char(uint i)
{
	char[] str = new char[4];
	str[] = (cast(char*)&i)[0..4];
	return str;
}
*/

//Note: same as #1 but involves 1 less temporary array.
char[] uint_to_char(uint i)
{
	return (cast(char*)&i)[0..4].dup;
}

char[] uint_to_char_a(uint i)
{
	char[] str = new char[4];
	foreach(inout c; str) {
		c = i&0xFF;
		i >>= 8;
	}
	return str;
}

void main()
{
	char[] str = "abcd";
	writefln("%b (%d)",char_to_uint(str),char_to_uint(str));
	writefln("%b (%d)",char_to_uint_a(str),char_to_uint_a(str));
	writefln(uint_to_char(char_to_uint(str)));
	writefln(uint_to_char_a(char_to_uint(str)));
}

Also, have you considered the 'endian' issues of storing data in a binary format?  See:
http://en.wikipedia.org/wiki/Endian

Specifically, suppose you save a file on a little-endian machine (e.g. your typical x86 Pentium/AMD), transfer it to a big-endian machine (e.g. a Solaris SPARC server or PowerPC) and load it there. If you plan to do that, you will need to decide on an endian format to save to and load from, and therefore perform conversion on one of the systems (and not the other).

To perform conversion you would simply modify the _a functions above to process the characters in reverse order.
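
A sketch of that idea in present-day D (D2), assuming big-endian is the byte order chosen for the file. The names packBigEndian and unpackBigEndian are hypothetical. Because the bytes are combined with shifts rather than a pointer cast, the result should not depend on the host's byte order:

```d
// Pack four chars into a uint with the first char in the most
// significant byte (big-endian order), whatever the host's byte order.
uint packBigEndian(const(char)[] str)
{
    assert(str.length == 4);
    uint value = 0;
    foreach (c; str)
        value = (value << 8) | c;
    return value;
}

// Inverse of packBigEndian: the most significant byte becomes str[0].
char[] unpackBigEndian(uint value)
{
    auto str = new char[4];
    foreach_reverse (ref c; str)
    {
        c = cast(char)(value & 0xFF);
        value >>= 8;
    }
    return str;
}

unittest
{
    assert(packBigEndian("abcd") == 0x61626364);
    assert(unpackBigEndian(0x61626364) == "abcd");
}
```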

Regan
July 24, 2006
> P.S.:  Is there a standard library routine for converting
> between strings of length 4 and uint-s?  If so I wasn't able
> to find it.  If not, I wasn't able to determine that it
> didn't exist.  (That would have made writing the sig more
> efficient.)

union {
    uint asUInt;
    char[4] asChars;
}

or something to that effect :)
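
Spelled out as a sketch in present-day D (D2), with a hypothetical name FourCC given to the union. Both members share the same four bytes, so the numeric value seen through asUInt depends on the host's byte order:

```d
// Hypothetical named version of the union: both members overlay
// the same four bytes of storage.
union FourCC
{
    uint asUInt;
    char[4] asChars;
}

unittest
{
    FourCC sig;
    sig.asChars = "nvec";        // store the four signature characters

    FourCC back;
    back.asUInt = sig.asUInt;    // carry them around as a single uint
    assert(back.asChars == "nvec");
}
```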