A bit of binary I/O
January 20, 2007
Hi guys, I'm having great fun writing and reading binary files. It's my first time doing this and I've got a few questions in mind.
I write the same data (1 ulong and 1 string; I call them primitives) in 3 different ways and I get a different output for one of them. I create 1 file per method. If you open the created files with a hex editor you can see this.

The first way is to write primitives manually one by one:

// primitive way
ulong i = 9;
char[] s = "hello world";
myFile.writeExact(&i, i.sizeof);
myFile.writeExact(&s, s.sizeof);

Reading data:
// Is done by reading each primitive.
ulong i2; char[] s2;
myFile.readExact(&i2, i2.sizeof);
myFile.readExact(&s2, s2.sizeof);



The second way is to write a structure with all the primitives as members:

// struct way
struct t
{
	ulong i;
	char[] s;
}

t mt;
mt.i = 9;
mt.s = "hello world";
myFile.writeExact(&mt, mt.sizeof);

Reading data:
// We read the entire struct.
t mt2;
myFile.readExact(&mt2, mt2.sizeof);



And the third way is to write a class with all the primitives as members:

// class way
class tt
{
	ulong i;
	char[] s;
}

tt mtt = new tt();
mtt.i = 9;
mtt.s = "hello world";
ResFile.writeExact(&mtt, mtt.sizeof);

Reading data:
// We read the entire class.
tt mtt2;
myFile.readExact(&mtt2, mtt2.sizeof);



All of these methods work perfectly. I'm able to retrieve values from all of them. Now let's look at the outputs:

// Primitive

09 00 00 00 00 00 00 00 0B 00 00 00 A0 C7 41 00

// Structure

09 00 00 00 00 00 00 00 0B 00 00 00 A0 C7 41 00

// Class

C0 3F 91 00

My questions are:

1) What's the best method to write data (in terms of data protection/encryption against reverse engineering)? The class way seems to me, at first look, the most secure.
2) Which method is the fastest at retrieving data?
3) How the hell does this work? I mean, the string s is 11 chars long, but the first 2 methods use only 8 bytes to store it, and most of them are 0. Even more interesting, look at the class method: it uses only 4 bytes to store about 19 bytes of real data! WTF.
I'm really confused.

This is a very interesting subject to me, and if someone could clear this up for me I would appreciate it very much.

Thank you very much in advance.

Heinz

January 20, 2007
> // Primitive
> 09 00 00 00 00 00 00 00 0B 00 00 00 A0 C7 41 00

09 00 00 00 00 00 00 00 // the ulong with value 9
0B 00 00 00             // arraysize 11
A0 C7 41 00             // pointervalue to the start of data

> // Structure
> 09 00 00 00 00 00 00 00 0B 00 00 00 A0 C7 41 00

same here

> // Class
> C0 3F 91 00

These 4 bytes are the class reference itself, i.e. the address of the object, not its contents. mtt.sizeof is the size of the reference, not the size of the object itself.

s.ptr is the pointer to the array data.
&s is the address of the array struct, which holds the array length and the
pointer to the data.
To write the string itself, you might want to try this:
	myFile.writeExact( s.ptr, s.length );
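
As a hedged sketch of the same idea in current D, the example below uses std.stdio's File.rawWrite and rawRead instead of the std.stream calls used in this thread. It stores the ulong, then the string length, then the string bytes. The 8-byte length prefix, the helper names writeRecord and readRecord, and the file name are assumptions made for illustration, not a format taken from the thread.

```d
import std.file : getSize;
import std.stdio : File;

// Hypothetical helper: store the value, a length prefix, then the characters.
void writeRecord(string path, ulong i, const(char)[] s)
{
    auto f = File(path, "wb");
    ulong[1] value;
    ulong[1] count;
    value[0] = i;
    count[0] = s.length;
    f.rawWrite(value[]); // 8 bytes: the number itself
    f.rawWrite(count[]); // 8 bytes: how many characters follow (assumed format)
    f.rawWrite(s);       // the characters, not the (length, pointer) pair
}

// Hypothetical helper: read the fields back in the same order.
void readRecord(string path, out ulong i, out char[] s)
{
    auto f = File(path, "rb");
    ulong[1] value;
    ulong[1] count;
    f.rawRead(value[]);
    f.rawRead(count[]);
    i = value[0];
    s = new char[cast(size_t) count[0]];
    if (s.length)        // rawRead rejects an empty buffer
        f.rawRead(s);
}

void main()
{
    writeRecord("primitive.bin", 9, "hello world");

    ulong i2;
    char[] s2;
    readRecord("primitive.bin", i2, s2);

    assert(i2 == 9);
    assert(s2 == "hello world");
    assert(getSize("primitive.bin") == 8 + 8 + 11); // real data, no addresses
}
```

Because the file now holds the characters themselves, it can be read back in a later run or by another program. The byte order is whatever the writing machine uses, so this sketch is not portable across architectures as written.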

January 20, 2007
"Heinz" <billgates@microsoft.com> wrote in message news:eou69k$8tf$1@digitaldaemon.com...

> The first way is to write primitives manually one by one:
>
> // primitive way
> ulong i = 9;
> char[] s = "hello world";
> myFile.writeExact(&i, i.sizeof);
> myFile.writeExact(&s, s.sizeof);
>
> Reading data:
> // Is done by reading each primitive.
> ulong i2; char[] s2;
> myFile.readExact(&i2, i2.sizeof);
> myFile.readExact(&s2, s2.sizeof);

You're writing the string wrong.  All you're doing is writing the length and pointer of the array data, without actually writing the data.

The Stream class (and by extension, the File class) provides functions for writing out every basic type:

ulong i = 9;
char[] s = "hello world";
myFile.write(i);
myFile.write(s);

...
ulong i2;
char[] s2;
myFile.read(i2);
myFile.read(s2); // read into s2, the destination string

> The second way is to write a structure with all the primitives as members:
>
> // struct way
> struct t
> {
> ulong i;
> char[] s;
> }
>
> t mt;
> mt.i = 9;
> mt.s = "hello world";
> myFile.writeExact(&mt, mt.sizeof);
>
> Reading data:
> // We read the entire struct.
> t mt2;
> myFile.readExact(&mt2, mt2.sizeof);

Again, you're just writing out the array reference without writing its contents.  You have to write out each member individually.  If there were no reference types in the struct, this would work fine.
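
To illustrate the last point, here is a minimal sketch in current D (std.stdio rather than std.stream) of a struct that contains no reference types and can therefore be written and read in one piece. The struct Record, the helpers writeWhole and readWhole, and the file name are hypothetical. The on-disk layout depends on the platform's byte order and struct alignment, so treat it as an illustration rather than a portable file format.

```d
import std.exception : enforce;
import std.stdio : File;

// No pointers, slices or class references: every byte of the value is data.
struct Record
{
    ulong i;
    char[16] name; // fixed-size array, stored inline in the struct
}

// Hypothetical helper: dump one Record as raw bytes.
void writeWhole(string path, const Record r)
{
    auto f = File(path, "wb");
    Record[1] buf;
    buf[0] = r;
    f.rawWrite(buf[]);
}

// Hypothetical helper: read one Record back.
Record readWhole(string path)
{
    auto f = File(path, "rb");
    Record[1] buf;
    auto got = f.rawRead(buf[]);
    enforce(got.length == 1, "file is too short for one Record");
    return buf[0];
}

void main()
{
    Record r;
    r.i = 9;
    r.name[] = '\0';                 // char defaults to 0xFF in D, so clear it
    r.name[0 .. 11] = "hello world";

    writeWhole("record.bin", r);
    auto back = readWhole("record.bin");

    assert(back.i == 9);
    assert(back.name[0 .. 11] == "hello world");
}
```

Swapping the fixed-size array for a char[] slice would bring back the original problem, because the struct would then contain a length and a pointer instead of the characters.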

>
> And the third way is to write a class with all the primitives as members:
>
> // class way
> class tt
> {
> ulong i;
> char[] s;
> }
>
> tt mtt = new tt();
> mtt.i = 9;
> mtt.s = "hello world";
> ResFile.writeExact(&mtt, mtt.sizeof);
>
> Reading data:
> // We read the entire class.
> tt mtt2;
> myFile.readExact(&mtt2, mtt2.sizeof);
>

This is incorrect, and it only appears to work because of how you've written your program.  You're not writing the data out at all; you're writing a class reference.  The 00913FC0 is just the memory address of the class instance that mtt points to, and when you read that address back in, you're just looking at the data that is still in memory.  This program wouldn't work if you wrote the file, exited, and then had another program read the data.  You'd most likely end up with a memory access violation, because none of the data in the class is actually written out.
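
The difference between the reference and the object can be checked directly. The sketch below, written for current D, compares the size of a tt reference with the size of a tt instance; the names referenceSize and instanceSize are made up for the example, and the exact instance size depends on the platform.

```d
import std.stdio : writeln;

class tt
{
    ulong i;
    char[] s;
}

// Size of a variable of type tt: just the reference (one pointer).
enum referenceSize = tt.sizeof;
// Size of the object that the reference points to.
enum instanceSize = __traits(classInstanceSize, tt);

void main()
{
    static assert(referenceSize == (void*).sizeof);
    static assert(instanceSize > referenceSize);
    writeln("reference: ", referenceSize, " bytes, instance: ", instanceSize, " bytes");
}
```

Writing mtt.sizeof bytes starting at &mtt therefore copies only the pointer-sized reference, which matches the 4 bytes seen in the hex dump on a 32-bit system.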

If you want to write a class out to a file, a common way is to have some kind of generic "serialize" and "unserialize" functions for the class:

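class C
{
    ulong i;
    char[] s;

    // The parameter is named "stream" so that it does not shadow the member s.
    void serialize(Stream stream)
    {
        stream.write(i);
        stream.write(s);
    }

    static C unserialize(Stream stream)
    {
        C c = new C();
        stream.read(c.i);
        stream.read(c.s);
        return c;
    }
}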

...
C c = new C();
c.i = 5;
c.s = "foo";
c.serialize(myFile);

...

C c = C.unserialize(myFile);

> My questions are:
>
> 1) What's the best method to write data (in terms of data protection/encryption against reverse engineering)? The class way seems to me, at first look, the most secure.

As explained before, the class method is wrong, and there is no encryption going on here.  It's just a memory address, and you should never, ever write memory addresses to a file.

That being said, the best way is probably to just use the primitive .read and .write methods of File.  Just never, ever write pointers or references of any kind to a file.

> 2) Which method is the fastest at retrieving data?

If you implement them correctly, all three sample programs should make the exact same output file using the same number of writes (and read it in the same number of reads), and so they are all the same in terms of performance.


January 20, 2007
Frank Benoit (keinfarbton) Wrote:

> s.ptr is the pointer to the array data.
> &s is the address of the struct, that holds the array length and the
> pointer to the data.
> To write the string, you might want to try this:
> 	myFile.writeExact( s.ptr, s.length );
> 

I get it, but if I'm actually writing the address of my data and not the data itself, then why am I able to retrieve the data even though it's not there?
January 20, 2007
> I get it, but if I'm actually writing the address of my data and not the data itself, then why am I able to retrieve the data even though it's not there?

Hehe, this works because the string is still in memory. When you read the pointer back from the file, you overwrite the ptr of the other array with it, so now s2 points to the data of s.

If you do the read in a second program run, it will probably not work.
January 20, 2007
Heinz Wrote:

> I get it, but if I'm actually writing the address of my data and not the data itself, then why am I able to retrieve the data even though it's not there?

I think I'm getting it: the values retrieved are addresses of the start of the data in my RAM, so if I take this file to another computer the data read back should be different. Am I right?

To solve this and write the real data you suggest using .ptr. Is this property available on every object?

I'm sorry to bother you so much, Frank, but I'm interested in your opinion about the other 2 questions.

Really, thanks man, you rule.
January 21, 2007
Jarrett Billingsley Wrote:

> If you implement them correctly, all three sample programs should make the exact same output file using the same number of writes (and read it in the same number of reads), and so they are all the same in terms of performance.
> 
> 

Wow, that covers it all, thanks for your reply.

But can I still write an entire structure with writeExact()? Or do you suggest writing each member of the structure with write()?

Another question: does writing a char[] with write() store the string as ASCII? If so, it is a legible string; how can I protect that data?

Thanks man
January 21, 2007
Heinz wrote:
> Another question: does writing a char[] with write() store the string as ASCII? If so, it is a legible string; how can I protect that data?
>
> Thanks man

You have to use some form of encryption.  XOR encryption is one of the simplest, although not the most secure.  Here's a C tutorial about it: http://www.cprogramming.com/tutorial/xor.html
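
For illustration only, a repeating-key XOR could look roughly like this in current D. The function name xorWithKey and the key "k3y" are made up for the example, and, as noted above, this only obscures the bytes; it is not secure encryption.

```d
// Toggle every byte with a repeating key; applying it twice restores the input.
ubyte[] xorWithKey(const(ubyte)[] data, const(ubyte)[] key)
{
    assert(key.length > 0, "key must not be empty");
    auto result = new ubyte[data.length];
    foreach (idx, b; data)
        result[idx] = cast(ubyte)(b ^ key[idx % key.length]);
    return result;
}

void main()
{
    auto key = cast(const(ubyte)[]) "k3y";           // hypothetical key
    auto plain = cast(const(ubyte)[]) "hello world";

    auto enc = xorWithKey(plain, key); // no longer legible in a hex editor
    auto dec = xorWithKey(enc, key);   // same call undoes it

    assert(enc != plain);
    assert(dec == plain);
}
```

The scrambled bytes would be written to the file in place of the plain string and passed through the same function again after reading.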

Maybe there's already an encryption library in D?

-Joel
January 21, 2007
Heinz wrote:
> But can I still write an entire structure with writeExact()? Or do you suggest writing each member of the structure with write()?
>
> Another question: does writing a char[] with write() store the string as ASCII? If so, it is a legible string; how can I protect that data?

Well, technically it will write it as UTF-8, which is as near to ASCII as makes no nevermind.  If you don't want it readable (and this is a binary file anyway), you could just use some simple reversible encryption algorithm.  Something like this, for a silly example:

<code>
module silly;

import tango .io .Stdout ;

struct SillyCrypt {

  alias process opCall ;

  static const CHUNK_SIZE = 32_U ;
  static const ROT        = 16_U ;
  static const XOR        = 24_U ;

  static char[] process (char[] src) {
    char[] result ;

    foreach (ch; chunks(src)) {
      result ~= mutate(ch);
    }
    return result;
  }

  private static char[][] chunks (char[] x) {
    char[]   source = x ;
    char[][] result     ;

    while (source.length >= CHUNK_SIZE) {
      result ~= source[0          .. CHUNK_SIZE] ;
      source  = source[CHUNK_SIZE .. $         ] ;
    }
    if (source.length) {
      result ~= source;
    }
    return result;
  }

  private static char[] mutate (char[] x) {
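    char[] result ;

    // Rotate only full chunks: swapping the two halves of a CHUNK_SIZE chunk
    // undoes itself, which is not true for a shorter trailing chunk.
    if (x.length == CHUNK_SIZE) {
      result = x[ROT .. $] ~ x[0 .. ROT];
    }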
    else {
      result = x.dup;
    }
    foreach (inout c; result) {
      c ^= XOR;
    }
    return result;
  }

}

const SOURCE = "I would say hello to you, but you couldn't read it even if I did."c ;

void main () {
  auto enc = SillyCrypt(SOURCE) ;
  auto dec = SillyCrypt(enc   ) ;

  Stdout
    ("Source  -> "c)(SOURCE).newline()
    ("Encrypt -> "c)(enc   ).newline()
    ("Decrypt -> "c)(dec   ).newline()
    .flush
  ;
}
</code>

The output when I tried it was this:
Source  -> I would say hello to you, but you couldn't read it even if I did.
Encrypt -> w8lw8awm48zml8awQ8owmt|8kya8p}ttql8}n}v8q~8Q8|q|m8{wmt|v?l8j}y|86
Decrypt -> I would say hello to you, but you couldn't read it even if I did.

I know I don't personally know anyone who can read "w8lw8awm48zml8awQ8owmt|8kya8p}ttql8}n}v8q~8Q8|q|m8{wmt|v?l8j}y|86" at all.  :)

-- Chris Nicholson-Sauls
January 21, 2007
In C++ you can write an entire structure to a binary file:

http://www.gamedev.net/reference/articles/article1127.asp http://www.codersource.net/cpp_file_io_binary.html

Can you do the same in D?