What is best way to read and interpret binary files?
November 19, 2018
My question is in the subject/title. I want to parse a binary file into D structs and can't find a good way of doing it. What I'm doing now is something like this:

byte[4] fake_integer;
auto fd = File("binary.data", "r");
fd.rawRead(fake_integer);
int real_integer = *(cast(int*)  fake_integer.ptr);

Ideally I would have some kind of C-style array and just cast it to a struct, or take an existing struct and populate its fields one by one with data from the file. Is there a D way of doing this, or should I call the core.stdc.stdio functions instead?
November 19, 2018
On Mon, 19 Nov 2018 21:30:36 +0000, welkam wrote:
> My question is in the subject/title. I want to parse a binary file into D structs and can't find a good way of doing it. What I'm doing now is something like this:
> 
> byte[4] fake_integer;
> auto fd = File("binary.data", "r");
> fd.rawRead(fake_integer);
> int real_integer = *(cast(int*)  fake_integer.ptr);
> 
> Ideally I would have some kind of C-style array and just cast it to a struct, or take an existing struct and populate its fields one by one with data from the file. Is there a D way of doing this, or should I call the core.stdc.stdio functions instead?

Nothing stops you from writing:

    SomeStruct myStruct;
    fd.rawRead((cast(ubyte*)&myStruct)[0..SomeStruct.sizeof]);

Standard caveats about byte order and alignment.
November 19, 2018
On Mon, Nov 19, 2018 at 10:14:25PM +0000, Neia Neutuladh via Digitalmars-d-learn wrote:
> On Mon, 19 Nov 2018 21:30:36 +0000, welkam wrote:
> > My question is in the subject/title. I want to parse a binary file into D structs and can't find a good way of doing it. What I'm doing now is something like this:
> > 
> > byte[4] fake_integer;
> > auto fd = File("binary.data", "r");
> > fd.rawRead(fake_integer);
> > int real_integer = *(cast(int*)  fake_integer.ptr);
> > 
> > Ideally I would have some kind of C-style array and just cast it to a struct, or take an existing struct and populate its fields one by one with data from the file. Is there a D way of doing this, or should I call the core.stdc.stdio functions instead?
> 
> Nothing stops you from writing:
> 
>     SomeStruct myStruct;
>     fd.rawRead((cast(ubyte*)&myStruct)[0..SomeStruct.sizeof]);

Actually, the cast is unnecessary, because arrays implicitly convert to void[], and pointers are sliceable.  So all you need is:

	SomeStruct myStruct;
	fd.rawRead((&myStruct)[0 .. 1]);

This works for all POD types.

Writing the struct out to file is the same thing:

	SomeStruct myStruct;
	fd.rawWrite((&myStruct)[0 .. 1]);

with the nice symmetry that you just have to rename rawRead to rawWrite.

For arrays:

	SomeStruct[] arr;
	fd.rawWrite(arr);
	...

	arr.length = ... /* expected length */
	fd.rawRead(arr);

To correctly store length information, you'll have to manually write out the array length as well, and read it back before reading the array. Should be straightforward to figure out.
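
As a rough sketch of that length-prefix idea (the `Point` type, the helper names `writeArray` and `readArray`, and the choice of a 64-bit count are assumptions made for illustration, not anything the library prescribes):

```d
import std.exception : enforce;
import std.stdio : File;

// Hypothetical record type, used only for illustration.
struct Point
{
    int x;
    int y;
}

// Writes a 64-bit element count followed by the raw array contents.
void writeArray(T)(File f, const(T)[] arr)
{
    ulong[1] count;
    count[0] = arr.length;
    f.rawWrite(count[]);
    f.rawWrite(arr);
}

// Reads the element count first, then exactly that many elements.
T[] readArray(T)(File f)
{
    ulong[1] count;
    enforce(f.rawRead(count[]).length == 1, "missing length prefix");
    if (count[0] == 0)
        return null; // rawRead rejects an empty buffer, so stop here
    auto arr = new T[](cast(size_t) count[0]);
    enforce(f.rawRead(arr).length == arr.length, "truncated array data");
    return arr;
}

unittest
{
    import std.file : remove, tempDir;
    import std.path : buildPath;

    auto path = buildPath(tempDir, "points_demo.bin");
    scope (exit) remove(path);

    auto points = [Point(1, 2), Point(3, 4), Point(5, 6)];
    auto f = File(path, "wb");
    writeArray(f, points);
    f.close();

    auto g = File(path, "rb");
    auto loaded = readArray!Point(g);
    g.close();
    assert(loaded == points);
}
```

The count is written in native byte order here, so the same endianness caveat applies to it as to the data itself.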


> Standard caveats about byte order and alignment.

Alignment shouldn't be a problem, since local variables should already be properly aligned.

Endianness, however, will be a problem if you intend to transport this data to/from a different platform / hardware.  You'll need to manually fix the endianness yourself.
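
One way to do that fix-up, sketched with the conversion helpers in std.bitmanip. The function names `readU32LE` and `writeU32LE` are made up for this example, and it assumes the file format stores integers little-endian:

```d
import std.bitmanip : littleEndianToNative, nativeToLittleEndian;
import std.exception : enforce;
import std.stdio : File;

// Reads a 32-bit unsigned integer stored little-endian in the file,
// whatever the byte order of the machine running the code.
uint readU32LE(File f)
{
    ubyte[4] raw;
    enforce(f.rawRead(raw[]).length == raw.length, "unexpected end of file");
    return littleEndianToNative!uint(raw);
}

// Writes a 32-bit unsigned integer in little-endian byte order.
void writeU32LE(File f, uint value)
{
    ubyte[4] raw = nativeToLittleEndian(value);
    f.rawWrite(raw[]);
}

unittest
{
    // The serialized byte sequence is the same on every platform.
    ubyte[4] expected = [0x44, 0x33, 0x22, 0x11];
    assert(nativeToLittleEndian(0x11223344u) == expected);
    assert(littleEndianToNative!uint(expected) == 0x11223344u);
}
```

For a big-endian format, bigEndianToNative and nativeToBigEndian from the same module play the same role. With this approach each multi-byte field is converted individually instead of reading the whole struct in one rawRead.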


T

-- 
This is not a sentence.
November 20, 2018
On Mon, 19 Nov 2018 14:32:55 -0800, H. S. Teoh wrote:
>> Standard caveats about byte order and alignment.
> 
> Alignment shouldn't be a problem, since local variables should already be properly aligned.

Right, and the IO layer probably doesn't need to read to aligned memory anyway.

Struct fields, however, need to have the same relative alignment (offsets and padding) as the data in the file.
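
To illustrate the field-alignment point, here is a small sketch. The record layout (a 1-byte tag immediately followed by a 4-byte value) is invented for the example, and the offsets in the static asserts assume a typical target where uint is 4-byte aligned:

```d
// Default layout: the compiler pads so that `value` is 4-byte aligned.
struct NaturalRecord
{
    ubyte tag;
    uint value; // 3 padding bytes are inserted before this field
}

// Packed layout: matches a file that stores the 5 bytes back to back.
align(1) struct PackedRecord
{
align(1):
    ubyte tag;
    uint value; // starts right after `tag`
}

static assert(NaturalRecord.value.offsetof == 4);
static assert(NaturalRecord.sizeof == 8);
static assert(PackedRecord.value.offsetof == 1);
static assert(PackedRecord.sizeof == 5);
```

Keeping static asserts like these next to the struct makes a layout mismatch fail at compile time instead of silently reading garbage.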
November 20, 2018
On Monday, 19 November 2018 at 22:14:25 UTC, Neia Neutuladh wrote:
>
> Nothing stops you from writing:
>
>     SomeStruct myStruct;
>     fd.rawRead((cast(ubyte*)&myStruct)[0..SomeStruct.sizeof]);
>
> Standard caveats about byte order and alignment.

I would never have thought of casting a struct to a static array. If I understood correctly, you cast the myStruct pointer to a ubyte pointer and then construct a static array on the stack with tmpArray.ptr = (ubyte pointer) and tmpArray.sizeof = SomeStruct.sizeof

What I figured out when I woke up is that I never needed C-style arrays. What I can do is allocate a ubyte array large enough for the whole file, then use slices to read the data in chunks and cast them to the necessary structs.
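
A minimal sketch of that whole-file approach, assuming the file fits in memory. The helper names `loadAll` and `takeStruct` and the `Pair` record are hypothetical, and the sketch copies bytes into a struct instead of casting the slice pointer, which sidesteps misaligned access at odd offsets:

```d
import std.exception : enforce;
import std.file : read;

// Hypothetical record type, used only for illustration.
struct Pair
{
    ushort a;
    ushort b;
}

// Loads the whole file into memory; std.file.read returns void[].
ubyte[] loadAll(string path)
{
    return cast(ubyte[]) read(path);
}

// Copies T.sizeof bytes starting at `offset` into a fresh T, then advances `offset`.
T takeStruct(T)(const(ubyte)[] data, ref size_t offset)
{
    enforce(offset <= data.length && data.length - offset >= T.sizeof,
            "buffer too short");
    T result;
    (cast(ubyte*) &result)[0 .. T.sizeof] = data[offset .. offset + T.sizeof];
    offset += T.sizeof;
    return result;
}

unittest
{
    auto first = Pair(7, 9);
    auto second = Pair(1, 2);
    // Build an in-memory buffer that stands in for the file contents.
    ubyte[] data = (cast(ubyte*) &first)[0 .. Pair.sizeof]
                 ~ (cast(ubyte*) &second)[0 .. Pair.sizeof];

    size_t offset = 0;
    assert(takeStruct!Pair(data, offset) == first);
    assert(takeStruct!Pair(data, offset) == second);
    assert(offset == data.length);
}
```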

Thanks Neia Neutuladh and H. S. Teoh for giving me some pointers
https://www.explainxkcd.com/wiki/index.php/138:_Pointers
November 20, 2018
On Tuesday, 20 November 2018 at 11:54:59 UTC, welkam wrote:
> On Monday, 19 November 2018 at 22:14:25 UTC, Neia Neutuladh wrote:
>>
>> Nothing stops you from writing:
>>
>>     SomeStruct myStruct;
>>     fd.rawRead((cast(ubyte*)&myStruct)[0..SomeStruct.sizeof]);
>>
>> Standard caveats about byte order and alignment.
>
> I would never have thought of casting a struct to a static array. If I understood correctly, you cast the myStruct pointer to a ubyte pointer and then construct a static array on the stack with tmpArray.ptr = (ubyte pointer) and tmpArray.sizeof = SomeStruct.sizeof

Almost correct, except it's not a static array, it's just a slice, i.e. ubyte[].
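
A small sketch of the difference, using a plain int in place of the struct (an assumption made to keep the example short): slicing a pointer yields a ubyte[] view of the original storage, while a static array holds its own copy.

```d
unittest
{
    int value = 42;
    auto view = (cast(ubyte*) &value)[0 .. int.sizeof];

    static assert(is(typeof(view) == ubyte[])); // a slice, not ubyte[4]
    assert(view.ptr is cast(ubyte*) &value);    // refers to the original storage
    assert(view.length == int.sizeof);

    view[] = 0;          // writing through the slice changes `value`
    assert(value == 0);

    ubyte[int.sizeof] copy = view[0 .. int.sizeof]; // static array: separate copy
    copy[0] = 1;
    assert(value == 0);  // the original is unaffected by changes to the copy
}
```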
November 20, 2018
On Tuesday, 20 November 2018 at 12:01:49 UTC, Stanislav Blinov wrote:
> On Tuesday, 20 November 2018 at 11:54:59 UTC, welkam wrote:
>> On Monday, 19 November 2018 at 22:14:25 UTC, Neia Neutuladh wrote:
>>>
>>> Nothing stops you from writing:
>>>
>>>     SomeStruct myStruct;
>>>     fd.rawRead((cast(ubyte*)&myStruct)[0..SomeStruct.sizeof]);
>>>
>>> Standard caveats about byte order and alignment.
>>
>> I would never have thought of casting a struct to a static array. If I understood correctly, you cast the myStruct pointer to a ubyte pointer and then construct a static array on the stack with tmpArray.ptr = (ubyte pointer) and tmpArray.sizeof = SomeStruct.sizeof
>
> Almost correct, except it's not a static array, it's just a slice, i.e. ubyte[].

I guess it came from interoperability with C, where you want to slice C arrays? That's useful to know
March 30, 2021
On Monday, 19 November 2018 at 22:32:55 UTC, H. S. Teoh wrote:
> Actually, the cast is unnecessary, because arrays implicitly convert to void[], and pointers are sliceable.  So all you need is:
>
> 	SomeStruct myStruct;
> 	fd.rawRead((&myStruct)[0 .. 1]);
>
> This works for all POD types.
>
> Writing the struct out to file is the same thing:
>
> 	SomeStruct myStruct;
> 	fd.rawWrite((&myStruct)[0 .. 1]);


This works, but I'm just wondering why we do not just add more functions to the library:

 rawRead(ref T t), and
 rawWrite(ref T t)

to read and write a single value.


> For arrays:
>
> 	SomeStruct[] arr;
> 	fd.rawWrite(arr);
> 	...
>
> 	arr.length = ... /* expected length */
> 	fd.rawRead(arr);

Currently, the library only has these two functions, and they work on arrays.


March 29, 2021
On Tue, Mar 30, 2021 at 12:32:36AM +0000, mw via Digitalmars-d-learn wrote:
> On Monday, 19 November 2018 at 22:32:55 UTC, H. S. Teoh wrote:
> > Actually, the cast is unnecessary, because arrays implicitly convert to void[], and pointers are sliceable.  So all you need is:
> > 
> > 	SomeStruct myStruct;
> > 	fd.rawRead((&myStruct)[0 .. 1]);
> > 
> > This works for all POD types.
> > 
> > Writing the struct out to file is the same thing:
> > 
> > 	SomeStruct myStruct;
> > 	fd.rawWrite((&myStruct)[0 .. 1]);
> 
> 
> This works, but I'm just wondering why we do not just add more functions to the library:
> 
>  rawRead(ref T t), and
>  rawWrite(ref T t)
> 
> to read and write a single value.

If you wish, submit a PR for this.

It's not hard to write your own overloads for it, though:

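	// Call as rawWrite(f, value); f.rawWrite(value) would resolve to the File member.
	void rawWrite(T)(File f, ref T t) @trusted
	{
		// Forwards to the File.rawWrite member, which takes a slice.
		f.rawWrite((cast(ubyte*) &t)[0 .. T.sizeof]);
	}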

	// ditto for rawRead
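
A possible shape for the read side, as a sketch only. The name `rawReadValue`, the short-read check, and the hasIndirections constraint are additions made for this example and are not part of std.stdio:

```d
import std.exception : enforce;
import std.stdio : File;
import std.traits : hasIndirections;

// Hypothetical helper; named differently from File.rawRead to avoid confusion.
// Restricted to types without pointers or references, since raw bytes from a
// file cannot restore those meaningfully.
void rawReadValue(T)(File f, ref T t) @trusted
if (!hasIndirections!T)
{
    auto bytes = (cast(ubyte*) &t)[0 .. T.sizeof];
    // File.rawRead returns the part of the buffer that was actually filled.
    enforce(f.rawRead(bytes).length == bytes.length, "unexpected end of file");
}

unittest
{
    import std.file : remove, tempDir;
    import std.path : buildPath;

    static struct Sample { int id; double weight; } // hypothetical record

    auto path = buildPath(tempDir, "sample_demo.bin");
    scope (exit) remove(path);

    auto original = Sample(3, 1.5);
    auto f = File(path, "wb");
    f.rawWrite((&original)[0 .. 1]);
    f.close();

    Sample loaded;
    auto g = File(path, "rb");
    rawReadValue(g, loaded);
    g.close();
    assert(loaded == original);
}
```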


T

-- 
A linguistics professor was lecturing to his class one day.
"In English," he said, "A double negative forms a positive. In some
languages, though, such as Russian, a double negative is still a
negative. However, there is no language wherein a double positive can
form a negative."
A voice from the back of the room piped up, "Yeah, yeah."