December 21, 2022
[Issue 23573] std.bitmanip.bitfields doesn't respect native endianness

https://issues.dlang.org/show_bug.cgi?id=23573

Iain Buclaw <ibuclaw@gdcproject.org> changed:

           What      |Removed |Added
           ----------------------------------------------------
           Priority  |P1      |P3
           CC        |        |ibuclaw@gdcproject.org
December 21, 2022
[Issue 23573] std.bitmanip.bitfields doesn't respect native endianness

https://issues.dlang.org/show_bug.cgi?id=23573

johanengelen@weka.io changed:

           What |Removed |Added
           ----------------------------------------------------
           CC   |        |johanengelen@weka.io

--- Comment #1 from johanengelen@weka.io ---
I thought that `std.bitmanip.bitfields` does not give any guarantees on
layout, just that it tries to pack the bits into minimal storage space. For
example, there are also no guarantees when sending a std.bitmanip.bitfields
value as ubyte[] over a network connection to a PC with a different OS/arch.

The absence of guarantees (like the absence of guarantees on class layout)
enables reordering of fields when that is beneficial, e.g. to prevent a
multibyte field from straddling a byte/word boundary:

    mixin(bitfields!(
        bool,  "flag1", 1,
        ubyte, "flag2", 8,  // spans 2 bytes or one byte?
        bool,  "sign",  1));

Because D bitfields are meant to mimic C bitfields, I think it will be very
hard to guarantee that the layout of std.bitmanip.bitfields is the same...
December 21, 2022
[Issue 23573] std.bitmanip.bitfields doesn't respect native endianness

https://issues.dlang.org/show_bug.cgi?id=23573

elpenguino+D@gmail.com changed:

           What |Removed |Added
           ----------------------------------------------------
           CC   |        |elpenguino+D@gmail.com

--- Comment #2 from elpenguino+D@gmail.com ---
(In reply to johanengelen from comment #1)
> I thought that `std.bitmanip.bitfields` does not give any guarantees on
> layout, just that it tries to pack bits in minimal storage space.
> For example also no guarantees when sending std.bitmanip.bitfields as
> ubyte[] over network connection to a PC with different OS/arch.
> Removal of guarantees (like no guarantees of class layout) enables
> reordering of fields if it is beneficial to prevent accessing multiple
> bytes/words for multibyte fields:
>     mixin(bitfields!(
>         bool,  "flag1", 1,
>         ubyte, "flag2", 8,  // spans 2 bytes or one byte?
>         bool,  "sign",  1));
>
> Because D bitfields are meant to mimic C bitfields, I think it will be
> very hard to guarantee that the layout of std.bitmanip.bitfields is the
> same...

D doesn't need two undefined flavours of bitfields. Some people find it
useful to be able to read bitfields from files or the network regardless of
cpu architecture.
December 21, 2022
[Issue 23573] std.bitmanip.bitfields doesn't respect native endianness

https://issues.dlang.org/show_bug.cgi?id=23573

--- Comment #3 from johanengelen@weka.io ---
(In reply to elpenguino+D from comment #2)
> D doesn't need two undefined flavours of bitfields. Some people find it
> useful to be able to read bitfields from files or the network regardless
> of cpu architecture.

OK, fair point. But then the bug report should be rewritten without
comparing the behavior to C bitfields, and should instead specify exactly
the desired bit layout in bytes. It's not clear to me what is meant,
because (I think) the OP says that the order of bits _within_ a byte is
reversed depending on endianness, which does not give the
architecture-independent behavior that you want.
December 22, 2022
[Issue 23573] std.bitmanip.bitfields doesn't respect native endianness

https://issues.dlang.org/show_bug.cgi?id=23573

--- Comment #4 from Iain Buclaw <ibuclaw@gdcproject.org> ---
(In reply to johanengelen from comment #3)
> (In reply to elpenguino+D from comment #2)
> > D doesn't need two undefined flavours of bitfields. Some people find
> > it useful to be able to read bitfields from files or the network
> > regardless of cpu architecture.
>
> OK, fair point.
> But then the bug report should be rewritten without comparing the
> behavior to C bitfields, and should instead specify exactly the desired
> bit layout in bytes.

Well, the bug is that it *nearly* ended up in Phobos without checking
whether it actually worked.

https://github.com/dlang/phobos/pull/8478/files#diff-0f4bb61407d729d1f6d76e25e15eb84b89ceae014ccc887149c53fb2290a29ccR377-R392

std.numeric also uses bitfields to extract the sign, exponent, and
significand out of a float or double:

    mixin(bitfields!(
        T_sig, "significand", precision,
        T_exp, "exponent",    exponentWidth,
        bool,  "sign",        flags & flags.signed));

It is not obvious to the observer that the mixin will generate code to
match the layout for both big and little endian (depending on which is in
effect).
December 22, 2022
[Issue 23573] std.bitmanip.bitfields doesn't respect native endianness

https://issues.dlang.org/show_bug.cgi?id=23573

--- Comment #5 from Iain Buclaw <ibuclaw@gdcproject.org> ---
(In reply to Iain Buclaw from comment #4)
> It is not obvious to the observer that the mixin will generate code to
> match the layout for both big and little endian (depending on which is
> in effect).

Or more specifically: it's not obvious from the documentation that
BigEndian targets actually lay out all fields in reverse.

https://dlang.org/library/std/bitmanip/bitfields.html

"""
The bits are filled in the order given by the parameters, starting with the
lowest significant bit.
"""

Strictly, the two halves of this sentence contradict each other, because on
BigEndian the lowest significant bit is the *reverse* of the "order given".

"""
Create a bitfield pack of eight bits, which fit in one ubyte. The bitfields
are allocated starting from the least significant bit, i.e. x occupies the
two least significant bits of the bitfields storage.
"""

This should be clarified to say that `x` occupies the two least significant
bits on LittleEndian only.
December 22, 2022
[Issue 23573] std.bitmanip.bitfields doesn't respect native endianness

https://issues.dlang.org/show_bug.cgi?id=23573

--- Comment #6 from johanengelen@weka.io ---
(In reply to Iain Buclaw from comment #5)
> Strictly, the two halves of this sentence contradict each other, because
> on BigEndian the lowest significant bit is the *reverse* of the "order
> given".

This depends on the interpretation of "BigEndian". I interpret it as byte
endianness, where BigEndian means: each byte is little endian (`value &
0x1` giving the value of the LSB) and bytes are packed big endian
(`array[0]` giving the most significant byte). Like wikipedia's
interpretation: https://en.wikipedia.org/wiki/Endianness#Overview

I agree that clarification in the bitfields documentation is needed on the
matter.
December 22, 2022
[Issue 23573] std.bitmanip.bitfields doesn't respect native endianness

https://issues.dlang.org/show_bug.cgi?id=23573

--- Comment #7 from elpenguino+D@gmail.com ---
I think the problem is that 'least significant bit' is undefined here,
since the term is defined according to the data, not endianness.
December 22, 2022
[Issue 23573] std.bitmanip.bitfields doesn't respect native endianness

https://issues.dlang.org/show_bug.cgi?id=23573

--- Comment #8 from johanengelen@weka.io ---
(In reply to elpenguino+D from comment #7)
> I think the problem is that 'least significant bit' is undefined here,
> since the term is defined according to the data, not endianness.

That's why I wrote this to define what LSB means:

    `value & 0x1` giving the value of the LSB
Copyright © 1999-2021 by the D Language Foundation