[Issue 23573] std.bitmanip.bitfields doesn't respect native endianness
December 21, 2022
https://issues.dlang.org/show_bug.cgi?id=23573

Iain Buclaw <ibuclaw@gdcproject.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P1                          |P3
                 CC|                            |ibuclaw@gdcproject.org

--
December 21, 2022
https://issues.dlang.org/show_bug.cgi?id=23573

johanengelen@weka.io changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |johanengelen@weka.io

--- Comment #1 from johanengelen@weka.io ---
I thought that `std.bitmanip.bitfields` does not give any guarantees on layout,
just that it tries to pack the bits into minimal storage space.
For example, there are also no guarantees when sending a std.bitmanip.bitfields
struct as ubyte[] over a network connection to a PC with a different OS/arch.
The absence of layout guarantees (as with class layout) allows fields to be
reordered when that avoids a multibyte field straddling multiple bytes/words:
    mixin(bitfields!(
        bool, "flag1" , 1,
        ubyte, "flag2", 8,  // spans 2 bytes or one byte?
        bool, "sign"  , 1));

Because D bitfields are meant to mimic C bitfields, I think it will be very hard to guarantee that the layout of std.bitmanip.bitfields is the same...
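
For illustration, here is a minimal runnable sketch of the same field list (with
an anonymous 6-bit padding field added, since bitfields requires the widths to
sum to 8, 16, 32 or 64 bits) that dumps the raw storage on a little-endian host:

    import std.bitmanip : bitfields;
    import std.stdio : writefln;

    struct Packed
    {
        mixin(bitfields!(
            bool,  "flag1", 1,
            ubyte, "flag2", 8,   // does this span one byte or two?
            bool,  "sign",  1,
            ubyte, "",      6)); // padding so the total is 16 bits
    }

    void main()
    {
        Packed p;
        p.flag2 = 0xFF;
        // Dump the storage bytes to see which bits flag2 actually occupies.
        auto raw = (cast(ubyte*)&p)[0 .. Packed.sizeof];
        writefln("%(%08b %)", raw); // prints "11111110 00000001" on x86
    }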

--
December 21, 2022
https://issues.dlang.org/show_bug.cgi?id=23573

elpenguino+D@gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |elpenguino+D@gmail.com

--- Comment #2 from elpenguino+D@gmail.com ---
(In reply to johanengelen from comment #1)
> I thought that `std.bitmanip.bitfields` does not give any guarantees on
> layout, just that it tries to pack the bits into minimal storage space.
> For example, there are also no guarantees when sending a
> std.bitmanip.bitfields struct as ubyte[] over a network connection to a PC
> with a different OS/arch.
> The absence of layout guarantees (as with class layout) allows fields to be
> reordered when that avoids a multibyte field straddling multiple bytes/words:
>     mixin(bitfields!(
>         bool, "flag1" , 1,
>         ubyte, "flag2", 8,  // spans 2 bytes or one byte?
>         bool, "sign"  , 1));
> 
> Because D bitfields are meant to mimic C bitfields, I think it will be very hard to guarantee that the layout of std.bitmanip.bitfields is the same...

D doesn't need two undefined flavours of bitfields. Some people find it useful to be able to read bitfields from files or the network regardless of cpu architecture.
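
One way to read such data regardless of CPU architecture is to fix the byte
order explicitly when decoding, rather than relying on the bitfields storage
layout. A minimal sketch, assuming a little-endian 16-bit wire word with
bit 0 = flag1, bits 1-8 = flag2 and bit 9 = sign:

    import std.bitmanip : peek;
    import std.system : Endian;

    struct Header
    {
        bool  flag1;
        ubyte flag2;
        bool  sign;
    }

    Header decode(const(ubyte)[] wire)
    {
        // Byte order is fixed by the format, not by the host architecture.
        const word = wire.peek!(ushort, Endian.littleEndian);
        Header h;
        h.flag1 = (word & 0x1) != 0;
        h.flag2 = cast(ubyte)((word >> 1) & 0xFF);
        h.sign  = ((word >> 9) & 0x1) != 0;
        return h;
    }

    unittest
    {
        // 0x01FE: flag2 == 0xFF, flag1 and sign clear.
        const(ubyte)[] wire = [0xFE, 0x01];
        auto h = decode(wire);
        assert(!h.flag1 && h.flag2 == 0xFF && !h.sign);
    }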

--
December 21, 2022
https://issues.dlang.org/show_bug.cgi?id=23573

--- Comment #3 from johanengelen@weka.io ---
(In reply to elpenguino+D from comment #2)
> 
> D doesn't need two undefined flavours of bitfields. Some people find it useful to be able to read bitfields from files or the network regardless of cpu architecture.

OK, fair point.
But then the bug report should be rewritten without comparing it to C bitfields,
and should instead specify exactly the desired bit layout in bytes. It's not
clear to me what is meant, because (I think) the OP says that the order of bits
_within_ a byte is reversed depending on endianness, which does not give the
architecture-independent behavior that you want.

--
December 22, 2022
https://issues.dlang.org/show_bug.cgi?id=23573

--- Comment #4 from Iain Buclaw <ibuclaw@gdcproject.org> ---
(In reply to johanengelen from comment #3)
> (In reply to elpenguino+D from comment #2)
> > 
> > D doesn't need two undefined flavours of bitfields. Some people find it useful to be able to read bitfields from files or the network regardless of cpu architecture.
> 
> OK, fair point.
> But then the bug report should be rewritten without comparing it to C
> bitfields, and should instead specify exactly the desired bit layout in
> bytes. It's not clear to me what is meant, because (I think) the OP says
> that the order of bits _within_ a byte is reversed depending on endianness,
> which does not give the architecture-independent behavior that you want.
Well, the bug is that it *nearly* ended up in Phobos without anyone checking whether it actually worked:

https://github.com/dlang/phobos/pull/8478/files#diff-0f4bb61407d729d1f6d76e25e15eb84b89ceae014ccc887149c53fb2290a29ccR377-R392

std.numeric also uses bitfields to extract the sign, exponent, and significand from a float or double:

    mixin(bitfields!(
        T_sig, "significand", precision,
        T_exp, "exponent"   , exponentWidth,
        bool , "sign"       , flags & flags.signed ));

It is not obvious to the observer that the mixin will generate code to match the layout for both big and little endian (depending on which is in effect).
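
For reference, a minimal sketch (independent of std.numeric) of the IEEE-754
single-precision layout such a mixin has to reproduce: bits 0-22 hold the
significand, bits 23-30 the biased exponent, and bit 31 the sign.

    // Reinterpret a float's storage as its raw 32-bit pattern.
    uint bits(float f)
    {
        return *cast(uint*)&f;
    }

    unittest
    {
        const b = bits(-1.5f);              // -1.5f is 0xBFC00000
        assert((b >> 31) == 1);             // sign bit set
        assert(((b >> 23) & 0xFF) == 0x7F); // biased exponent of 1.5 is 127
        assert((b & 0x7FFFFF) == 0x400000); // fraction bits of binary 1.1
    }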

--
December 22, 2022
https://issues.dlang.org/show_bug.cgi?id=23573

--- Comment #5 from Iain Buclaw <ibuclaw@gdcproject.org> ---
(In reply to Iain Buclaw from comment #4)
> It is not obvious to the observer that the mixin will generate code to match the layout for both big and little endian (depending on which is in effect).
Or, more specifically: it's not obvious from the documentation that BigEndian targets actually lay out all fields in reverse.

https://dlang.org/library/std/bitmanip/bitfields.html

"""
The bits are filled in the order given by the parameters, starting with the
lowest significant bit.
"""

Strictly, the two parts of this sentence contradict each other: on BigEndian, starting with the lowest significant bit means filling in the reverse of the order given.

"""
Create a bitfield pack of eight bits, which fit in one ubyte. The bitfields are
allocated starting from the least significant bit, i.e. x occupies the two
least significant bits of the bitfields storage.
"""

This should be clarified: `x` occupies the two least significant bits on LittleEndian only.
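
A small check of that documented claim, as a sketch gated to little-endian
hosts where it currently holds:

    import std.bitmanip : bitfields;

    struct S
    {
        mixin(bitfields!(
            uint, "x",    2,
            uint, "rest", 6));
    }

    unittest
    {
        S s;
        s.x = 0b11;
        const raw = *cast(const(ubyte)*)&s;
        version (LittleEndian)
            assert((raw & 0b11) == 0b11); // x sits in the two lowest bits
    }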

--
December 22, 2022
https://issues.dlang.org/show_bug.cgi?id=23573

--- Comment #6 from johanengelen@weka.io ---
(In reply to Iain Buclaw from comment #5)
> (In reply to Iain Buclaw from comment #4)
> > It is not obvious to the observer that the mixin will generate code to match the layout for both big and little endian (depending on which is in effect).
> Or, more specifically: it's not obvious from the documentation that BigEndian targets actually lay out all fields in reverse.
> 
> https://dlang.org/library/std/bitmanip/bitfields.html
> 
> """
> The bits are filled in the order given by the parameters, starting with the
> lowest significant bit.
> """
> 
> Strictly, the two parts of this sentence contradict each other: on BigEndian, starting with the lowest significant bit means filling in the reverse of the order given.

This depends on the interpretation of "BigEndian". I interpret it as byte endianness, where BigEndian means: each byte is little endian (`value & 0x1` giving the value of the LSB) and the bytes are packed big endian (`array[0]` giving the most significant byte). This is also Wikipedia's interpretation: https://en.wikipedia.org/wiki/Endianness#Overview

I agree that the bitfields documentation needs clarification on this point.
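
A tiny sketch of that byte-endianness interpretation: converting a 16-bit value
to big or little endian only reorders its bytes, while the least significant
bit is still determined by the value itself:

    import std.bitmanip : nativeToBigEndian, nativeToLittleEndian;

    unittest
    {
        ushort v = 0x01FE;
        auto be = nativeToBigEndian(v);    // ubyte[2] in big-endian order
        auto le = nativeToLittleEndian(v); // ubyte[2] in little-endian order
        assert(be[0] == 0x01 && be[1] == 0xFE);
        assert(le[0] == 0xFE && le[1] == 0x01);
        assert((v & 0x1) == 0); // the LSB comes from the value, not storage
    }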

--
December 22, 2022
https://issues.dlang.org/show_bug.cgi?id=23573

--- Comment #7 from elpenguino+D@gmail.com ---
I think the problem is that 'least significant bit' is ambiguous here, since the term is defined in terms of the data, not endianness.

--
December 22, 2022
https://issues.dlang.org/show_bug.cgi?id=23573

--- Comment #8 from johanengelen@weka.io ---
(In reply to elpenguino+D from comment #7)
> I think the problem is that 'least significant bit' is ambiguous here, since the term is defined in terms of the data, not endianness.

That's why I wrote this to define what LSB means: `value & 0x1` gives the value of the LSB.

--