July 05

On Friday, 5 July 2024 at 19:56:49 UTC, Tim wrote:
> On Friday, 5 July 2024 at 19:35:10 UTC, Steven Schveighoffer wrote:
>>
>> What if you need > 32 bits or want to pack into a `ulong`? Is the behavior sane across compilers?
>
> The following struct has a different layout for different platforms:
>
> ...

Thanks for this.

I also tested the following, and found it too shows discrepancies.

struct S {
    unsigned short x;
    unsigned int a : 12;
    unsigned int b : 12;
    unsigned int c : 8;
};

Here there are only uint bitfields, yet the compiler chooses to lay out the bits differently based on the preceding field.
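One way to see the difference on a given compiler is a small probe. The sketch below is not from the original test: `first_set_bit` and `bit_offset_of_a` are hypothetical helper names, and it assumes 8-bit bytes, a little-endian target, and that padding bits stay zero after the `memset` (true in practice on mainstream compilers, although the C standard leaves padding unspecified).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

static_assert(CHAR_BIT == 8, "probe assumes 8-bit bytes");

struct S {
    unsigned short x;
    unsigned int a : 12;
    unsigned int b : 12;
    unsigned int c : 8;
};

/* Hypothetical helper: index of the lowest set bit in the object's
   stored bytes, or -1 if every byte is zero. */
static int first_set_bit(const void *obj, size_t size)
{
    const unsigned char *p = obj;
    for (size_t i = 0; i < size; i++)
        for (int bit = 0; bit < 8; bit++)
            if (p[i] & (1u << bit))
                return (int)(i * 8 + bit);
    return -1;
}

/* Hypothetical helper: where bitfield `a` starts, in bits counted
   from the beginning of the struct. */
static int bit_offset_of_a(void)
{
    struct S s;
    memset(&s, 0, sizeof s);
    s.a = 0xFFF;               /* set all 12 bits of `a` */
    return first_set_bit(&s, sizeof s);
}
```

Under those assumptions, `bit_offset_of_a()` would be 16 if the compiler packs `a` directly behind `x`, and 32 if it starts `a` at the next `unsigned int` boundary.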

Walter, I have to unfortunately withdraw my support for defining D bitfields to just be the same as C bitfields -- the minefields are too subtle. The statement that "If you use uint as the field type, you'll get the same layout across every C compiler" is not true. And I don't think we can really specify the true nature of what you must do for portable bitfields in a way that is straightforward. Saying something like "you can only use uint bitfields in structs that contain only uint types" is not a good feature.

I'm back to requesting that we have a mechanism to request C bitfields (such as marking a struct as extern(C)), or picking one C style and going with that.

-Steve

July 06
On 06/07/2024 9:12 AM, Steven Schveighoffer wrote:
> On Friday, 5 July 2024 at 19:56:49 UTC, Tim wrote:
>> On Friday, 5 July 2024 at 19:35:10 UTC, Steven Schveighoffer wrote:
>>>
>>> What if you need > 32 bits or want to pack into a `ulong`? Is the behavior sane across compilers?
>>
>> The following struct has a different layout for different platforms:
> 
> ...
> 
> Thanks for this.
> 
> I also tested the following, and found it too shows discrepancies.
> 
> ```c
> struct S {
>      unsigned short x;
>      unsigned int a : 12;
>      unsigned int b : 12;
>      unsigned int c : 8;
> };
> ```
> 
> Here there are only `uint` bitfields, yet the compiler chooses to lay out the bits differently based on the *preceding* field.
> 
> Walter, I have to unfortunately withdraw my support for defining D bitfields to just be the same as C bitfields -- the minefields are too subtle. The statement that "If you use uint as the field type, you'll get the same layout across every C compiler" is not true. And I don't think we can really specify the true nature of what you must do for portable bitfields in a way that is straightforward. Saying something like "you can only use `uint` bitfields in structs that contain only `uint` types" is not a good feature.
> 
> I'm back to requesting that we have a mechanism to request C bitfields (such as marking a struct as `extern(C)`), or picking one C style and going with that.
> 
> -Steve

I did not expect this.

This prevents my mitigation from working.

So now we also have to put it into an anonymous struct to even get the layout we think it should be.

```c
struct Foo {
     unsigned short x;

     struct {
        unsigned int a : 12;
        unsigned int b : 12;
        unsigned int c : 8;
     };

     //void* next;
};

int main() {
    struct Foo foo;
    foo.a = 1;
    foo.b = 0;

    return 0;
}
```

```asm
main:
 push   rbp
 mov    rbp,rsp
 mov    DWORD PTR [rbp-0x4],0x0
 mov    eax,DWORD PTR [rbp-0x8]
 and    eax,0xfffff000
 or     eax,0x1
 mov    DWORD PTR [rbp-0x8],eax
 mov    eax,DWORD PTR [rbp-0x8]
 and    eax,0xff000fff
 or     eax,0x0
 mov    DWORD PTR [rbp-0x8],eax
 xor    eax,eax
 pop    rbp
 ret
```
July 05
On 7/5/2024 10:02 AM, Timon Gehr wrote:
> On 7/5/24 18:35, Walter Bright wrote:
>>
>> Consider also that the C standard does not specify the size of a 'char'.
> 
> D does specify it.

Yes. And I have no concern at all about some C compiler that uses a different size. None of those C compilers will compile "portable" C code, either, even though the Standard permits such compilers.

If we go through a dimensional warp into an alternate universe, where C chars are 9 bits, we'll change the D compiler to match.
July 05
On 7/5/2024 12:35 PM, Steven Schveighoffer wrote:
> What if you need > 32 bits or want to pack into a `ulong`? Is the behavior sane across compilers?

Yes. The trouble happens when you mix different field types. There are also differences when declaring "packed" bit fields - a C extension that ImportC does not implement.

You can see which cases are different in:

ImportC:

https://github.com/dlang/dmd/blob/master/compiler/test/runnable/bitfieldsms.c
https://github.com/dlang/dmd/blob/master/compiler/test/runnable/bitfieldsposix32.c
https://github.com/dlang/dmd/blob/master/compiler/test/runnable/bitfieldsposix64.c

D:

https://github.com/dlang/dmd/blob/master/compiler/test/runnable/dbitfieldsms.c
https://github.com/dlang/dmd/blob/master/compiler/test/runnable/dbitfieldsposix32.c
https://github.com/dlang/dmd/blob/master/compiler/test/runnable/dbitfieldsposix64.c
July 05
On 7/5/2024 2:12 PM, Steven Schveighoffer wrote:
> I also tested the following, and found it too shows discrepancies.
> 
> ```c
> struct S {
>      unsigned short x;
>      unsigned int a : 12;
>      unsigned int b : 12;
>      unsigned int c : 8;
> };
> ```

The following will also show discrepancies:

```c
struct T {
    unsigned short x;
    unsigned int y;
};
```

for the same reason.

> Here there are only uint bitfields, yet the compiler chooses to lay out the bits differently based on the preceding field.

It's actually based on the *alignment* of the preceding field. I regret not saying that, but that's what I meant by saying the fields need to be of the same type, so they have the same alignment. If the uint bitfield started off aligned at a uint boundary, my statement holds.

When mixing field types of different sizes, there will be different alignments of those fields on different platforms/compilers, whether or not bitfields are involved.

The layout can be portably controlled as desired, by being cognizant of field alignment:

```c
struct S {
     unsigned short x;
     unsigned short a : 12;  // at offset 2
     unsigned int   b : 12;  // at offset 4
     unsigned int   c : 8;   // at offset 4
};
```

```c
struct S {
     unsigned short x;
     unsigned short dummy;   // for alignment porpoises
     unsigned int a : 12;    // at offset 4
     unsigned int b : 12;    // at offset 4
     unsigned int c : 8;     // at offset 4
};
```

Simply put, avoiding fields that straddle alignment boundaries avoids portability issues. This is true with both bitfields and regular fields.
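A hedged way to confirm the second layout on a given compiler is to overlay the struct with two 32-bit words. This is only a sketch: `SPadded`, `View`, and `second_word` are hypothetical names, and the expected values assume a little-endian target that allocates bitfields starting from the least significant bit, as in the x86 listing earlier in the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct SPadded {
    unsigned short x;
    unsigned short dummy;   /* explicit padding up to offset 4 */
    unsigned int a : 12;
    unsigned int b : 12;
    unsigned int c : 8;
};

/* The three bitfields should exactly fill one 32-bit unit after x/dummy. */
static_assert(sizeof(struct SPadded) == 2 * sizeof(uint32_t),
              "unexpected padding in struct SPadded");

union View {
    struct SPadded s;
    uint32_t words[2];
};

/* Hypothetical helper: store the given values and return the 32-bit word
   at byte offset 4, where all three bitfields are expected to live. */
static uint32_t second_word(unsigned a, unsigned b, unsigned c)
{
    union View v;
    memset(&v, 0, sizeof v);
    v.s.a = a;
    v.s.b = b;
    v.s.c = c;
    return v.words[1];
}
```

On such a target, `second_word(0xFFF, 0, 0)` would be expected to yield `0x00000FFF`, `second_word(0, 0xFFF, 0)` to yield `0x00FFF000`, and `second_word(0, 0, 0xFF)` to yield `0xFF000000`.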
July 05
On 7/5/2024 2:25 PM, Richard (Rikki) Andrew Cattermole wrote:
> So now we also have to put it into an anonymous struct

See my other reply.

July 06

On Saturday, 6 July 2024 at 00:16:23 UTC, Walter Bright wrote:
> On 7/5/2024 2:12 PM, Steven Schveighoffer wrote:
>> I also tested the following, and found it too shows discrepancies.
>>
>> struct S {
>>      unsigned short x;
>>      unsigned int a : 12;
>>      unsigned int b : 12;
>>      unsigned int c : 8;
>> };
>
> The following will also show discrepancies:
>
> struct T {
>     unsigned short x;
>     unsigned int y;
> };
>
> for the same reason.

I tested this struct, and there were no discrepancies between compilers. All compilers put 2 bytes of padding between the ushort and the uint.
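For reference, a check along these lines can be compiled with each compiler under test. It is a sketch only: `padding_before_y` is a hypothetical helper, and the result depends on how the target aligns `unsigned int`.

```c
#include <assert.h>   /* for the assert-based usage shown below */
#include <stddef.h>

struct T {
    unsigned short x;
    unsigned int y;
};

/* Hypothetical helper: number of padding bytes inserted between x and y. */
static size_t padding_before_y(void)
{
    return offsetof(struct T, y) - sizeof(unsigned short);
}
```

A caller could then write `assert(padding_before_y() == 2);` to record the expectation for targets with 2-byte shorts and 4-byte-aligned ints.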

> It's actually based on the alignment of the preceding field. I regret not saying that, but that's what I meant by saying the fields need to be of the same type, so they have the same alignment. If the uint bitfield started off aligned at a uint boundary, my statement holds.

Hm..., well it's not ideal to require the user to nudge the compiler for the desired layout. It's an odd thing to say that a uint bitfield may not be uint aligned, even if the equivalent uint value would be.

The documentation note we talked about was simple -- just always use the same type for your bitfields and it works. This is different. Not impossible to learn, but for sure more challenging.

> When mixing field types of different sizes, there will be different alignments of those fields on different platforms/compilers, whether or not bitfields are involved.

The confusing thing here is that the alignment does not obey the alignment of the containing type. And how it is aligned depends instead on the previous member (sometimes). This is not the case for full-sized uints.

I will note that, from what I'm reading, ulong is aligned to 4 bytes on 32-bit Linux, so this does make an alignment difference even for non-bitfields.

My recommendation still is either:

  1. Denote D bitfields by a specified layout system (pick the most common C one and do that). C bitfields can match the C compiler.
  2. Simply forbid problematic alignments at compile time:
struct S {
   uint x;
   ulong a : 24;
   ulong b : 24;
   ulong c : 16;
}

// error, alignment of bitfield `a` may not match C layout, please use padding or aligned bitfields to specify intended layout.

// these are OK.
struct SWithPadding {
   uint x;
   uint _; // padding
   ulong a : 24;
   ulong b : 24;
   ulong c : 16;
}

struct SPacked {
   ulong x : 32;
   ulong a : 24;
   ulong b : 24;
   ulong c : 16;
}

Maybe the error only occurs if you specify a compiler switch?
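Something close to option 2 can already be approximated by hand in C11 with `static_assert`. The sketch below is an illustration, not the proposed compiler check: `Prefix`, `SChecked`, and `BITFIELD_UNIT` are hypothetical names, and 64-bit bitfield types are an implementation-defined extension that mainstream compilers accept.

```c
#include <assert.h>
#include <stdint.h>

/* Fields that precede the bitfields, grouped so their total size can be checked. */
struct Prefix {
    uint32_t x;
    uint32_t pad_;   /* explicit padding; remove it and the assert below fires */
};

/* Storage unit used by the bitfields that follow. */
#define BITFIELD_UNIT uint64_t

/* Refuse to compile if the bitfields would not start on a unit boundary. */
static_assert(sizeof(struct Prefix) % sizeof(BITFIELD_UNIT) == 0,
              "bitfields would start at a misaligned offset; add padding");

struct SChecked {
    struct Prefix p;
    BITFIELD_UNIT a : 24;
    BITFIELD_UNIT b : 24;
    BITFIELD_UNIT c : 16;
};
```

The cost of doing this manually is that the leading fields have to be wrapped in their own struct, which is part of why a built-in diagnostic would be more convenient.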

-Steve

July 06
On 7/6/24 01:23, Walter Bright wrote:
> On 7/5/2024 10:02 AM, Timon Gehr wrote:
>> On 7/5/24 18:35, Walter Bright wrote:
>>>
>>> Consider also that the C standard does not specify the size of a 'char'.
>>
>> D does specify it.
> 
> Yes. And I have no concern at all about some C compiler that uses a different size. None of those C compilers will compile "portable" C code, either, even though the Standard permits such compilers.
> 
> If we go through a dimensional warp into an alternate universe, where C chars are 9 bits, we'll change the D compiler to match.

The point was: D should actually specify more bitfield layout guarantees than the C standard.
July 06
On 7/6/2024 8:54 AM, Timon Gehr wrote:
> The point was: D should actually specify more bitfield layout guarantees than the C standard.

I understand that. Given that any desired portable bitfield layout can be done with minimal effort, there is no need to add more semantics to the language than what C does.

I.e. portable not only to the associated C compiler, but to any C compiler with 8 bit chars and 32 bit ints.

Throw me an example that shows me wrong!

Personally, I would find this to be much more readable code than adding more syntactical constructs.
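Those two assumptions can be stated in the code itself. A minimal sketch, assuming a C11 compiler; `Flags` is a hypothetical struct used only for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Make the portability assumptions explicit at compile time. */
static_assert(CHAR_BIT == 8, "layout assumes 8-bit chars");
static_assert(sizeof(unsigned int) * CHAR_BIT == 32,
              "layout assumes 32-bit ints");

/* All bitfields share one type and together fill exactly one unsigned int. */
struct Flags {
    unsigned int a : 12;
    unsigned int b : 12;
    unsigned int c : 8;
};

static_assert(sizeof(struct Flags) == sizeof(unsigned int),
              "bitfields did not pack into a single unsigned int");
```

A compiler that violates any of the assumptions rejects the file instead of silently producing a different layout.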
July 06
On 7/5/2024 8:23 PM, Steven Schveighoffer wrote:
> On Saturday, 6 July 2024 at 00:16:23 UTC, Walter Bright wrote:
>> The following will also show discrepancies:
>>
>> ```
>> struct T {
>>     unsigned short x;
>>     unsigned int y;
>> };
>> ```
>>
>> for the same reason.
> 
> I tested this struct, and there were no discrepancies between compilers. All compilers put 2 bytes of padding between the `ushort` and the `uint`.

Try it with a 16 bit compiler, which aligns on 16 bits rather than 32 bits.

No, I'm not cheating with this - I wanted to point out the consistency between 32 bit compilers, despite the Standard saying nothing about it. But I can still break the example, with a 32/64 bit compiler:

```c
struct U {
    unsigned int x;
    unsigned long y;
};
```

You'll get different sizes for 32 vs 64 bit compilations, including with D.
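The difference comes from `unsigned long`, whose size follows the target's data model. A small sketch that can be built once as a 32-bit and once as a 64-bit program to compare; the helper names `size_of_u` and `offset_of_y` are hypothetical.

```c
#include <assert.h>   /* for assert-based checks by the caller */
#include <stddef.h>

struct U {
    unsigned int x;
    unsigned long y;
};

/* Hypothetical helper: struct size for the current target. */
static size_t size_of_u(void)
{
    return sizeof(struct U);
}

/* Hypothetical helper: offset of y, which includes any padding after x. */
static size_t offset_of_y(void)
{
    return offsetof(struct U, y);
}
```

On a typical ILP32 target both members are 4 bytes, so the struct is 8 bytes with `y` at offset 4. On an LP64 target `unsigned long` is 8 bytes, so `y` moves to offset 8 and the struct grows to 16 bytes.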