July 08
On 7/7/2024 8:49 AM, Steven Schveighoffer wrote:
> Simple, no. Predictable, yes (it's unambiguous). And not obvious.

It is trivially obvious to the most casual observer!

Joking aside, it's the same technique used to inure a struct layout against member alignment issues.


> What I want is for the compiler to *require* you to do this to avoid inconsistencies. It is going to be a mystery to anyone reading it *why* they put these things in there.

I've seen fields named "pad" or "padding" many times in C code. It's normal practice. Failing that, the purpose of comments is to add the 'why'. One could also use `static assert` for extra insurance.

I've also seen fields named "reserved". No comment needed.
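
As a minimal sketch of the padding-plus-`static assert` practice, using ordinary fields rather than bitfields (the struct `Header` and its field names are hypothetical, chosen only for illustration):

```d
// Hypothetical record layout; names are for illustration only.
struct Header
{
    ubyte tag;        // offset 0
    ubyte[3] pad;     // explicit padding: spells out the gap before `payloadLen`
    uint payloadLen;  // offset 4
}

// Extra insurance: compilation fails if a later edit shifts the field
// or changes the overall size.
static assert(Header.payloadLen.offsetof == 4);
static assert(Header.sizeof == 8);
```

The padding field documents the gap, and the asserts turn the intended layout into something the compiler checks on every build.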


> To give some examples, we require empty if statements to use {} and not ;. It doesn't require any new syntax but it helps you avoid issues that many people make, even though it is allowed in C.

Then one could not write a C compatible bitfield.


> We require explicit conversion when narrowing the range of an integer (i.e. assigning a long to an int). This avoids issues that many people would make, even though it is allowed in C.

The C semantics are still allowed by adding a cast.
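
A small sketch of that opt-in, assuming a hypothetical helper named `narrow` (not from the thread):

```d
// Hypothetical helper showing the opt-in narrowing conversion.
int narrow(long value)
{
    // int result = value;   // rejected: cannot implicitly convert `long` to `int`
    return cast(int) value;  // explicit cast keeps the C-style truncation
}

// Evaluated at compile time: only the low 32 bits survive.
static assert(narrow(0x1_0000_0001L) == 1);
```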

Let's say Bob (poor Bob) needs to convert 20,000 lines of C code to D. I know you've done some of this yourself! Bob doesn't want to go through it line by line. Isn't it nice for Bob if it "just works"? If all those data declarations just work? Especially if the result still has to be compatible with the files that C code wrote out?

But what if the compiler says "Bob, you can't lay out a bitfield like that!" Or worse, it lays out the bitfield into a portable (but different) layout. Then it doesn't just work, Bob has got some debugging to do (while Bob curses D and me), and Bob's got to figure out an alternative. Who wants to do that? Not Bob. Not me. Not nobody not nohow.


>> There are many existing ways to accomplish this. Adding more language features to duplicate existing capability needs a very strong case.
> I'm not asking for any new features.

Every switch that changes the semantics is a new feature and a new source of complexity and bugs.

One of my original requirements for D was no switches that change language semantics. I have failed at that. But I wasn't wrong to aspire towards it.

July 09

On Tuesday, 9 July 2024 at 00:29:20 UTC, Walter Bright wrote:

> On 7/7/2024 8:49 AM, Steven Schveighoffer wrote:
>> Simple, no. Predictable, yes (it's unambiguous). And not obvious.
>
> It is trivially obvious to the most casual observer!
>
> Joking aside, it's the same technique used to inure a struct layout against member alignment issues.

Yes, but there is a subtle difference -- the compiler ignores its own rules. In other words, explicit padding is required way more than with normal fields, which have consistent layout expectations.

As Timon points out, the compiler doesn't obey its own alignment requirements for bitfields.

>> What I want is for the compiler to require you to do this to avoid inconsistencies. It is going to be a mystery to anyone reading it why they put these things in there.
>
> I've seen fields named "pad" or "padding" many times in C code. It's normal practice. Failing that, the purpose of comments is to add the 'why'. One could also use static assert for extra insurance.
>
> I've also seen fields named "reserved". No comment needed.

I concede that this is probably true. This does rely on convention though, and having the compiler yell at you if you try to remove it is even better.

>> To give some examples, we require empty if statements to use {} and not ;. It doesn't require any new syntax but it helps you avoid issues that many people make, even though it is allowed in C.
>
> Then one could not write a C compatible bitfield.

Yes you can. You can use C to write a C compatible bitfield (ImportC is a thing).

If you are using C bitfields as part of an API, it's either to do register layout or protocol processing. In both of these cases, layout matters more than arbitrary implementation matching.

If you have a use case that relies on the arbitrariness of C bitfields (i.e. doesn't care), then yeah, I guess you have to go through ImportC. I don't see a problem with this -- this is almost always not public API (due to the problems with C bitfields). See for instance how the Linux kernel doesn't use bitfields for anything other than internal flags to save space.

It's not something we need to cater to.

>> We require explicit conversion when narrowing the range of an integer (i.e. assigning a long to an int). This avoids issues that many people would make, even though it is allowed in C.
>
> The C semantics are still allowed by adding a cast.

The C bitfield layout is achievable with D as well, it just might not be the exact same syntax, i.e. you may need to use a uint instead of unsigned long long, or you might need to insert padding.

> Let's say Bob (poor Bob) needs to convert 20,000 lines of C code to D. I know you've done some of this yourself! Bob doesn't want to go through it line by line. Isn't it nice for Bob if it "just works"? If all those data declarations just work? Especially if the result still has to be compatible with the files that C code wrote out?

ImportC is a thing. Leave the bitfield structs defined in C until you are fully in D, then use D bitfields.

Or you modify your C code to use the recommended layouts that D uses. If you don't care about layout, it shouldn't be a problem. And the D port should tell you exactly which parts you need to change through the errors.

> But what if the compiler says "Bob, you can't lay out a bitfield like that!" Or worse, it lays out the bitfield into a portable (but different) layout. Then it doesn't just work, Bob has got some debugging to do (while Bob curses D and me), and Bob's got to figure out an alternative. Who wants to do that? Not Bob. Not me. Not nobody not nohow.

This already happens, we don't need bitfields for this kind of pain. ImportC is the solution.

Note that this follows the rule "if it looks like C and compiles, it should act like C". It's OK for things not to compile because we decided they are too error prone.

>> I'm not asking for any new features.
>
> Every switch that changes the semantics is a new feature and a new source of complexity and bugs.
>
> One of my original requirements for D was no switches that change language semantics. I have failed at that. But I wasn't wrong to aspire towards it.

How convenient that we draw the line here.

I have no rebuttal for this as it's totally arbitrary, so if this is your only qualm, I guess you got me.

-Steve

July 10
On 7/9/24 01:52, Walter Bright wrote:
> On 7/7/2024 3:42 AM, Timon Gehr wrote:
>> If it is simple, you should have no trouble stating how it works completely in a couple sentences.
> 
> One sentence:
> 
> If the bitfields of type T start on a T alignment boundary and do not straddle a T alignment boundary, then the bitfields will be portable.
> ...

Well, this is not a complete characterization, but good enough I guess.

So the preferred alignment of a bitfield of a given width is not portable? I.e., are there so-called sane C compilers where a `uint:16` has an (actual) alignment of 4 instead of 2?

> I agree I sometimes have trouble writing exact specifications, but I'm also confident that you understand this.
> ...

Sure, but I really think we should just enforce this kind of rule for `extern(D)` bitfields. If a programmer does not follow the rule, just error out and present options to the programmer for how to make the code compile:

error: bitfield layout is ambiguous

- add extern(C) to match the layout of the associated C compiler
- add padding and/or 0-width bitfields to unambiguously start bitfields on a T alignment boundary without straddling

A priori you just don't know which of those was intended. It's good to require explicit input here, as it is subtle.
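
The one-sentence rule quoted above can be sketched as a small compile-time check. This is only an illustration under stated assumptions: `followsPortableRule` is a hypothetical helper, `alignBytes` stands in for `T.alignof`, and the alignments 8 and 4 are the values printed in the examples from this thread, not guarantees for every platform.

```d
/// Hypothetical checker for a run of consecutive bitfields sharing one type T.
/// `alignBytes` stands in for T.alignof (assumed nonzero); `widths` are in bits.
bool followsPortableRule(size_t startByteOffset, size_t alignBytes,
                         const size_t[] widths)
{
    immutable boundaryBits = alignBytes * 8;
    // The run must start on a T alignment boundary.
    if (startByteOffset % alignBytes != 0)
        return false;

    size_t bit = startByteOffset * 8; // absolute bit position of the next field
    foreach (w; widths)
    {
        if (w == 0)
        {
            // 0-width field: skip ahead to the next boundary.
            bit = (bit + boundaryBits - 1) / boundaryBits * boundaryBits;
            continue;
        }
        // Straddling: first and last bit land in different boundary-sized units.
        if (bit / boundaryBits != (bit + w - 1) / boundaryBits)
            return false;
        bit += w;
    }
    return true;
}

// uint x; ulong y:30; ulong z:34;  (y at offset 4, ulong alignment taken as 8)
static assert(!followsPortableRule(4, 8, [30, 34]));
// Same bitfields moved to offset 8 by explicit padding.
static assert( followsPortableRule(8, 8, [30, 34]));
// ushort x; uint y:16;  (y at offset 2, uint alignment taken as 4)
static assert(!followsPortableRule(2, 4, [16]));
```

A compiler diagnostic along the lines proposed here would run a check of this shape and report an error whenever it returns `false` for an `extern(D)` struct.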

> 
>> I am as a result now not sure whether what you stated is the full truth, or it is still some inadmissible simplification that glosses over some further dragons.
> 
> Feel free to try pathological examples and let me know of any adverse discoveries.
> 
> 
>> Also, I hope `.offsetof % .alignof != 0` is just a bug in your bitfield implementation.
> 
> ??
> 

It's elaborated upon in the part of the post you ignored:

On 7/7/24 12:42, Timon Gehr wrote:
> 
> Also consider this:
> 
> ```d
> struct S{
>      uint x;
>      ulong y:30;
>      ulong z:34;
> }
> pragma(msg, S.y.offsetof, " ", S.y.alignof); // 4LU 8LU
> ```
> 
> The offset of `y` does not even respect its alignment! This is insanity.
> 
> It also happens with `uint`:
> 
> ```d
> struct S{
>      ushort x;
>      uint y:16;
> }
> pragma(msg, S.y.offsetof, " ", S.y.alignof); // 2LU 4LU
> ```

July 10
On 7/9/24 02:29, Walter Bright wrote:
> 
> Let's say Bob (poor Bob) needs to convert 20,000 lines of C code to D. I know you've done some of this yourself! Bob doesn't want to go through it line by line. Isn't it nice for Bob if it "just works"?

It won't, some edits will be necessary.

> If all those data declarations just work? Especially if the result still has to be compatible with the files that C code wrote out?
> 
> But what if the compiler says "Bob, you can't lay out a bitfield like that!"

The compiler should simply say: "Bob, are you sure you want to lay out a bitfield like this?" If Bob is comfortable with it, he can add `extern(C)` and move on.

> Or worse, it lays out the bitfield into a portable (but different) layout.

Well I think this is not an option.

> Then it doesn't just work, Bob has got some debugging to do (while Bob curses D and me), and Bob's got to figure out an alternative. Who wants to do that? Not Bob. Not me. Not nobody not nohow.

As far as I am concerned, this is an irrelevant straw man. I don't want this. I never suggested anything that would cause this. It's pure FUD.

Similarly, I don't want to go chasing down subtle differences in behavior/cache performance etc. between platforms. Portability may be important. It shouldn't be insane by default, it should be insane by choice. Informed consent.

Especially given that bitfields have a "much nicer syntax" than alternative approaches. It's not nice to hand out a footgun disguised as candy.

July 10
On 7/10/24 02:32, Timon Gehr wrote:
> 
> error: bitfield layout is ambiguous
> 
> - add extern(C) to match the layout of the associated C compiler
> - add padding and/or 0-width bitfields to unambiguously start bitfields on a T alignment boundary without straddling

Or change some of the bitfield types to ones with smaller alignment I guess. (If that is necessary at all. It's still not so obvious exactly what assumptions are portable in practice.)
July 10
On 7/10/24 02:44, Timon Gehr wrote:
> 
> Especially given that bitfields have a "much nicer syntax" than alternative approaches. It's not nice to hand out a footgun disguised as candy.

Maybe check out this guy's take on this kind of thing:
https://youtu.be/3iWn4S8JV8g

We should take it to heart.
July 10

On Wednesday, 10 July 2024 at 00:32:53 UTC, Timon Gehr wrote:

> On 7/9/24 01:52, Walter Bright wrote:
>> I agree I sometimes have trouble writing exact specifications, but I'm also confident that you understand this.
>> ...
>
> Sure, but I really think we should just enforce this kind of rule for extern(D) bitfields. If a programmer does not follow the rule, just error out and present options to the programmer for how to make the code compile:
>
> error: bitfield layout is ambiguous
>
> - add extern(C) to match the layout of the associated C compiler
> - add padding and/or 0-width bitfields to unambiguously start bitfields on a T alignment boundary without straddling
>
> A priori you just don't know which of those was intended. It's good to require explicit input here, as it is subtle.

Yes, this is the correct answer. I stayed away from extern(C) specification because I kinda see the point that we have no precedent for extern(C) to adjust field layout. But this seems so obvious to me, I challenge anyone to fault this as a bad experience. For those who want C compatibility, just say so. The D compiler has you covered. For those who want exact bitfield layout, you can use D, because D ensures you have not shot yourself in the foot by making an ambiguous layout request.

-Steve

July 10
On 7/10/24 03:41, Steven Schveighoffer wrote:
> I stayed away from `extern(C)` specification because I *kinda* see the point that we have no precedent for `extern(C)` to adjust field layout.

Well, it does affect layout:

```d
extern(C) struct S{}
pragma(msg, S.sizeof); // 0LU
pragma(msg, (S[100]).sizeof); // 0LU

struct T{}
pragma(msg, T.sizeof); // 1LU
pragma(msg, (T[100]).sizeof); // 100LU
```

In any case, here, the usage is a bit different, in that the `extern(D)` version would just be a bit more restrictive, but still fully compatible.
July 09
On 7/9/2024 5:32 PM, Timon Gehr wrote:
>> The offset of `y` does not even respect its alignment! This is insanity.

That's right. It's not a bug, it matches what the associated C compiler does. It's the same thing as Steven pointed out. I posted how to portably get either arrangement.
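
The earlier post referred to here is not part of this excerpt, so the following is only a sketch of one such arrangement, the explicitly padded one. It assumes DMD's `-preview=bitfields` switch, and the field name `pad` is hypothetical.

```d
// Sketch only: compile with `dmd -preview=bitfields`.
struct S
{
    uint x;          // offset 0
    uint pad;        // explicit padding, offset 4
    ulong y : 30;    // both bitfields now share one ulong unit at offset 8
    ulong z : 34;
}

static assert(S.pad.offsetof == 4);
static assert(S.sizeof == 16);
```

With the padding spelled out, the bitfields start on a `ulong` boundary and the two widths (30 + 34 = 64 bits) fill exactly one `ulong`, so they do not straddle a boundary.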

July 09
On 7/9/2024 6:57 PM, Timon Gehr wrote:
> ```d
> extern(C) struct S{}
> pragma(msg, S.sizeof); // 0LU
> pragma(msg, (S[100]).sizeof); // 0LU
> 
> struct T{}
> pragma(msg, T.sizeof); // 1LU
> pragma(msg, (T[100]).sizeof); // 100LU
> ```
> 
> In any case, here, the usage is a bit different, in that the `extern(D)` version would just be a bit more restrictive, but still fully compatible.

C and C++ differ here, too. D defaults to the C++ route because they wanted distinct objects to have distinct addresses, which made sense to me.