second draft: add Bitfields to D

April 22

Posted by Walter Bright

https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md

April 23

Re: second draft: add Bitfields to D

Posted by Richard (Rikki) Andrew Cattermole
in reply to Walter Bright

On 23/04/2024 1:01 PM, Walter Bright wrote:
> https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md

"The specific layout of bitfields in C is implementation-defined, and varies between the Digital Mars, Microsoft, and gdc/ldc compilers. gdc/lcd are lumped together because they produce identical results on the same platform."

s/lcd/ldc/



Worth mentioning here: as long as you don't use string mixins, attempting semantic analysis to determine compilability is actually pretty cheap, now that I think about it, since it's the same entry point internally.

Not ideal, will need an example in the specification on how to do this, if there is no trait. But in saying that, you'll need to use a trait anyway, so...

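```d
T t;
// `member` is the member's name as a compile-time string.
// The address of a bitfield cannot be taken, so `&` failing to compile suggests a bitfield.
enum isBitField = !__traits(compiles, &__traits(getMember, t, member));
```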

Not ideal.

```d
void main() {
    Foo t;
    enum isBitField = !__traits(compiles, &__traits(getMember, t, "member"));
    pragma(msg, isBitField);
}

struct Foo {
    enum member;
}
```

Okay yes, not having the trait is a bad idea.
It leaves D's introspection less able to determine what a symbol is.



I also mentioned this previously, but I want to see std.bitmanip.bitfields gone for PhobosV3.

Anything built on string mixins that the user interacts with directly makes tooling fail.

This is not an acceptable solution to be recommending to people; we can do significantly better than that.

It also means that people have to remember and understand two separate solutions that we recommend, which are in no way comparable in how they are implemented.

April 26

Re: second draft: add Bitfields to D

Posted by Steven Schveighoffer
in reply to Walter Bright

On Tuesday, 23 April 2024 at 01:01:11 UTC, Walter Bright wrote:
> https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md

Suffers from the same major problem as last time - nobody is going to be using C bitfield structs from D, yet we are inheriting all the problems. Keeping C compatibility is meaningless. We should pick one way and do it that way for D bitfields.

Have you considered that people might build some libraries with ldc, but build applications with dmd? If LDC picks one mechanism for laying out bitfields, but DMD picks a different one, then what happens when you try to use the two together? Do we really want to make D incompatible with itself?

This already happens with C. See for instance https://stackoverflow.com/questions/43504113/bitfield-struct-size-different-between-gcc-and-msft-cl

Adding more __traits is trivial, don't skimp here.

Still does not address sizeof.

The mechanism described to get the bit offset is... horrific. Please just add some __traits.

-Steve

April 27

Re: second draft: add Bitfields to D

Posted by Jonathan M Davis
in reply to Steven Schveighoffer

On Friday, April 26, 2024 9:26:06 AM MDT Steven Schveighoffer via dip.development wrote:
> On Tuesday, 23 April 2024 at 01:01:11 UTC, Walter Bright wrote:
> > https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md
> Suffers from the same major problem as last time - nobody is going to be using C bitfield structs from D, yet we are inheriting all the problems. Keeping C compatibility is meaningless. We should pick one way and do it that way for D bitfields.

C compatibility matters a lot for importC and for C bindings in general - not that we have to have a bitfields feature for general D which matches that, but if we don't have a way to match what C does, then we have trouble creating bindings for C code that uses bitfields. extern(C++) code potentially needs the same thing.

Personally, binding to C is the primary way that I've ever had to deal with bitfields, and not having the ability to do that has made dealing with such bindings... interesting.

Now, if we want to do something like have extern(C) bitfields and extern(D) bitfields so that we can have clean and consistent behavior in normal D code, I'm perfectly fine with that, but I don't agree at all that binding to C doesn't matter. For me at least, that's the primary place that bitfields matter, particularly since I can use other solutions in D if need be, whereas if a C API is designed to use bitfields, then you kind of need support for that in D if you want the bindings to work correctly.

> Have you considered that people might build some libraries with ldc, but build applications with dmd? If LDC picks one mechanism for laying out bitfields, but DMD picks a different one, then what happens when you try to use the two together? Do we really want to make D incompatible with itself?

Completely aside from this specific issue, isn't it already the case that you can't mix code built with different D compilers? I didn't think that there was any guarantee of ABI compatibility across compilers, and I would fully expect there to be trouble if I built parts of my code with one compiler and other parts with another. I typically get linker errors at work if I fail to clean out the build files when switching between dmd and ldc.

- Jonathan M Davis

April 28

Re: second draft: add Bitfields to D

Posted by Adam Wilson
in reply to Walter Bright

On Tuesday, 23 April 2024 at 01:01:11 UTC, Walter Bright wrote:
> https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md

I would approve this because we gain C compatibility and we can drop the std.bitmanip.bitfields type entirely from Phobos 3.

April 27

Re: second draft: add Bitfields to D

Posted by Jonathan M Davis
in reply to Adam Wilson

On Saturday, April 27, 2024 6:31:37 PM MDT Adam Wilson via dip.development wrote:
> On Tuesday, 23 April 2024 at 01:01:11 UTC, Walter Bright wrote:
> > https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md
> I would approve this because we gain C compatibility and we can drop the `std.bitmanip.bitfields` type entirely from Phobos 3.

Actually, it doesn't fix the need for std.bitmanip.bitfields, though it does reduce it. Use cases that need a guaranteed layout (e.g. for serialization) won't work with C-compatible bitfields, because the layout could change depending on the target platform. So adding this feature to the language doesn't help them at all, and they'd still need something like the Phobos solution.

Of course, this DIP helps quite a bit with regards to C bindings (which the Phobos solution does not help with), because those cases need to match the C layout rather than guaranteeing a layout that will be the same across all OSes and architectures. This DIP could also be used in cases where you don't care what C is doing, but you also don't care exactly how the bitfields are laid out. So, it would reduce the need for a Phobos solution, but it doesn't replace it.
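
For reference, a minimal sketch of the Phobos approach discussed here, assuming a hypothetical `Header` struct (the struct and its field names are made up for illustration). `std.bitmanip.bitfields` is documented to pack fields in the order given, starting at the least significant bit, so the layout should not depend on the target's C compiler:

```d
import std.bitmanip : bitfields;

// Hypothetical one-byte header, used only for illustration.
struct Header
{
    // The widths must add up to exactly 8, 16, 32 or 64 bits.
    mixin(bitfields!(
        uint, "version_", 4,   // bits 0-3
        uint, "kind",     3,   // bits 4-6
        bool, "urgent",   1)); // bit 7
}

unittest
{
    Header h;
    h.version_ = 5;
    h.kind = 2;
    h.urgent = true;
    assert(h.version_ == 5 && h.kind == 2 && h.urgent);

    // The raw byte is what would be written to disk or sent over the wire.
    assert(*cast(const(ubyte)*) &h == 0xA5);
}
```

This can be tried with `dmd -unittest -main -run file.d`. The accessors are generated by the string mixin, which is exactly the part that tooling has trouble seeing.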

- Jonathan M Davis

April 27

Re: second draft: add Bitfields to D

Posted by Walter Bright
in reply to Steven Schveighoffer

On 4/26/2024 8:26 AM, Steven Schveighoffer wrote:
> Suffers from the same major problem as last time - nobody is going to be using C bitfield structs from D,

I am, as soon as they become available in the D bootstrap compiler. I don't much care for the ugly workarounds used currently.

> yet we are inheriting all the problems.

There aren't any problems if one is using bitfields for reducing memory consumption or for C compatibility.

> Keeping C compatibility is meaningless.

In the D compiler source code, it means gdc and ldc with their C++ backends won't have any issues with it.

> Have you considered that people might build some libraries with ldc, but build applications with dmd? If LDC picks one mechanism for laying out bitfields, but DMD picks a different one, then what happens when you try to use the two together? Do we really want to make D incompatible with itself?

I have considered that. dmd will pick the same layout as the associated C compiler, which is gcc (used by gdc), and clang (used by ldc).

> This already happens with C. See for instance https://stackoverflow.com/questions/43504113/bitfield-struct-size-different-between-gcc-and-msft-cl

Can you even mix/match object files between vc and gdc, or vc and ldc, anyway?

dmd on Windows generates DMC layout for -m32, and VC layout for -m64 and -m32mscoff.

> Adding more `__traits` is trivial, don't skimp here.

Can be added later. The point is, the information is available.

> Still does not address `sizeof`.

Oops, forgot that. It would return the size of the bitfield's type.

> The mechanism described to get the bit offset is... horrific. Please just add some `__traits`.

It can be added later. But in general it is not a good idea to add things that are deducible from existing things. In this case, it's a loop. A function could be written to do it.

April 27

Re: second draft: add Bitfields to D

Posted by Walter Bright
in reply to Jonathan M Davis

On 4/27/2024 12:12 AM, Jonathan M Davis wrote:
> Now, if we want to do something like have extern(C) bitfields and extern(D) bitfields so that we can have clean and consistent behavior in normal D code

D used to have its own function call ABI, because I thought I'd make a clean and consistent one.

It turned out, nobody cared about clean and consistent. They wanted C compatibility. For example, debuggers could not handle anything other than what the associated C compiler emitted, regardless of what the debug info spec says.

There really is not a clean and consistent layout. There is only C compatibility. Just like we do for endianness and alignment.

All of the portability issues people have mentioned are easily dealt with.

There is always writing functions that do shifts and masks as a last resort. (Shifts and masks is what the code generator does anyway, so this won't cost any performance.)
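
A minimal sketch of that shifts-and-masks fallback, assuming a hypothetical 16-bit `PackedDate` record. The helper names `getBits`/`setBits` and the field positions are illustrative only; they come from neither the DIP nor Phobos. The point is that the programmer, not the C compiler, decides where each field lives:

```d
/// Reads `width` bits of `raw` starting at bit `shift` (bit 0 = least significant).
/// Assumes 0 < width < 32 and shift + width <= 32.
uint getBits(uint raw, uint shift, uint width) pure nothrow @nogc @safe
{
    immutable uint mask = (1u << width) - 1;
    return (raw >> shift) & mask;
}

/// Returns `raw` with the same bit range replaced by the low bits of `value`.
uint setBits(uint raw, uint shift, uint width, uint value) pure nothrow @nogc @safe
{
    immutable uint mask = ((1u << width) - 1) << shift;
    return (raw & ~mask) | ((value << shift) & mask);
}

// Hypothetical record: bits 0-4 day, bits 5-8 month, bits 9-15 year offset.
struct PackedDate
{
    ushort raw;

    @property uint day() const   { return getBits(raw, 0, 5); }
    @property uint month() const { return getBits(raw, 5, 4); }
    @property uint year() const  { return getBits(raw, 9, 7); }

    @property void day(uint v)   { raw = cast(ushort) setBits(raw, 0, 5, v); }
    @property void month(uint v) { raw = cast(ushort) setBits(raw, 5, 4, v); }
    @property void year(uint v)  { raw = cast(ushort) setBits(raw, 9, 7, v); }
}

unittest
{
    PackedDate d;
    d.day = 23;
    d.month = 4;
    d.year = 24;
    assert(d.day == 23 && d.month == 4 && d.year == 24);
    assert(d.raw == 0x3097); // 23 | (4 << 5) | (24 << 9)
}
```

Byte order still has to be handled separately if `raw` is written to disk or sent over a network.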

April 28

Re: second draft: add Bitfields to D

Posted by Jonathan M Davis
in reply to Walter Bright

On Sunday, April 28, 2024 12:44:41 AM MDT Walter Bright via dip.development wrote:
> On 4/27/2024 12:12 AM, Jonathan M Davis wrote:
> > Now, if we want to do something like have extern(C) bitfields and extern(D) bitfields so that we can have clean and consistent behavior in normal D code
>
> D used to have its own function call ABI, because I thought I'd make a clean and consistent one.
>
> It turned out, nobody cared about clean and consistent. They wanted C compatibility. For example, debuggers could not handle anything other than what the associated C compiler emitted, regardless of what the debug info spec says.
>
> There really is not a clean and consistent layout. There is only C compatibility. Just like we do for endianness and alignment.
>
> All of the portability issues people have mentioned are easily dealt with.
>
> There is always writing functions that do shifts and masks as a last resort. (Shifts and masks is what the code generator does anyway, so this won't cost any performance.)

In this particular case, as I understand it, there are use cases that definitely need to be able to have a guaranteed bit layout (e.g. serialization code). So, I don't think that this is quite the same situation as something like the call ABI. Even if a particular call ABI might theoretically be better, it's not something that code normally cares about in practice so long as it works, whereas some code will actually care what the exact layout of bitfields is. The call ABI is largely a language implementation detail, whereas the layout of bitfields actually affects the behavior of the code.

It seems to me that we're dealing with three use cases here:

1. Code that is specifically binding to C bitfields. It needs to match what the C compiler does, or it won't work. That comes with whatever pros and cons the C layout has, but since the D code needs to match the C layout to work, we just have to deal with whatever the layout is, and realistically, the D code using it should not be written to care what the layout is, because it could differ across OSes and architectures.

2. Code that needs a guaranteed bit layout, because it's actually taking the integers that the bitfields are compacted into and storing them elsewhere (e.g. storing the data on disk or sending it across the network). What C does with bitfields for such code is utterly irrelevant, and it's undesirable to even attempt compatibility. The bits need to be laid out precisely in the way that the programmer indicates.

3. Code that just wants to store bits in a compact manner, and how that's done doesn't particularly matter as long as the code just operates on the individual bitfields and doesn't actually do anything with the integer values that they're compacted into where the layout would matter.

For the third use case, it's arguably the case that we'd be better off with a guaranteed bit layout so that it would be consistent across OSes and architectures, and anyone who accidentally wrote code that relied on the bit layout wouldn't have issues as a result (similar to how we make it so that long is guaranteed to be 64 bits across OSes and architectures regardless of what C does; we avoid whole classes of bugs that way). If I understand correctly, it's the issues that come from accidentally relying on the exact bit layout when it's not guaranteed which are why folks like Steven are arguing that it's a terrible idea to follow C's layout.

However, it's also true that since such code in theory doesn't care what the bit layout is (since it's just using bitfields for compact storage and not for something like serialization), the third use case could be solved with either C-compatible bitfields or with bitfields which have a guaranteed layout. It would be less error-prone (and thus more desirable) if the bit layout were consistent, but as long as code doesn't accidentally depend on the layout, it shouldn't matter.

So, use case #3 could be solved with either C-compatible bitfields or bitfields with a guaranteed layout. However, use cases #1 and #2 are completely incompatible, and we therefore need separate solutions for them.

For C compatibility, the obvious solution is to have the compiler deal with it like this DIP is doing. It already has to deal with C compatibility for a variety of things, and it's just going to be far easier and cleaner to have the compiler set up to provide C-compatible bitfields than it is to try to provide a library solution. I wouldn't expect a library solution to cover all of the possible targets correctly, whereas it should be much more straightforward for the compiler to do it.

The issue then is what to do about use case #2, where the bit layout needs to be guaranteed.

I get the impression that you favor leaving the guaranteed bit layout to a library solution, since you don't think that that use case matters much, whereas you think that C compatibility matters a great deal, and you don't think that the issues with accidentally relying on the layout when it's not guaranteed are a big enough concern to avoid using C bitfields for code that just wants to compact the bits. On the other hand, a number of the folks in this thread don't think that C compatibility matters and don't want the bugs that come from accidentally relying on the bit layout when it's not guaranteed, so they're arguing for just making our bitfields have a guaranteed layout and not worrying about C.

Personally, I'm inclined to argue that it would just be better to treat this like we do extern(C++). extern(C++) structs and classes have whatever tweaks are necessary to make them work with C++, whereas extern(D) code does what we want to do with D types. We can do the same with extern(C) bitfields and extern(D) bitfields. That way, we get C compatibility for the code that needs it and a guaranteed bit layout for the code that needs that. And since the guaranteed layout would be the default, we'd largely avoid bugs related to relying on the bit layout when it's not guaranteed. It would be like how D code in general uses long rather than c_long, so normal D code can rely on the size of long and avoid the bugs that come with the type's size varying depending on the target, whereas the code that actually needs C compatibility uses c_long and takes the risks that come with a variable integer size, because it has to. The issues with C bitfields would be restricted to the code that actually needs the compatibility. It would also make it cleaner to write code that has a guaranteed bit layout than it would be with a library solution, since it could use the nice syntax too rather than treating it as a second-class citizen.
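
The `long` versus `c_long` analogy can be sketched as follows. `Sample` is a hypothetical struct standing in for a binding to a C struct that holds a C `long`; only `c_long` from `core.stdc.config` is an existing druntime name:

```d
import core.stdc.config : c_long;

// D's `long` is 64 bits on every target.
static assert(long.sizeof == 8);

// `c_long` follows the target's C `long`, so its size depends on the platform
// (for example 4 bytes on Windows, 8 bytes on 64-bit Linux).
static assert(c_long.sizeof == 4 || c_long.sizeof == 8);

// Hypothetical mirror of a C declaration `struct sample { long ticks; };`
struct Sample
{
    c_long ticks; // size varies with the target, as it does in C
}

unittest
{
    assert(Sample.sizeof == c_long.sizeof);
}
```

Code that only uses `long` never sees the platform difference; only the binding code that opts in to `c_long` does.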

However, in terms of what's actually necessary, I think that realistically, extern(C) bitfields need to be in the language like this DIP is proposing, since it's just too risky to do that with a library solution, whereas extern(D) bitfields _can_ be solved with a library solution like they are right now. I don't think that that's the best solution, but it's certainly better than what we have right now, since we don't have C-compatible bitfields anywhere at the moment (outside of a preview switch).

In any case, it seems like the core issue that's resulting in most of the debate over this DIP is how important some people think that it is to have a guaranteed bit layout by default so that bugs which come from relying on a layout that isn't guaranteed will be avoided. You don't seem to think that that's much of a concern, whereas some of the other folks think that it's a big concern.

Either way, I completely agree that we need a C-compatible solution in the language so that we can sanely bind to C code that uses bitfields.

- Jonathan M Davis

April 29

Re: second draft: add Bitfields to D

Posted by Richard (Rikki) Andrew Cattermole
in reply to Jonathan M Davis

On 29/04/2024 1:32 AM, Jonathan M Davis wrote:
> In any case, it seems like the core issue that's resulting in most of the
> debate over this DIP is how important some people think that it is to have a
> guaranteed bit layout by default so that bugs which come from relying on a
> layout that isn't guaranteed will be avoided. You don't seem to think that
> that's much of a concern, whereas some of the other folks think that it's a
> big concern.

I'm not sure that anyone cares what the default is.

For the most part you're in use case #3 by default; it's only if you're dealing with a binding or serialization that you care, and each of those is specialized enough to opt in to whatever strategy is appropriate.

But one thing that has been on my kill list for PhobosV3 is string mixins publicly introducing any new symbols like... bitfields.

Simply because auto-completion cannot see it, and may never be able to see it due to the CTFE requirement.

https://github.com/LightBender/PhobosV3-Design/discussions/32
