Jonathan M Davis
Posted in reply to Walter Bright
On Sunday, April 28, 2024 12:44:41 AM MDT Walter Bright via dip.development wrote:
> On 4/27/2024 12:12 AM, Jonathan M Davis wrote:
> > Now, if we want to do something like have extern(C) bitfields and extern(D)
> > bitfields so that we can have clean and consistent behavior in normal D code
>
> D used to have its own function call ABI, because I thought I'd make a clean and consistent one.
>
> It turned out, nobody cared about clean and consistent. They wanted C compatibility. For example, debuggers could not handle anything other than what the associated C compiler emitted, regardless of what the debug info spec says.
>
> There really is not a clean and consistent layout. There is only C compatibility. Just like we do for endianness and alignment.
>
> All of the portability issues people have mentioned are easily dealt with.
>
> There is always writing functions that do shifts and masks as a last resort. (Shifts and masks is what the code generator does anyway, so this won't cost any performance.)
In this particular case, as I understand it, there are use cases that definitely need to be able to have a guaranteed bit layout (e.g. serialization code). So, I don't think that this is quite the same situation as something like the call ABI. Even if a particular call ABI might theoretically be better, it's not something that code normally cares about in practice so long as it works, whereas some code will actually care what the exact layout of bitfields is. The call ABI is largely a language implementation detail, whereas the layout of bitfields actually affects the behavior of the code.
It seems to me that we're dealing with three use cases here:
1. Code that is specifically binding to C bitfields. It needs to match what the C compiler does, or it won't work. That comes with whatever pros and cons the C layout has, but since the D code needs to match the C layout to work, we just have to deal with whatever the layout is, and realistically, the D code using it should not be written to care what the layout is, because it could differ across OSes and architectures.
2. Code that needs a guaranteed bit layout, because it's actually taking the integers that the bitfields are compacted into and storing them elsewhere (e.g. storing the data on disk or sending it across the network). What C does with bitfields for such code is utterly irrelevant, and it's undesirable to even attempt compatibility. The bits need to be laid out precisely in the way that the programmer indicates.
3. Code that just wants to store bits in a compact manner, and how that's done doesn't particularly matter as long as the code just operates on the individual bitfields and doesn't actually do anything with the integer values that they're compacted into where the layout would matter.
For the third use case, it's arguably the case that we'd be better off with a guaranteed bit layout so that it would be consistent across OSes and architectures, and anyone who accidentally wrote code that relied on the bit layout wouldn't have issues as a result (similar to how we make it so that long is guaranteed to be 64 bits across OSes and architectures regardless of what C does; we avoid whole classes of bugs that way). If I understand correctly, the issues that come from accidentally relying on the exact bit layout when it's not guaranteed are why folks like Steven are arguing that it's a terrible idea to follow C's layout.
However, it's also true that since such code in theory doesn't care what the bit layout is (since it's just using bitfields for compact storage and not for something like serialization), the third use case could be solved with either C-compatible bitfields or with bitfields which have a guaranteed layout. It would be less error-prone (and thus more desirable) if the bit layout were consistent, but as long as code doesn't accidentally depend on the layout, it shouldn't matter.
So, use case #3 could be solved with either C-compatible bitfields or bitfields with a guaranteed layout. However, use cases #1 and #2 are completely incompatible, and we therefore need separate solutions for them.
For C compatibility, the obvious solution is to have the compiler deal with it like this DIP is doing. It already has to deal with C compatibility for a variety of things, and it's just going to be far easier and cleaner to have the compiler set up to provide C-compatible bitfields than it is to try to provide a library solution. I wouldn't expect a library solution to cover all of the possible targets correctly, whereas it should be much more straightforward for the compiler to do it.
The issue then is what to do about use case #2, where the bit layout needs to be guaranteed.
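To make use case #2 concrete, here is a minimal sketch of the shifts-and-masks fallback mentioned in the quoted text. The 16-bit header format, the field names, and the helper functions are all hypothetical and exist only for illustration; the point is that the code itself fixes the bit positions, so no C compiler gets a say in the layout.

```d
import std.bitmanip : nativeToLittleEndian;
import std.stdio : writefln;

// Hypothetical 16-bit header; the layout is defined by this code alone:
//   bits 0..3   ver     (4 bits)
//   bits 4..6   kind    (3 bits)
//   bit  7      urgent  (1 bit)
//   bits 8..15  length  (8 bits)
ushort packHeader(uint ver, uint kind, bool urgent, uint length) pure nothrow @nogc @safe
{
    return cast(ushort)(
          (ver & 0xF)
        | ((kind & 0x7) << 4)
        | ((urgent ? 1u : 0u) << 7)
        | ((length & 0xFF) << 8));
}

// Accessors undo the same shifts and masks.
uint headerVer(ushort h) pure nothrow @nogc @safe { return h & 0xF; }
uint headerKind(ushort h) pure nothrow @nogc @safe { return (h >> 4) & 0x7; }
bool headerUrgent(ushort h) pure nothrow @nogc @safe { return ((h >> 7) & 1) != 0; }
uint headerLength(ushort h) pure nothrow @nogc @safe { return (h >> 8) & 0xFF; }

void main()
{
    ushort h = packHeader(3, 5, true, 200);
    writefln("0x%04X", h); // prints 0xC8D3

    assert(headerVer(h) == 3);
    assert(headerKind(h) == 5);
    assert(headerUrgent(h));
    assert(headerLength(h) == 200);

    // For storage or the network, the byte order has to be pinned down as well.
    ubyte[2] wire = nativeToLittleEndian(h);
    assert(wire[0] == 0xD3 && wire[1] == 0xC8);
}
```

This works on any target, but every field needs a hand-written accessor, which is the ergonomic cost that a language-level guaranteed layout would remove.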
I get the impression that you favor leaving the guaranteed bit layout to a library solution, since you don't think that that use case matters much, whereas you think that C compatibility matters a great deal, and you don't think that the issues with accidentally relying on the layout when it's not guaranteed are a big enough concern to avoid using C bitfields for code that just wants to compact the bits. On the other hand, a number of the folks in this thread don't think that C compatibility matters and don't want the bugs that come from accidentally relying on the bit layout when it's not guaranteed, so they're arguing for just making our bitfields have a guaranteed layout and not worrying about C.
Personally, I'm inclined to argue that it would just be better to treat this like we do extern(C++). extern(C++) structs and classes have whatever tweaks are necessary to make them work with C++, whereas extern(D) code does what we want to do with D types. We can do the same with extern(C) bitfields and extern(D) bitfields. That way, we get C compatibility for the code that needs it and a guaranteed bit layout for the code that needs that. And since the guaranteed layout would be the default, we'd largely avoid bugs related to relying on the bit layout when it's not guaranteed. It would be like how D code in general uses long rather than c_long, so normal D code can rely on the size of long and avoid the bugs that come with the type's size varying depending on the target, whereas the code that actually needs C compatibility uses c_long and takes the risks that come with a variable integer size, because it has to. The issues with C bitfields would be restricted to the code that actually needs the compatibility. It would also make it cleaner to write code that has a guaranteed bit layout than it would be with a library solution, since it could use the nice syntax too rather than treating it as a second-class citizen.
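The long versus c_long split can be checked directly. The following is only a small sketch of that analogy; it says nothing about how extern(D) bitfields would actually be specified.

```d
import core.stdc.config : c_long;
import std.stdio : writeln;

// D fixes the width of long on every target.
static assert(long.sizeof == 8);

// c_long follows the C compiler's long for the current target, so portable
// code can only assume it is one of the common widths (an assumption that
// holds on mainstream platforms).
static assert(c_long.sizeof == 4 || c_long.sizeof == 8);

void main()
{
    writeln("long.sizeof   = ", long.sizeof);
    writeln("c_long.sizeof = ", c_long.sizeof); // depends on the target
}
```

Code that never touches C can ignore c_long entirely, which is the same containment that extern(C) bitfields would give to the C layout rules.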
However, in terms of what's actually necessary, I think that realistically, extern(C) bitfields need to be in the language like this DIP is proposing, since it's just too risky to do that with a library solution, whereas extern(D) bitfields _can_ be solved with a library solution like they are right now. I don't think that that's the best solution, but it's certainly better than what we have right now, since we don't have C-compatible bitfields anywhere at the moment (outside of a preview switch).
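For reference, the existing library solution is std.bitmanip.bitfields in Phobos. A minimal sketch of how it is used today follows; the struct and field names are made up for the example.

```d
import std.bitmanip : bitfields;

struct Flags
{
    // The widths in one instantiation must add up to exactly 8, 16, 32 or 64
    // bits; here 4 + 3 + 1 = 8, so the storage is a single byte.
    mixin(bitfields!(
        uint, "ver",    4,
        uint, "kind",   3,
        bool, "urgent", 1));
}

static assert(Flags.sizeof == 1);

void main()
{
    Flags f;
    f.ver = 3;
    f.kind = 5;
    f.urgent = true;

    // The generated properties read back what was stored.
    assert(f.ver == 3);
    assert(f.kind == 5);
    assert(f.urgent);
}
```

The string mixin works and is independent of any C compiler, but the fields are declared as template arguments rather than as ordinary members, which is the "second-class citizen" feel compared to dedicated syntax.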
In any case, it seems like the core issue that's resulting in most of the debate over this DIP is how important some people think that it is to have a guaranteed bit layout by default so that bugs which come from relying on a layout that isn't guaranteed will be avoided. You don't seem to think that that's much of a concern, whereas some of the other folks think that it's a big concern.
Either way, I completely agree that we need a C-compatible solution in the language so that we can sanely bind to C code that uses bitfields.
- Jonathan M Davis