second draft: add Bitfields to D (page 2) - D Programming Language Discussion Forum


April 28

Re: second draft: add Bitfields to D

Posted by Walter Bright
in reply to Jonathan M Davis

I listed these 3 use cases in the second draft, and it seems we are mostly in agreement.

Using bit fields to reduce memory consumption, and to be compatible with C code, is handled by default nicely with the proposal.

Conformance to an externally imposed layout is sometimes necessary, but it is much less common. It is almost always easily done with a minor bit of attention. The worst case is writing a shift/mask accessor function, which is very easy to do. I suspect these workarounds take even less effort than reading the spec on how to use special syntax for it. Nobody is obliged to use std.bitmanip.bitfields to get the job done.
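
For readers unfamiliar with the workaround, here is a minimal sketch of a shift/mask accessor pair for an externally imposed layout. The `Header` struct, its member names, and the bit positions are hypothetical and chosen only for illustration; they do not come from the DIP.

```d
// Hypothetical 8-bit header: bits 0-2 hold `kind`, bits 3-7 hold `length`.
struct Header
{
    ubyte raw; // the storage unit whose layout is fixed externally

    // Read bits 0-2.
    uint kind() const { return raw & 0x07; }

    // Write bits 0-2, leaving bits 3-7 untouched.
    void kind(uint v)
    {
        assert(v <= 0x07);
        raw = cast(ubyte)((raw & ~0x07) | v);
    }

    // Read bits 3-7.
    uint length() const { return (raw >> 3) & 0x1F; }

    // Write bits 3-7, leaving bits 0-2 untouched.
    void length(uint v)
    {
        assert(v <= 0x1F);
        raw = cast(ubyte)((raw & 0x07) | (v << 3));
    }
}

void main()
{
    Header h;
    h.kind = 5;
    h.length = 17;
    assert(h.kind == 5 && h.length == 17);
    assert(h.raw == ((17 << 3) | 5));
}
```

Because the positions are spelled out in the masks and shifts, the layout does not depend on how any particular compiler packs bitfields.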

I can help with any externally defined format anyone is having difficulty with.

April 29

Re: second draft: add Bitfields to D

Posted by Timon Gehr
in reply to Walter Bright

On 4/23/24 03:01, Walter Bright wrote:
> https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md

> In practice, however, if one sticks to int, uint, long and ulong bitfields, they are laid out the same.

Maybe only those cases should be allowed without `extern(C)`. I think that might be an ok compromise.

However, I would still much prefer a solution that explicitly introduces the underlying `int`, `uint`, `long` and `ulong` fields, which would be the ones visible to introspection in a portable way, so that introspection code does not really need to concern itself with bitfields at all if it is not important and we do not break existing introspection libraries, such as all serialization libraries.

> Symbolic Debug Info

This does not seem like a strong argument. I am pretty confident debug info can work pretty well regardless of how D lays out the bits.

> ["a", "b", "c"]
> ["a", "_b_c_d_bf", "b", "b_min", "b_max", "c", "c_min", "c_max", "d", "d_min", "d_max"]

I like that the members are not as cluttered. I guess maybe some people still would like to access the underlying data (e.g., to implement a pointer to bitfield as a struct with a pointer plus bit offset and bit length, or something), so perhaps you could add a note that explains how to do that.

You forgot to say what `.tupleof` will do for a struct with bitfields in it.

> There isn't a specific trait for "is this a bitfield".

I think it would be better to have such a `__traits` even just for discoverability when people look at the `__traits` page to implement some introspection code.

> testing to see if the address of a field can be taken, enables discovery of a bitfield.

Not really, a field could be an `enum` field, and you cannot take the address of that either. And if we ever add another feature that has fields whose address cannot be taken, existing introspection code may break. It is better to be explicit.

> The values of .max or .min enable determining the number of bits in a bitfield.

I do not like this a lot, it does not seem like the canonical way to determine it. `.bitlength`?

> 
> The bit offset can be introspected by summing the number of bits in each preceding bitfield that has the same value of .offsetof.

I think it would be much better to just add a `__trait` for this or add something like `.bitoffsetof`. This is a) much more user friendly and b) is a bit more likely to work reliably in practice. D currently does not give any guarantees on the order you will see members when using `__traits(allMembers, ...)`.

April 29

Re: second draft: add Bitfields to D

Posted by Timon Gehr
in reply to Timon Gehr

On 4/29/24 00:30, Timon Gehr wrote:
> On 4/23/24 03:01, Walter Bright wrote:
>> https://github.com/WalterBright/documents/blob/dcb8caabff76312eee25bb9aa057774d981a5bf7/bitfields.md
> ...
> 
> However, I would still much prefer a solution that explicitly introduces the underlying `int`, `uint`, `long` and `ulong` fields, which would be the ones visible to introspection in a portable way, so that introspection code does not really need to concern itself with bitfields at all if it is not important and we do not break existing introspection libraries, such as all serialization libraries.
> ...

This also renders somewhat moot the following claims from the DIP:

> This is an additive feature and does not break any existing code. Its use is entirely optional.

I get that combinations of code that exist today won't break, but it still does break libraries that do "just works" serialization if new code uses that library with bitfields, and the breakage might be silent.

April 28

Re: second draft: add Bitfields to D

Posted by Walter Bright
in reply to Timon Gehr

On 4/28/2024 3:30 PM, Timon Gehr wrote:
> However, I would still much prefer a solution that explicitly introduces the underlying `int`, `uint`, `long` and `ulong` fields, which would be the ones visible to introspection in a portable way, so that introspection code does not really need to concern itself with bitfields at all if it is not important and we do not break existing introspection libraries, such as all serialization libraries.

I doubt introspection libraries would break. If they are not checking for bitfields, but are just looking at .offsetof and the type, they'll interpret the bitfields as a union (which, in a way, is accurate).

>> Symbolic Debug Info
> 
> This does not seem like a strong argument. I am pretty confident debug info can work pretty well regardless of how D lays out the bits.

I'm not. I'd follow the DWARF spec and it didn't work, because the only thing that was ever tested was apparently what the C compiler actually did. In order to get gdb to work, I wound up ignoring the spec and doing what gcc did. It's the same with object file formats. The spec is somewhat of a fairy tale; it's what the associated C compiler actually does that matters.

> I like that the members are not as cluttered. I guess maybe some people still would like to access the underlying data (e.g., to implement a pointer to bitfield as a struct with a pointer plus bit offset and bit length, or something), so perhaps you could add a note that explains how to do that.

Pointer to bitfields will work just the same as they do in C. I don't understand what you're asking for.

> You forgot to say what `.tupleof` will do for a struct with bitfields in it.

They do exactly what you'd expect them to do:

```
import std.stdio;
struct S { int a:4, b:5; }
void main()
{
    S s;
    s.a = 7;
    s.b = 9;
    writeln(s.tupleof);
}
```
prints:
```
79
```
It's not necessary to specify this, because this behavior does not diverge from field access semantics. Only things that differ need to be specified. Specifying "it works like X except for A,B,C" is a lot more reliable and compact than reiterating everything X does.

> I think it would be better to have such a `__traits` even just for discoverability when people look at the `__traits` page to implement some introspection code.

There isn't for other members, it's just "allMembers".

>> testing to see if the address of a field can be taken, enables discovery of a bitfield.
> 
> Not really, a field could be an `enum` field, and you cannot take the address of that either. And if we ever add another feature that has fields whose address cannot be taken, existing introspection code may break. It is better to be explicit.

An enum is distinguished by it not being possible to use .offsetof with it.

>> The values of .max or .min enable determining the number of bits in a bitfield.
> I do not like this a lot, it does not seem like the canonical way to determine it. `.bitlength`?

I agree it's a bit(!) jarring at first blush, but it's easy and perfectly reliable. 7 and 15 are always going to be a 4 bit field. We do a lot of introspection via indirect things like this.
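
The arithmetic behind this remark can be sketched as a small helper. `bitsFromMax` is a hypothetical name, not part of the DIP or Phobos. It assumes only that a signed n-bit field has `.max == 2^(n-1) - 1` and an unsigned one has `.max == 2^n - 1`.

```d
// Hypothetical helper: recover a field's bit width from its .max value.
uint bitsFromMax(ulong max, bool isSigned)
{
    uint bits = 0;
    // Count the bits needed to represent max (assumed to be 2^k - 1).
    while (max != 0)
    {
        ++bits;
        max >>= 1;
    }
    // A signed field spends one extra bit on the sign.
    return isSigned ? bits + 1 : bits;
}

// Evaluated at compile time, as introspection code would do.
static assert(bitsFromMax(7, true) == 4);   // signed 4-bit field: .max == 7
static assert(bitsFromMax(15, false) == 4); // unsigned 4-bit field: .max == 15
static assert(bitsFromMax(int.max, true) == 32);

void main() {}
```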

>> The bit offset can be introspected by summing the number of bits in each preceding bitfield that has the same value of .offsetof.
> 
> I think it would be much better to just add a `__trait` for this or add something like `.bitoffsetof`. This is a) much more user friendly and b) is a bit more likely to work reliably in practice. D currently does not give any guarantees on the order you will see members when using `__traits(allMembers, ...)`.

I overlooked that bitfields can have holes in them, so something like .bitoffsetof is probably necessary.

April 29

Re: second draft: add Bitfields to D

Posted by Timon Gehr
in reply to Walter Bright

On 4/29/24 08:44, Walter Bright wrote:
> On 4/28/2024 3:30 PM, Timon Gehr wrote:
>> However, I would still much prefer a solution that explicitly introduces the underlying `int`, `uint`, `long` and `ulong` fields, which would be the ones visible to introspection in a portable way, so that introspection code does not really need to concern itself with bitfields at all if it is not important and we do not break existing introspection libraries, such as all serialization libraries.
> 
> I doubt introspection libraries would break.

You are breaking even simple patterns like
`foreach(ref field;s.tupleof){ }`.

It would be a miracle if libraries did not break.

> If they are not checking for bitfields, but are just looking at .offsetof and the type, they'll interpret the bitfields as a union (which, in a way, is accurate).
> ...

No, it is not accurate.

> ...
>> I like that the members are not as cluttered. I guess maybe some people still would like to access the underlying data (e.g., to implement a pointer to bitfield as a struct with a pointer plus bit offset and bit length, or something), so perhaps you could add a note that explains how to do that.
> 
> Pointer to bitfields will work just the same as they do in C. I don't understand what you're asking for.
> ...

Well, you can't take a pointer to a bitfield.


>> You forgot to say what `.tupleof` will do for a struct with bitfields in it.
> 
> They do exactly what you'd expect them to do:
> 
> ```
> import std.stdio;
> struct S { int a:4, b:5; }
> void main()
> {
>      S s;
>      s.a = 7;
>      s.b = 9;
>      writeln(s.tupleof);
> }
> ```
> prints:
> ```
> 79
> ```
> It's not necessary to specify this,

Well, so far everything in `.tupleof` had an address.
It should at least be mentioned in the DIP, if nowhere else you should put it in the breaking language changes section.

> because this behavior does not diverge from field access semantics.

There is a difference between a DIP (that can change the language) and the specification (that can indeed be written in a way that does not explicitly mention bitfields under the `.tupleof` documentation.)

> ...
> 
>> I think it would be better to have such a `__traits` even just for discoverability when people look at the `__traits` page to implement some introspection code.
> 
> There isn't for other members, it's just "allMembers".
> ...

Despite not being very relevant to what I was asking for, this is simply untrue. `allMembers` gives you the members, and `.tupleof` gives you the fields.

> 
>>> testing to see if the address of a field can be taken, enables discovery of a bitfield.
>>
>> Not really, a field could be an `enum` field, and you cannot take the address of that either. And if we ever add another feature that has fields whose address cannot be taken, existing introspection code may break. It is better to be explicit.
> 
> An enum is distinguished by it not being possible to use .offsetof with it.
> ...

Well, if you are trying to deliberately make introspection unnecessarily complicated, I guess that's your prerogative.

> 
>>> The values of .max or .min enable determining the number of bits in a bitfield.
>> I do not like this a lot, it does not seem like the canonical way to determine it. `.bitlength`?
> 
> I agree it's a bit(!) jarring at first blush, but it's easy and perfectly reliable. 7 and 15 are always going to be a 4 bit field. We do a lot of introspection via indirect things like this.
> ...

All of those things are ugly hacks. This kind of brain teaser is how metaprogramming works (or increasingly: used to work) in C++, but I think it is not very wise to continue this tradition in D.

> 
>>> The bit offset can be introspected by summing the number of bits in each preceding bitfield that has the same value of .offsetof.
>>
>> I think it would be much better to just add a `__trait` for this or add something like `.bitoffsetof`. This is a) much more user friendly and b) is a bit more likely to work reliably in practice. D currently does not give any guarantees on the order you will see members when using `__traits(allMembers, ...)`.
> 
> I overlooked that bitfields can have holes in them, so something like .bitoffsetof is probably necessary.

Sounds good.

April 29

Re: second draft: add Bitfields to D

Posted by Timon Gehr
in reply to Timon Gehr

On 4/29/24 14:04, Timon Gehr wrote:
>>
>> Pointer to bitfields will work just the same as they do in C. I don't understand what you're asking for.
>> ...
> 
> Well, you can't take a pointer to a bitfield.

Forgot to fully answer this.

I am asking for example code how you would implement a function that gives you a "fat pointer" to a bitfield that lets you read and write from that bitfield.

It cannot be the same as in C, as I think this inherently requires introspection.
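
To make the request concrete, here is a rough sketch of what such a "fat pointer" could look like. `BitRef` and its members are hypothetical names. The bit offset and length are supplied by hand here, and the placement of the two fields is chosen only for illustration; how generic code would obtain those numbers through introspection is exactly the open question in this thread.

```d
// Hypothetical "fat pointer" to a run of bits inside a uint storage unit.
struct BitRef
{
    uint* unit;   // address of the underlying storage unit
    uint offset;  // position of the field's lowest bit
    uint length;  // number of bits in the field (1 .. 31 in this sketch)

    // Mask covering exactly the bits of the field.
    private uint mask() const { return ((1u << length) - 1) << offset; }

    // Read the field as an unsigned value.
    uint get() const { return (*unit & mask()) >> offset; }

    // Write the field, preserving all other bits of the unit.
    void set(uint v)
    {
        assert(v < (1u << length));
        *unit = (*unit & ~mask()) | (v << offset);
    }
}

void main()
{
    uint storage = 0;
    auto a = BitRef(&storage, 0, 4); // a 4-bit field at bit 0
    auto b = BitRef(&storage, 4, 5); // a 5-bit field at bit 4
    a.set(7);
    b.set(9);
    assert(a.get() == 7 && b.get() == 9);
    assert(storage == (7 | (9 << 4)));
}
```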

April 29

Re: second draft: add Bitfields to D

Posted by Jonathan M Davis
in reply to Walter Bright

On Monday, April 29, 2024 12:44:08 AM MDT Walter Bright via dip.development wrote:
> An enum is distinguished by it not being possible to use .offsetof with it.

I don't think that I have _ever_ seen anyone use offsetof to determine anything with type introspection other than the actual offset. Existing code will almost certainly be using & to determine whether a member is an enum or not.
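
As an illustration of the kind of check described here, a minimal sketch follows. `hasAddress` is a hypothetical helper name, not something from Phobos. It simply asks the compiler whether `&instance.member` compiles.

```d
struct S
{
    int a;          // ordinary field: has storage and an address
    enum int c = 3; // manifest constant: no storage, no address
}

// Hypothetical helper: true if the address of member `name` can be taken
// on an instance of T.
enum bool hasAddress(T, string name) =
    __traits(compiles, { T t; auto p = &__traits(getMember, t, name); });

static assert( hasAddress!(S, "a"));
static assert(!hasAddress!(S, "c"));

void main() {}
```

Code written this way treats "address cannot be taken" as meaning "enum", so any new kind of member that also has no address would presumably be misclassified by it.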

That being said, _usually_, when code doing type introspection cares whether a member is an enum, it's because it's looking for something else (e.g. whether the member is a static member variable), so I don't know whether suddenly having additional members that cannot have their address taken will break anything. However, any situation where there isn't a trait that outright tells you what you're looking for makes it highly likely that existing code which needed to figure it out did so by trying out a variety of checks, found some combination of things that had to be true and things that had to be false, and then did enough testing to be reasonably sure that that combination told them what they needed to know. Even if they did get it right, because it's quite indirect, adding more categories of things which could affect introspection will ultimately run a pretty high risk of breaking _something_.

There's only so much that we can do about that, but I do think that we need to be very careful about saying that X is the way to test for something and have any expectation that that's how folks are actually doing it unless that something is a specific trait from __traits or std.traits which checks for that exact thing.

- Jonathan M Davis

April 29

Re: second draft: add Bitfields to D

Posted by Jonathan M Davis
in reply to Timon Gehr

On Monday, April 29, 2024 6:04:13 AM MDT Timon Gehr via dip.development wrote:
> On 4/29/24 08:44, Walter Bright wrote:
> > On 4/28/2024 3:30 PM, Timon Gehr wrote:
> >> However, I would still much prefer a solution that explicitly introduces the underlying `int`, `uint`, `long` and `ulong` fields, which would be the ones visible to introspection in a portable way, so that introspection code does not really need to concern itself with bitfields at all if it is not important and we do not break existing introspection libraries, such as all serialization libraries.
> >
> > I doubt introspection libraries would break.
>
> You are breaking even simple patterns like
> `foreach(ref field;s.tupleof){ }`.
>
> It would be a miracle if libraries did not break.

druntime and Phobos both specifically use tupleof to look for the actual members of a type which take up storage space in that type and whose address can be taken. Traits such as std.traits.Fields do that and document it as such. If bitfields show up as part of tupleof, I would fully expect that to cause problems with any type introspection that operates on the member variables of a type. The breakage may be minimal in practice due to the fact that bitfields aren't currently part of the language, and it's only new code which would encounter this problem, but any existing type introspection code looking at fields is going to expect that all of those fields take up storage space and that their address can be taken, so if it's given a type which has bitfields, and those show up in tupleof, that code is not going to work correctly.

Such code does already need to take unions into account (and there is _some_ similarity between those and bitfields), but it's going to have done that by checking things like is(T == union), which won't help with bitfields at all. And really, even if bitfields matched that, you wouldn't necessarily get the right result anyway, because while both bitfields and unions have members which are not proper fields on their own, the way they behave and take up space in the type is completely different.

Maybe we should add a check for bitfields? Presumably, it would have to be something more like __traits(isBitfield, member), since unlike with a union, you can't check the type, and we're not adding a bitfields keyword, but regardless of how you'd check whether something is a bitfield, existing type introspection code is going to have to be updated in some fashion to take bitfields into account, or it's going to do the wrong thing when it's given a type that has bitfields. There's no way that bitfields are going to just magically work correctly with code that does type introspection.

It does make sense that __traits(allMembers, T) would give you the bitfields, but I don't think that it makes sense that tupleof would, since you cannot take their addresses, but either way, it _will_ break Phobos code if tupleof gives bitfields - and not in a way that would be easily detected, because doing so would require having tests that used bitfields, which of course, don't exist, because bitfields have to be added first.
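
For concreteness, here is a small sketch of the sort of generic code this refers to. `fieldOffsets` and `Point` are hypothetical and are not taken from Phobos. The function assumes every entry of `.tupleof` is a field with its own address, which is the assumption that bitfields in `.tupleof` would invalidate.

```d
struct Point { int x; int y; }

// A typical "just works" visitor (hypothetical): it assumes every entry
// of .tupleof is an addressable field and computes each offset via &field.
size_t[] fieldOffsets(T)(ref T value)
{
    size_t[] offsets;
    foreach (ref field; value.tupleof)
    {
        // Relies on the field having an address of its own.
        offsets ~= cast(size_t)((cast(ubyte*) &field) - (cast(ubyte*) &value));
    }
    return offsets;
}

void main()
{
    Point pt;
    auto offsets = fieldOffsets(pt);
    assert(offsets.length == 2);
    assert(offsets[0] == Point.x.offsetof);
    assert(offsets[1] == Point.y.offsetof);
}
```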

- Jonathan M Davis

April 29

Re: second draft: add Bitfields to D

Posted by Walter Bright
in reply to Timon Gehr

On 4/29/2024 5:04 AM, Timon Gehr wrote:
> You are breaking even simple patterns like
> `foreach(ref field;s.tupleof){ }`.

Let's see what happens.

```
import core.stdc.stdio;

struct S { int a; int b; enum int c = 3; int d:3, e:4; }

void main()
{
    S s;
    s.a = 1;
    s.b = 2;
    s.d = 3;
    s.e = 4;
    foreach (ref f; s.tupleof) { printf("%d\n", f); }
}
```
which prints:
```
1
2
3
4
```
What is going on here? foreach over a tuple is not really a loop. It's a shorthand for a sequence of statements that the compiler unrolls the "loop" into. When the compiler sees a 'ref' for something that cannot have its address taken, it ignores the 'ref'. This can also be seen with:
```
foreach(ref i; 0 .. 10) { }
```
which works. You can see this in action when compiling with -vasm.

Additionally, for such unrolled loops the 'f' is not a loop variable. It is a new variable created for each unroll of the loop. You can see this with:
```
import core.stdc.stdio;

struct S { int a; int b; enum int c = 3; int d:3, e:4; }

void main()
{
    S s;
    s.a = 1;
    s.b = 2;
    s.d = 3;
    s.e = 4;
    foreach (ref f; s.tupleof) { foo(f); }
    foreach (ref f; s.tupleof) { printf("%d\n", f); }
}

void foo(ref int f)
{
    printf("f: %d\n", f);
    ++f;
}
```
where s.a and s.b get incremented, but s.d and s.e do not.

I do not recall exactly why this `ref` behavior was done for foreach, but it was either a mistake or was done precisely to make generic code work. Either way, what's done is done, and there doesn't seem to be much point in breaking it.

April 29

Re: second draft: add Bitfields to D

Posted by Walter Bright
in reply to Timon Gehr

On 4/29/2024 5:04 AM, Timon Gehr wrote:
>> If they are not checking for bitfields, but are just looking at .offsetof and the type, they'll interpret the bitfields as a union (which, in a way, is accurate).
>> ...
> 
> No, it is not accurate.

Getting and setting bit fields reads/writes all the bits in the underlying field, so it definitely is like a union. std.bitmanip.bitfields also implements it as a union, because there is no other way. The CPU does not provide any instructions to access bit fields. (This is why atomics won't work on bitfields.)

If the user of bitfields does not understand the underlying physical reality of bitfields, they will forever have problems with them. Just like programmers who do not understand the physical reality of pointers, floating point, 2s complement, etc., are always crippled and would probably be better off using Excel as their programming language :-/

>> Pointer to bitfields will work just the same as they do in C. I don't understand what you're asking for.
> Well, you can't take a pointer to a bitfield.

Exactly what I meant!

> Well, so far everything in `.tupleof` had an address.

When you mentioned enums not having an address, I had assumed you were talking about __traits(allMembers). .tupleof skips over enums.

> It should at least be mentioned in the DIP, if nowhere else you should put it in the breaking language changes section.

I can mention it, sure.

> Well, if you are trying to deliberately make introspection unnecessarily complicated, I guess that's your prerogative.

__traits has an ugly syntax. The idea was to provide the ability, and the user (or Phobos) would put a pretty face on it.

> All of those things are ugly hacks. This kind of brain teaser is how metaprogramming works (or increasingly: used to work) in C++, but I think it is not very wise to continue this tradition in D.

std.traits definitely continues the tradition. While I'm fine with ugly implementations in it, std.traits fails to document the behavior of the functions that supposedly put a pretty face on it. I've asked Adam Wilson to consider completely re-engineering std.traits.

As long as it is possible to put a pretty face on it, I'm ok with an underlying ugliness in the service of not having N>1 diverse ways to do X.


Copyright © 1999-2021 by the D Language Foundation