Bitfield accessors
November 19, 2007
Until D gains C-style bitfields, a compile-time function can be created that returns pairs of methods taking/returning ubyte/ushort/uint/ulong, to be used by a mixin. Something like:

union TyIntFloat {
  int      i;                 // as integer
  float    f;                 // as float
  // as bit fields:
  mixin(Bitfields!("i
                   sign 1
                   biasedexponent 8
                   significand 23"));
}

Here "i" is the name of the unsigned integral variable to be split into bit fields.
std.string.split works at compile time too, so it can be used to split the input string.
Something like this can be useful in Tango too (if not already present), but I presume it's difficult to make it optimize the produced code.

That
Bitfields!("i sign 1 biasedexponent 8 significand 23")
is supposed to return a string of code that contains six methods like:
"
uint sign()                 { return i & ...; }
void sign(uint x)           { i = ...; }
uint biasedexponent()       { return ...; }
void biasedexponent(uint x) { i = ...; }
uint significand()          { return ...; }
void significand(uint x)    { i = ...; }
"

The sum of the numbers is checked to be 8, 16, 32 or 64. According to that total, the "i" variable and the input/output values have type ubyte, ushort, uint or ulong.
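
Just to give the idea, this is an untested sketch of how such a compile-time function may look. It assumes std.string.split, std.string.toString and std.conv.toInt can be run at compile time by your compiler, and the names used are only illustrative (the check on the total of the lengths and the choice of ubyte/ushort/uint/ulong are left out):

import std.string;  // split, toString
import std.conv;    // toInt

// Untested sketch: builds the code string with the getter/setter pairs
// from a spec like "i  sign 1  biasedexponent 8  significand 23".
char[] bitfieldsFromSpec(char[] spec) {
    char[][] parts = split(spec);   // whitespace-separated tokens
    char[] data = parts[0];         // name of the variable to split, e.g. "i"
    char[] code = "";
    int offset = 0;
    for (int k = 1; k + 1 < parts.length; k += 2) {
        char[] name = parts[k];
        int len = toInt(parts[k + 1]);
        char[] mask = toString((1 << len) - 1);
        // Getter: shift the field down to bit 0 and mask off the rest.
        code ~= "uint " ~ name ~ "() { return (" ~ data ~ " >> " ~
                toString(offset) ~ ") & " ~ mask ~ "; }\n";
        // Setter: clear the field's bits, then OR in the shifted new value.
        code ~= "void " ~ name ~ "(uint x) { " ~ data ~ " = (" ~ data ~
                " & ~(" ~ mask ~ " << " ~ toString(offset) ~ ")) | ((x & " ~
                mask ~ ") << " ~ toString(offset) ~ "); }\n";
        offset += len;
    }
    return code;
}

To be used as: mixin(bitfieldsFromSpec("i  sign 1  biasedexponent 8  significand 23"));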

Bye and thank you,
bearophile
November 19, 2007
"bearophile" <bearophileHUGS@lycos.com> wrote in message news:fhrpoi$1i2a$1@digitalmars.com...
> Until D gains C-style bitfields, a compile-time function can be created that returns pairs of methods taking/returning ubyte/ushort/uint/ulong, to be used by a mixin. Something like:
>
> union TyIntFloat {
>  int      i;                 // as integer
>  float    f;                 // as float
>  // as bit fields:
>  mixin(Bitfields!("i
>                   sign 1
>                   biasedexponent 8
>                   significand 23"));
> }
>
> Here "i" is the name of the unsigned integral variable to be split into
> bit fields.
> std.string.split works at compile time too, so it can be used to split the
> input string.
> Something like this can be useful in Tango too (if not already present),
> but I presume it's difficult to make it optimize the produced code.
>
> That
> Bitfields!("i sign 1 biasedexponent 8 significand 23")
> is supposed to return a string of code that contains six methods like:
> "
> uint sign()                 { return i & ...; }
> void sign(uint x)           { i = ...; }
> uint biasedexponent()       { return ...; }
> void biasedexponent(uint x) { i = ...; }
> uint significand()          { return ...; }
> void significand(uint x)    { i = ...; }
> "
>
> The sum of the numbers is checked to be 8, 16, 32 or 64. According to that total, the "i" variable and the input/output values have type ubyte, ushort, uint or ulong.
>
> Bye and thank you,
> bearophile

I actually came up with this for a D OS kernel my friends and I are working on.  It looks like this:

struct SomeStruct
{
    uint flags;
    mixin(Bitfield!(flags, "sign", 1, "biasedExponent", 8, "significand", 23));
}

The implementation:

template Bitfield(alias data, Args...)
{
    static assert(!(Args.length & 1), "Bitfield arguments must be an even number");
    const char[] Bitfield = BitfieldShim!((typeof(data)).stringof, data, Args).Ret;
}

// Odd bug in D templates -- putting "data.stringof" as a template argument
// gives it the string of the type, rather than the string of the symbol.
// This shim works around that.
template BitfieldShim(char[] typeStr, alias data, Args...)
{
    const char[] Name = data.stringof;
    const char[] Ret = BitfieldImpl!(typeStr, Name, 0, Args).Ret;
}

template BitfieldImpl(char[] typeStr, char[] nameStr, int offset, Args...)
{
    static if(Args.length == 0)
        const char[] Ret = "";
    else
    {
        const char[] Name = Args[0];
        const int Size = Args[1];
        const int Mask = Bitmask!(Size);

        const char[] Getter = "public " ~ typeStr ~ " " ~ Name ~ "() { return ( " ~
            nameStr ~ " >> " ~ Itoh!(offset) ~ " ) & " ~ Itoh!(Mask) ~ "; }";

        const char[] Setter = "public void " ~ Name ~ "(" ~ typeStr ~ " val) { " ~
            nameStr ~ " = (" ~ nameStr ~ " & " ~ Itoh!(~(Mask << offset)) ~ ") | ((val & " ~
            Itoh!(Mask) ~ ") << " ~ Itoh!(offset) ~ "); }";

        const char[] Ret = Getter ~ Setter ~
            BitfieldImpl!(typeStr, nameStr, offset + Size, Args[2 .. $]).Ret;
    }
}

template Itoa(int i)
{
    static if(i < 0)
        const char[] Itoa = "-" ~ IntToStr!(-i, 10);
    else
        const char[] Itoa = IntToStr!(i, 10);
}

template Itoh(int i)
{
    const char[] Itoh = "0x" ~ IntToStr!(i, 16);
}

template Digits(int i)
{
    const char[] Digits = "0123456789abcdefghijklmnopqrstuvwxyz"[0 .. i];
}

template IntToStr(ulong i, int base)
{
    static if(i >= base)
        const char[] IntToStr = IntToStr!(i / base, base) ~ Digits!(base)[i % base];
    else
        const char[] IntToStr = "" ~ Digits!(base)[i % base];
}

template Bitmask(int size)
{
    const int Bitmask = (1 << size) - 1;
}
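
A hypothetical usage sketch (untested, just to show the shape of it) with the SomeStruct above; the generated getters and setters are called through D's property syntax:

SomeStruct s;
s.sign = 1;                // the generated setter packs the value at bit 0
s.biasedExponent = 127;    // this one lands in bits 1..8 of s.flags
assert(s.sign == 1);
assert(s.biasedExponent == 127);
assert(s.flags == 0xff);   // 1 at bit 0 plus 127 shifted left by 1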


November 20, 2007
Thank you very much Jarrett Billingsley, that's even better. I'll use a modified version of it in my code (with your name attached). Lately I tend to use compile-time functions more than templates, because I think they are a bit more readable. I have added another test, on the sum of the lengths, to check that it equals data.sizeof*8, because I like to avoid bugs whenever possible:

template Bitfield(alias data, Args...) {
    static assert(!(Args.length & 1), "Bitfield arguments must be an even number");
    static assert(data.sizeof*8 == _SumLens!(Args),
                  "The sum of bit fields is different from data.sizeof*8");
    const char[] Bitfield = _BitfieldShim!((typeof(data)).stringof, data, Args).Ret;
}

private template _SumLens(Args...) {
    static if(Args.length == 0)
        const uint _SumLens = 0;
    else
        const uint _SumLens = Args[1] + _SumLens!(Args[2 .. $]);
}

I have also added a newline at the end of Getter and Setter, so the methods are listed on separate lines and are easier to read if you print the whole resulting string:

const char[] Getter = "public " ~ typeStr ~ " " ~ Name ~ "() { return ( " ~
    nameStr ~ " >> " ~ Itoh!(offset) ~ " ) & " ~ Itoh!(Mask) ~ "; }\n";

const char[] Setter = "public void " ~ Name ~ "(" ~ typeStr ~ " val) { " ~
    nameStr ~ " = (" ~ nameStr ~ " & " ~ Itoh!(~(Mask << offset)) ~
    ") | ((val & " ~ Itoh!(Mask) ~ ") << " ~ Itoh!(offset) ~ "); }\n";

Currently I am writing the unit tests for Bitfield(), to see if it works when data is a ubyte/ushort/ulong too.
I don't know if I'll need to make it optimize some of the produced code; probably the compiler is enough to optimize it in most cases.
I'd also like to do some speed comparisons with GCC's bitfield handling.
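
Something like this minimal sketch (untested, the field names are just illustrative) shows the shape of the tests I have in mind:

unittest {
    struct S {
        ushort h;
        // 4 + 5 + 7 == 16 == h.sizeof * 8, so the static assert passes
        mixin(Bitfield!(h, "lo", 4, "mid", 5, "hi", 7));
    }
    S s;
    s.mid = 21;                // the setter packs the value at offset 4
    assert(s.mid == 21);       // the getter extracts it again
    assert(s.h == (21 << 4));  // the other bits of h stay zero
}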

Bye,
bearophile
November 20, 2007
"bearophile" <bearophileHUGS@lycos.com> wrote in message news:fhufar$2hu$1@digitalmars.com...
> Thank you very much Jarrett Billingsley, that's even better. I'll use a modified version of it in my code (with your name attached). Lately I tend to use compile-time functions more than templates, because I think they are a bit more readable. I have added another test, on the sum of the lengths, to check that it equals data.sizeof*8, because I like to avoid bugs whenever possible:

Cool :)

> Currently I am writing the unit tests for Bitfield(), to see if it works
> when data is a ubyte/ushort/ulong too.
> I don't know if I'll need to make it optimize some of the produced code;
> probably the compiler is enough to optimize it in most cases.
> I'd also like to do some speed comparisons with GCC's bitfield handling.

If you're using it in a struct, I'd be very surprised if the call weren't optimized out, in which case I'd expect the performance to be about the same as C bitfields.


November 21, 2007
Jarrett Billingsley:

> Cool :)

But I have since changed it again: I have modified the first part of the Bitfield code as follows, because sometimes you don't need bit fields that cover the whole data:

template Bitfield(alias data, Args...) {
    static assert(!(Args.length & 1), "Bitfield arguments must be an even number");
    static assert(data.sizeof*8 >= _SumLens!(Args),
                  "The sum of bit fields is bigger than data.sizeof*8");
    const char[] Bitfield = _BitfieldShim!((typeof(data)).stringof, data, Args).Ret;
}

private template _SumLens(Args...) {
    static if(Args.length == 0)
        const uint _SumLens = 0;
    else {
        static assert(Args[1] > 0, "Bitfield '" ~ Args[0] ~ "' is <= 0");
        const uint _SumLens = Args[1] + _SumLens!(Args[2 .. $]);
    }
}


> If you're using it in a struct, I'd be very surprised if the call weren't optimized out, in which case I'd expect the performance to be about the same as C bitfields.

The calls become inlined, but the results may surprise you anyway :-) For the speed tests I have used this simple D code on a very old PC:

import std.stdio;

// Bitfield code here...

struct INTORFLOAT {
    uint i;
    mixin(Bitfield!(i, "sign", 1, "biasedexponent", 8, "significand", 23));
}

void main() {
    ulong tot;
    INTORFLOAT f;
    const N = 25;

    uint v[N] = [1628676761, 1620574103, 1237153253, 1098880307, 87513741,
                 13181925, 14686126, 7429435, 16286706, 6474381,
                 4879794, 7734725, 3745958, 13353858, 4236193,
                 7587, 4309, 28846, 7313, 14516,
                 126, 143, 171, 221, 156];

    for(int j = 0; j < 10_000_000; j++)
        for(int i = 0; i < N; i++) {
            f.i = v[i];
            f.biasedexponent = f.biasedexponent / 2;
            tot += f.biasedexponent + f.sign;
            //printf("%d %d\n", f.i, tot);
        }

    printf("%d\n", tot);
}


The C code:

#include <stdio.h>

#define N 25

typedef union {
    unsigned int i;
    struct {
        unsigned int sign: 1;
        unsigned int biasedexponent: 8;
        unsigned int significand: 23;
    } bits;
} INTORFLOAT;

int main(void) {
    int i, j;
    unsigned long long tot = 0;
    INTORFLOAT f;

    unsigned int v[N] = {1628676761, 1620574103, 1237153253, 1098880307, 87513741,
                         13181925, 14686126, 7429435, 16286706, 6474381,
                         4879794, 7734725, 3745958, 13353858, 4236193,
                         7587, 4309, 28846, 7313, 14516,
                         126, 143, 171, 221, 156};

    for(j = 0; j < 10000000; j++)
        for(i = 0; i < N; i++) {
            f.i = v[i];
            f.bits.biasedexponent = f.bits.biasedexponent / 2; // line1
            tot += f.bits.biasedexponent + f.bits.sign; // line2
            //printf("%d %d\n", f.i, tot);
        }

    printf("%d\n", tot);
}


I have compiled them with:
DMD: v1.023
dmd -O -inline -release bitfields1.d

gcc: V. 3.4.2 (mingw-special)
gcc -O3 -s bitfields1.c -o bitfields1


Bitfield produces the following getters and setters:
public uint sign() { return (i >> 0x0) & 0x1; }
public void sign(uint val) { i = (i & 0xfffffffffffffffe) | ((val & 0x1) << 0x0); }
public uint biasedexponent() { return (i >> 0x1) & 0xff; }
public void biasedexponent(uint val) { i = (i & 0xfffffffffffffe01) | ((val & 0xff) << 0x1); }
public uint significand() { return (i >> 0x9) & 0x7fffff; }
public void significand(uint val) { i = (i & 0x1ff) | ((val & 0x7fffff) << 0x9); }

The C code runs about 2.5 times faster than the D code (on a faster CPU the difference is probably smaller). (Fortunately, the numerical output of the two programs is the same!) I have verified that the methods (member functions) actually become inlined.
Removing line1 and line2 from both the C and D code, the running time becomes (much smaller and) equal between D and C, so I presume the D slowdown is entirely in the reading/writing of the bit fields.

The Bitfield code can be improved (though improving the templates has turned out to be hairy, because the code has to adapt to data of different sizes) to produce a method like this:

public void biasedexponent(uint val) { i = (i & 0xfffffe01) | ((val & 0xff) << 0x1); }

With that, the running time of the D code drops to only about 2 times that of the C code.


Just for reference, the ASM code of the following C code:

#include <stdio.h>

#define N 25

typedef union {
    unsigned int i;
    struct {
        unsigned int sign: 1;
        unsigned int biasedexponent: 8;
        unsigned int significand: 23;
    } bits;
} INTORFLOAT;

int main(void) {
    int i;
    unsigned long tot = 0;
    INTORFLOAT f;

    unsigned int v[N] = {1628676761, 1620574103, 1237153253, 1098880307, 87513741,
                         13181925, 14686126, 7429435, 16286706, 6474381,
                         4879794, 7734725, 3745958, 13353858, 4236193,
                         7587, 4309, 28846, 7313, 14516,
                         126, 143, 171, 221, 156};

    for(i = 0; i < N; i++) {
        f.i = v[i];
        f.bits.biasedexponent = f.bits.biasedexponent / 2; // line1
        tot += f.bits.biasedexponent + f.bits.sign; // line2
    }

    printf("%d\n", tot);
}


Produced by:
gcc -O3 -S bitfields2.c -o bitfields2.asm

Is:

	.file	"bitfields2.c"
	.def	___main;	.scl	2;	.type	32;	.endef
	.data
	.align 4
LC0:
	.long	1628676761
	.long	1620574103
	.long	1237153253
	.long	1098880307
	.long	87513741
	.long	13181925
	.long	14686126
	.long	7429435
	.long	16286706
	.long	6474381
	.long	4879794
	.long	7734725
	.long	3745958
	.long	13353858
	.long	4236193
	.long	7587
	.long	4309
	.long	28846
	.long	7313
	.long	14516
	.long	126
	.long	143
	.long	171
	.long	221
	.long	156
	.section .rdata,"dr"
LC1:
	.ascii "%d\12\0"
	.text
	.p2align 4,,15
.globl _main
	.def	_main;	.scl	2;	.type	32;	.endef
_main:
	pushl	%ebp
	movl	$16, %eax
	movl	%esp, %ebp
	pushl	%ebx
	subl	$132, %esp
	andl	$-16, %esp
	call	__alloca
	call	___main
	movl	$100, %ecx
	leal	-120(%ebp), %eax
	movl	$LC0, %edx
	movl	%ecx, 8(%esp)
	xorl	%ebx, %ebx
	movl	%edx, 4(%esp)
	movl	%eax, (%esp)
	call	_memcpy
	xorl	%ecx, %ecx
	.p2align 4,,15
L5:
	movl	-120(%ebp,%ecx,4), %edx
	incl	%ecx
	movl	%edx, %eax
	shrl	$2, %eax
	andl	$1, %edx
	andl	$127, %eax
	addl	%edx, %eax
	addl	%eax, %ebx
	cmpl	$24, %ecx
	jle	L5
	movl	%ebx, 4(%esp)
	movl	$LC1, (%esp)
	call	_printf
	movl	-4(%ebp), %ebx
	leave
	ret
	.def	_memcpy;	.scl	3;	.type	32;	.endef
	.def	_printf;	.scl	3;	.type	32;	.endef

I have no good way to see the ASM generated by dmd yet (I may buy Walter's utility). (Maybe some of you can show that asm, but remember to remove the outer j loop as in the C version).

My obvious conclusion is that I hope to see built-in bitfields in D ;-)
The D docs say they aren't used often, but now and then I translate C code to D and I don't want to write many (slow) getter and setter properties like that :-)

Bye,
bearophile
November 22, 2007
"bearophile" <bearophileHUGS@lycos.com> wrote in message news:fi0054$2q9t$1@digitalmars.com...

> I have compiled them with:
> DMD: v1.023
> dmd -O -inline -release bitfields1.d
>
> gcc: V. 3.4.2 (mingw-special)
> gcc -O3 -s bitfields1.c -o bitfields1
>
> ...
> Just for reference the ASM code of the following C code:
> ...
> I have no good way to see the ASM generated by dmd yet (I may buy Walter's
> utility). (Maybe some of you can show that asm, but remember to remove the
> outer j loop as in the C version).

This isn't an entirely unbiased comparison.  If you used dmd and dmc, or gcc and gdc, that would take out at least one very large variable: that dmd and gcc have two completely different backends and optimizers.  From anecdotal evidence I'd say the gcc (and by proxy, gdc) backend is far more mature than dmd's when it comes to optimization.

> My obvious conclusion is that I hope to see built-in bitfields in D ;-) The D docs say they aren't used often, but now and then I translate C code to D and I don't want to write many (slow) getter and setter properties like that :-)

I agree.  Bitfields are "seldom used" but when you need them for system programming (which is what D is supposed to be for!) they're an awful pain to have to emulate.


November 22, 2007
Jarrett Billingsley:
> This isn't an entirely unbiased comparison.  If you used dmd and dmc, or gcc and gdc, that would take out at least one very large variable: that dmd and gcc have two completely different backends and optimizers.  From anecdotal evidence I'd say the gcc (and by proxy, gdc) backend is far more mature than dmd's when it comes to optimization.

I have tested what I use. I'll try gdc if I succeed in installing it, but I think its install procedure is more complex.

If you look at the ShedSkin installer I have helped with, you can see the non-Windows version is just the Python files and the C++ classes, while the Windows version is a WinRar-created (very compressed) "installer" that contains the same things plus all the necessary parts of MinGW already set up to go. Installing ShedSkin on Windows takes one click or little more. I presume dmd on Windows can reach the same level of user-friendliness... :-)

http://sourceforge.net/project/showfiles.php?group_id=142788

Bye,
bearophile
November 27, 2007
For reference, this is my latest version so far; I think it both fixes a bug and makes the code faster at runtime, but it needs much more testing:

// By Jarrett Billingsley <kb3ctd2@yahoo.com>
// Improved by leonardo maffi, V1.1, Nov 26 2007
// This code needs much more testing

template Bitfield(alias data, Args...) {
    static assert(!(Args.length & 1), "Bitfield arguments must be an even number");
    static assert(data.sizeof*8 >= _SumLens!(Args),
                  "The sum of bit fields is bigger than data.sizeof*8");
    const char[] Bitfield = _BitfieldShim!((typeof(data)).stringof, data, Args).Ret;
}

private template _SumLens(Args...) {
    static if(Args.length == 0)
        const uint _SumLens = 0;
    else {
        static assert(Args[1] > 0, "Bitfield '" ~ Args[0] ~ "' is <= 0");
        const uint _SumLens = Args[1] + _SumLens!(Args[2 .. $]);
    }
}

// Odd bug in D templates -- putting "data.stringof" as a template argument
// gives it the string of the type, rather than the string of the symbol.
// This shim works around that.
private template _BitfieldShim(char[] typeStr, alias data, Args...) {
    const char[] Name = data.stringof;
    const char[] Ret = _BitfieldImpl!(data, typeStr, Name, 0, Args).Ret;
}

private template _BitfieldImpl(alias data, char[] typeStr, char[] nameStr, int offset, Args...) {
    static if(Args.length == 0)
        const char[] Ret = "";
    else {
        const char[] Name = Args[0];
        const int Size = Args[1];

        static if (data.sizeof < 8) {
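            // Build the clearing constant with the width of the data (uint here),
            // so the setter gets e.g. 0xfffffe01 instead of 0xfffffffffffffe01 and
            // the optimizer has no extra masking to remove.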
            const int Mask = _bitmask32(Size);
            const char[] Shift1 = "0x" ~ _intToStr32(~(Mask << offset), 16);
        } else {
            const long Mask = _bitmask64(Size);
            const char[] Shift1 = "0x" ~ _intToStr64(~(Mask << offset), 16);
        }

        const char[] Getter = "public " ~ typeStr ~ " " ~ Name ~ "() { return (" ~
            nameStr ~ " >> " ~ _itoh(offset) ~ ") & " ~ _itoh(Mask) ~ "; }\n";

        const char[] Setter = "public void " ~ Name ~ "(" ~ typeStr ~ " val) { " ~
            nameStr ~ " = (" ~ nameStr ~ " & " ~ Shift1 ~
            ") | ((val & " ~ _itoh(Mask) ~ ") << " ~ _itoh(offset) ~ "); }\n\n";

        const char[] Ret = Getter ~ Setter ~
                           _BitfieldImpl!(data, typeStr, nameStr, offset + Size, Args[2 .. $]).Ret;
    }
}

private int _bitmask32(int size) {
    return (1 << size) - 1;
}

private long _bitmask64(long size) {
    return (1L << size) - 1;
}

private char[] _itoa(long i) {
    if (i < 0)
        return "-" ~ _intToStr64(-i, 10);
    else
        return _intToStr64(i, 10);
}

private char[] _itoh(long i) {
    return "0x" ~ _intToStr64(i, 16);
}

private char[] _intToStr32(uint i, int base) {
    const CONV_CHARS = "0123456789abcdefghijklmnopqrstuvwxyz";
    if (i >= base)
        return _intToStr32(i / base, base) ~ CONV_CHARS[i % base];
    else
        return "" ~ CONV_CHARS[i % base];
}

private char[] _intToStr64(ulong i, int base) {
    const CONV_CHARS = "0123456789abcdefghijklmnopqrstuvwxyz";
    if (i >= base)
        return _intToStr64(i / base, base) ~ CONV_CHARS[i % base];
    else
        return "" ~ CONV_CHARS[i % base];
}

//=======================================================================

struct SomeStruct {
    uint flags;
    mixin(Bitfield!(flags, "sign", 1, "biasedExponent", 8, "significand", 23));
}

import std.stdio;

void main() {
    uint flags;
    char[] r1a = Bitfield!(flags, "sign", 1, "biasedExponent", 8, "significand", 23);
    char[] r1b =
"public uint sign() { return (flags >> 0x0) & 0x1; }
public void sign(uint val) { flags = (flags & 0xfffffffe) | ((val & 0x1) << 0x0); }

public uint biasedExponent() { return (flags >> 0x1) & 0xff; }
public void biasedExponent(uint val) { flags = (flags & 0xfffffe01) | ((val & 0xff) << 0x1); }

public uint significand() { return (flags >> 0x9) & 0x7fffff; }
public void significand(uint val) { flags = (flags & 0x1ff) | ((val & 0x7fffff) << 0x9); }

";
    assert(r1a == r1b);

    //writefln( Bitfield!(flags, "sign", 1, "biasedExponent", 8, "significand", 24) ); // raises error

    ubyte otto;
    char[] r2a = Bitfield!(otto, "sign", 1, "seven", 7);
    char[] r2b =
"public ubyte sign() { return (otto >> 0x0) & 0x1; }
public void sign(ubyte val) { otto = (otto & 0xfffffffe) | ((val & 0x1) << 0x0); }

public ubyte seven() { return (otto >> 0x1) & 0x7f; }
public void seven(ubyte val) { otto = (otto & 0xffffff01) | ((val & 0x7f) << 0x1); }

";
    assert(r2a == r2b);

    ubyte d1;
    writefln(Bitfield!(d1, "f1", 3, "s1", 5));

    ushort d2;
    writefln(Bitfield!(d2, "f2", 9, "s2", 7));

    uint d3;
    writefln(Bitfield!(d3, "f3", 17, "s3", 15));

    ulong d4;
    writefln(Bitfield!(d4, "f4", 31, "s4", 33));
}

Bye,
bearophile