October 14, 2004
Sean Kelly wrote:

>>And that bit isn't an integer type at all, but instead a boolean type...
>>Which makes it even more puzzling why the name "bit" was chosen for it ?
> 
> Just about.  The only reason I can think to call it "bit" is that it
> implies a storage size (which is accurate in arrays).

This is not always a good thing. Sometimes "char" or "int" are preferable over "bit", for implementation performance reasons.

If "bool" was kept as an *abstract* concept, the compiler could
then choose a representation that was optimal for the actual task ?

--anders
October 14, 2004
In article <ckmv1a$dck$1@digitaldaemon.com>, Anders F Björklund says...
>
>Sean Kelly wrote:
>
>>>And that bit isn't an integer type at all, but instead a boolean type... Which makes it even more puzzling why the name "bit" was chosen for it ?
>> 
>> Just about.  The only reason I can think to call it "bit" is that it implies a storage size (which is accurate in arrays).
>
>This is not always a good thing. Sometimes "char" or "int" are preferable over "bit", for implementation performance reasons.
>
>If "bool" was kept as an *abstract* concept, the compiler could then chose a representation that was optimal for the actual task ?

It seems that one design aspect of D is that all primitive types have well-defined storage attributes.  byte is 8 bits, int is 32 bits, etc.  C/C++ make no such claims for any type.  While part of me does wonder if this is going to be a problem at some point for D (there are some rare systems where a byte is not 8 bits), I think the overall benefit is a good one.  And as I said in the other thread, if packing of bits in arrays were removed as a feature in D, then the token name should change to "bool."  This is the principal difference in my mind.
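For what it's worth, those fixed sizes can be checked directly with .sizeof (a throwaway sketch; checkSizes is just an illustrative name, nothing from the spec):

// Quick sanity check of the documented D sizes (illustrative only):
void checkSizes()
{
    assert(byte.sizeof  == 1);   // always 8 bits in D
    assert(short.sizeof == 2);   // always 16 bits
    assert(int.sizeof   == 4);   // always 32 bits
    assert(long.sizeof  == 8);   // always 64 bits
    // The corresponding C/C++ sizes vary with the platform and compiler.
}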


Sean


October 15, 2004
In article <ckn0ii$era$1@digitaldaemon.com>, Sean Kelly says...
>
>In article <ckmv1a$dck$1@digitaldaemon.com>, Anders F Björklund says...
>>
>>Sean Kelly wrote:
>>
>>>>And that bit isn't an integer type at all, but instead a boolean type... Which makes it even more puzzling why the name "bit" was chosen for it ?
>>> 
>>> Just about.  The only reason I can think to call it "bit" is that it implies a storage size (which is accurate in arrays).
>>
>>This is not always a good thing. Sometimes "char" or "int" are preferable over "bit", for implementation performance reasons.

What?  I can't think of any logic instructions that use more CPU cycles than the equivalent arithmetic ones.  If you are moving bits around, and they are already grouped in 8-bit or other 2^n-sized chunks, the compiler can optimize them the same way you would by writing the code yourself.  If you are doing it so the code looks simpler, that's not performance.  If bits are properly supported in arrays, the compiler will hide this complexity and it will be faster than using chars or ints in place of bits.
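Roughly what I mean -- a hand-written sketch only (getBit is a made-up name, and the word layout here is illustrative, not whatever the compiler actually emits):

// Indexing a packed array of bits boils down to a shift and a mask,
// which is exactly what you would write by hand anyway:
int getBit(uint[] words, uint i)
{
    return (words[i / 32] >> (i % 32)) & 1;
}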

>>
>>If "bool" was kept as an *abstract* concept, the compiler could then chose a representation that was optimal for the actual task ?
>
>It seems that one design aspect of D is that all primitive types have well-defined storage attributes.  byte is 8 bits, int is 32 bits, etc.  C/C++ make no such claims for any type.  While part of me does wonder if this is going to be a problem at some point for D (there are some rare systems where a byte is not 8 bits),

Bytes of 1, 4, 9, 10, 12, 16, 18, ... bits exist, and the IBM 9000 series has
10-digit words with no bits at all.  I think we can dismiss these for our
purposes - they are either obsolete or special-purpose CPUs, or are likely to
have too small a memory to remember D, or even C (tiny C, possibly).

>             I think the overall benefit is a good one.  And as I said in the
>other thread, if packing of bits in arrays were removed as a feature in D, then the token name should change to "bool."  This is the principal difference in my mind.
>
>
>Sean
>
>
What is wrong with packing of bits in arrays?  Please explain.  Flags are essentially boolean units and are quite commonly stored 8 to a byte.  If you are saying they should have an actual size which includes many unused bits, it seems to me that this must be because addressing of these is not unique without a bit number in a byte. Slicing could also require shifts, perhaps of more than a register size.

To get a set of 16 contiguous flags I should have to have an array of two packed structs with 8 individual bits and 8 different bit-position names?  At this point I'd just do away with bit arrays entirely.  (I wouldn't ever do this, I would just use logic ops to handle the bits individually and forget bit arrays.) I can't ever see a need to have 1 bit per byte bit arrays rather than byte arrays using only the low order bit.
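Handling them individually with logic ops looks about like this (a sketch only; the flag names are made up for illustration):

// Flags handled by hand with logic ops instead of a bit array:
const uint FLAG_VISIBLE = 1u << 0;
const uint FLAG_DIRTY   = 1u << 1;
const uint FLAG_LOCKED  = 1u << 2;

uint flags = 0;

void markDirty()  { flags |= FLAG_DIRTY; }
void clearDirty() { flags &= ~FLAG_DIRTY; }
int  isDirty()    { return (flags & FLAG_DIRTY) != 0; }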

There are machine instructions which test or set individual bits; the only problem I see is consistency in the addressing between bits and all larger uniquely addressable (using only the address field of instructions) units such as bytes, chars, longs, etc.  This cannot be done away with in the way you apparently want without losing more than we gain.

All the above is not to say that there aren't problems in the way our compiler and other C-derived compilers handle bits and bit arrays.  Addressing has always been a kludge.  Provide a good, self-consistent, understandable and complete solution and the world will thank you.



October 15, 2004
larrycowan wrote:
> In article <ckn0ii$era$1@digitaldaemon.com>, Sean Kelly says...
>
>>            I think the overall benefit is a good one.  And as I said in the
>>other thread, if packing of bits in arrays were removed as a feature in D, then
>>the token name should change to "bool."  This is the principal difference in my
>>mind.
> 
> What is wrong with packing of bits in arrays?  Please explain.  Flags are
> essentially boolean units and are quite commonly stored 8 to a byte.  If you are
> saying they should have an actual size which includes many unused bits, it seems
> to me that this must be because addressing of these is not unique without a bit
> number in a byte. Slicing could also require shifts, perhaps of more than a
> register size.

These are the reasons.

> To get a set of 16 contiguous flags I should have to have an array of two packed
> structs with 8 individual bits and 8 different bit-position names?  At this
> point I'd just do away with bit arrays entirely.  (I wouldn't ever do this, I
> would just use logic ops to handle the bits individually and forget bit arrays.)

This was the alternative suggestion.  Move to "bool" which is always one byte and let a library class handle the packing when needed.

> I can't ever see a need to have 1 bit per byte bit arrays rather than byte
> arrays using only the low order bit.

Agreed.

> There are machine instructions which test or set individual bits; the only
> problem I see is consistency in the addressing between bits and all larger
> uniquely addressable (using only the address field of instructions) units
> such as bytes, chars, longs, etc.  This cannot be done away with in the way
> you apparently want without losing more than we gain.

Thus the quandary.  I personally don't have any strong preference for either side of the issue.  Packed bit arrays have the potential to make (robust) template code more difficult to write, they prohibit taking the address of individual elements, and they impose restrictions on slicing.  At the same time, they are a convenient feature, it does make logical sense to represent a two-state value as a single bit when possible, and it isn't particularly difficult for a programmer to work around the problems (I already do this quite effortlessly with vector<bool>).  But as I said in the other thread: from an idealistic perspective, is the tradeoff worthwhile?  Walter certainly thinks it is.  I'm undecided.  Others don't like it one bit :)
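To make the addressing point concrete (a sketch only; example is a throwaway function name, and I'm assuming the compiler rejects the commented-out line per the packed-array behaviour, without having quoted its exact error):

void example()
{
    bit[]  flags = new bit[100];
    byte[] bytes = new byte[100];

    byte* p = &bytes[5];    // fine: every byte has its own address
    // bit* q = &flags[5];  // not possible: element 5 is packed inside
    //                      // a byte and has no address of its own
}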

> All the above is not to say that there aren't problems in the way our compiler
> and other C-derived compilers handle bits and bit arrays.  Addressing has always
> been a kludge.  Provide a good, self-consistent, understandable and complete
> solution and the world will thank you.

Definitely.  If there were a straightforward and robust way to address these few problems, I would be quite happy.


Sean
October 15, 2004
larrycowan wrote:
> One more time...
> 
> [...smart things...]

bit should probably be done away with entirely: the only thing that makes it useful at all right now is bit arrays, and they can easily be implemented with a struct and a few overloaded operators.
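Something along these lines, say (a rough, untested sketch -- BitSet, resize and the member names are made up, and the exact operator overloads supported by the current compiler may differ from what I remember):

// A library bit set: packs the bits into uints and exposes
// indexing through operator overloads.  Untested, illustrative only.
struct BitSet
{
    uint[] words;
    size_t len;

    void resize(size_t n)
    {
        len = n;
        words.length = (n + 31) / 32;
    }

    int opIndex(size_t i)
    {
        return (words[i / 32] >> (i % 32)) & 1;
    }

    void opIndexAssign(int value, size_t i)
    {
        if (value)
            words[i / 32] |=  (1u << (i % 32));
        else
            words[i / 32] &= ~(1u << (i % 32));
    }
}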

Moreover, the fact that bit arrays are packed creates all sorts of special cases and warts (like the behaviour of the .sizeof property), all for a feature that is hardly ever actually put to use! (Phobos itself only uses them in the sense that it implements certain bit[] operations, like bit[].reverse)

I would very much like a stricter boolean, for which no implicit conversions exist, but, either way, there isn't a very compelling reason to keep bit at all.

 -- andy
October 15, 2004
larry cowan wrote:

>>>This is not always a good thing. Sometimes "char" or "int" are preferable over "bit", for implementation performance reasons.
> 
> What?  I can't think of any logic instructions that use more CPU cycles than
> the equivalent arithmetic ones.  If you are moving bits around, and they are
> already grouped in 8-bit or other 2^n-sized chunks, the compiler can optimize
> them the same way you would by writing the code yourself.  If you are doing
> it so the code looks simpler, that's not performance.  If bits are properly
> supported in arrays, the compiler will hide this complexity and it will be
> faster than using chars or ints in place of bits.

I just know that in my GCC 3.4, sizeof(bool) equals sizeof(int) ?
In C99/C++, the compiler can choose any representation of a "bool".
(at least they standardized on a common name for it, in <stdbool.h>)

Just like an "int" is allowed to between short or long, although
a lot of new code just ignores 16-bit computers and then breaks if
sizeof(int) is not the same as sizeof(long). Thus the <stdint.h>


That being said, a "bit" looks like a perfect choice for bool if only
it can have the pointer to it taken and be implemented reasonably sane.
But if it walks like a bool and quaks like a bool, why not name it bool?

--anders
October 15, 2004
Andy Friesen wrote:
> larrycowan wrote:
> 
>> One more time...
>>
>> [...smart things...]
> 
> 
> bit should probably be done away with entirely: the only thing that makes it useful at all right now is bit arrays, and they can easily be implemented with a struct and a few overloaded operators.
> 
> Moreover, the fact that bit arrays are packed creates all sorts of special cases and warts (like the behaviour of the .sizeof property), all for a feature that is hardly ever actually put to use! (Phobos itself only uses them in the sense that it implements certain bit[] operations, like bit[].reverse)
> 
> I would very much like a stricter boolean, for which no implicit conversions exist, but, either way, there isn't a very compelling reason to keep bit at all.
> 
>  -- andy
?????
I can see the desire for stricter booleans, and I can see arguments for limiting the ways in which bit arrays can be used (perhaps they could be required to be allocated in groups of, say, 32).  But packed bit arrays are so useful that doing away with them seems..., well, just very undesirable.

Limit them if you must.  Make it so that slicing isn't implemented on them.  Make them a library class.  But don't eliminate them.

I rarely want to slice a bit array, but I frequently have need for one.  (One CAN get around this by masking and shifting, but that's a quite error-prone approach.  At least, *I* find it quite error-prone.)

OTOH, I can certainly see making it a special library class, with constructors that take, say, the other basic types, and methods that return the value as packed into an array of one (or several) of the other basic types.
October 15, 2004
Charles Hixson wrote:
> Andy Friesen wrote:
> 
> ?????
> I can see the desire for stricter booleans, and I can see arguments for limiting the ways in which bit arrays can be used (perhaps they could be required to be allocated in groups of, say, 32).  But packed bit arrays are so useful that doing away with them seems..., well, just very undesirable.
> 
> Limit them if you must.  Make it so that slicing isn't implemented on them.  Make them a library class.  But don't eliminate them.
> 
> I rarely want to slice a bit array, but I frequently have need for one.  (One CAN get around this by masking and shifting, but that's a quite error-prone approach.  At least, *I* find it quite error-prone.)
> 
> OTOH, I can certainly see making it a special library class, with constructors that take, say, the other basic types, and methods that return the value as packed into an array of one (or several) of the other basic types.

I wasn't arguing that bitsets should be eradicated from existence.  I was referring merely to the fact that they are currently built into the core language itself. :)

It would be very easy to write a little struct that implements the indexing and slicing operators and behaves like bit[] in pretty much every way.
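As a sketch of the slicing half (hypothetical and untested, assuming a packed struct along the lines already discussed in this thread -- a words array plus opIndex/opIndexAssign; BitSet and resize are made-up names):

// Slicing operator for such a struct: copies the selected bits
// into a new, independent set.
BitSet opSlice(size_t lo, size_t hi)
{
    BitSet result;
    result.resize(hi - lo);
    for (size_t i = lo; i < hi; i++)
        result[i - lo] = (words[i / 32] >> (i % 32)) & 1;
    return result;
}

With that in place, set[2 .. 10] would read just like the builtin syntax, though here it returns a copy rather than a window into the original bits.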

 -- andy
October 15, 2004
I earlier wrote:
> That being said, a "bit" looks like a perfect choice for bool if only
> it can have the pointer to it taken and be implemented reasonably sane.
> But if it walks like a bool and quaks like a bool, why not name it bool?

More bit / bool inconsistencies:

http://www.digitalmars.com/d/htomodule.html
> A little global search and replace will take care of renaming the C types
> to D types. The following table shows a typical mapping for 32 bit C code:
> C type 	D type
[...]
> bool 	int

http://www.digitalmars.com/d/ctod.html
> C to D types
>
>       bool               =>        bit 

So at one place, bool is an integer. Of 32 bits, no less.
Hoping that the C compiler chose int for bool, and not char...
(the first page should probably read: bool => char or int,
just like wchar_t currently says: wchar_t => wchar or dchar)

In the other, bool is a bit (which now has a "boolean" cast operator)
So the current "bit" type is definitely a bool, being 1 bit in size.
(Ignoring whether or not it's a good thing that it converts to int)
Why the type name was changed to reflect the storage is still unclear?


A questionable point is whether a language *needs* any sub-byte
integer types at all, such as bits (1 bit) and nybbles (4 bits)... ?

Possibly because they could (potentially) be useful in packed arrays
to avoid having to use any bit-operators (the nybble macros are nasty).


But D's practice of calling the boolean type "bit" is *not good*.
The sooner it can be changed, the better! It could *work* the same.

Here is one idea: rename the D keyword from "bit" back to bool again.
And then change the definition in object.d to read: "alias bool bit;"


Then all we need is some bool type-safety...

Which could be added now, and checked later ?
(i.e. start converting code, hope for D 2.0)
--anders


PS. Does anyone have any good usages of bit arrays they like to share?
    (and I am *not* talking about bool[] arrays, like in "sieve.d")
October 16, 2004
Anders F Björklund wrote:
> ...
> 
> But D's practice of calling the boolean type "bit" is *not good*.
> The sooner it can be changed, the better! It could *work* the same.
> 
> ...

But currently D's type IS bit.  bool is an alias for convenience only.  And currently bit arrays are packed, so bit[8] is equivalent to a bit-addressable byte.  This can be quite useful.  Since I don't know where you draw the line between how you think of bit and how you think of bool, I can't claim that there isn't an overlap.  But to my mind, if you care how it's packed, then it's a bit type; otherwise, bool is probably a decent label.
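A tiny illustration of that (a sketch, not compiled; example is a throwaway name, and I'm taking the packed layout on the spec's word):

// A packed bit[8] used as a bit-addressable byte:
void example()
{
    bit[8] flags;     // 8 flags, packed into a single byte per the spec
    flags[0] = 1;     // set flag 0
    flags[7] = 1;     // set flag 7
    if (flags[0] && !flags[3])
    {
        // ... react to the flag combination ...
    }
}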