October 14, 2004
Re: Bits again. A proposal.
Sean Kelly wrote:

>>And that bit isn't an integer type at all, but instead a boolean type...
>>Which makes it even more puzzling why the name "bit" was chosen for it ?
> 
> Just about.  The only reason I can think to call it "bit" is that it
> implies a storage size (which is accurate in arrays).

This is not always a good thing. Sometimes "char" or "int" is
preferable to "bit", for implementation performance reasons.

If "bool" was kept as an *abstract* concept, the compiler could
then choose a representation that was optimal for the actual task ?

--anders
October 14, 2004
Re: Bits again. A proposal.
In article <ckmv1a$dck$1@digitaldaemon.com>,
Anders F Björklund says...
>
>Sean Kelly wrote:
>
>>>And that bit isn't an integer type at all, but instead a boolean type...
>>>Which makes it even more puzzling why the name "bit" was chosen for it ?
>> 
>> Just about.  The only reason I can think to call it "bit" is that it
>> implies a storage size (which is accurate in arrays).
>
>This is not always a good thing. Sometimes "char" or "int" is
>preferable to "bit", for implementation performance reasons.
>
>If "bool" was kept as an *abstract* concept, the compiler could
>then choose a representation that was optimal for the actual task ?

It seems that one design aspect of D is that all primitive types have
well-defined storage attributes.  byte is 8 bits, int is 32 bits, etc.  C/C++
make no such claims for any type.  While part of me does wonder if this is going
to be a problem at some point for D (there are some rare systems where a byte is
not 8 bits), I think the overall benefit is a good one.  And as I said in the
other thread, if packing of bits in arrays were removed as a feature in D, then
the token name should change to "bool."  This is the principal difference in my
mind.


Sean
October 15, 2004
Re: Bits again. A proposal.
In article <ckn0ii$era$1@digitaldaemon.com>, Sean Kelly says...
>
>In article <ckmv1a$dck$1@digitaldaemon.com>,
>Anders F Björklund says...
>>
>>Sean Kelly wrote:
>>
>>>>And that bit isn't an integer type at all, but instead a boolean type...
>>>>Which makes it even more puzzling why the name "bit" was chosen for it ?
>>> 
>>> Just about.  The only reason I can think to call it "bit" is that it
>>> implies a storage size (which is accurate in arrays).
>>
>>This is not always a good thing. Sometimes "char" or "int" is
>>preferable to "bit", for implementation performance reasons.

What?  I can't think of any logic instructions that use more CPU cycles than the
equivalent arithmetic.  If you are moving bits that are already grouped in 8-bit
units, or 2^n multiples of that, the compiler can optimize the same way you would
by writing the code yourself.  If you are doing it so the code looks simpler, that's
not performance.  If bits are properly supported in arrays, the compiler will hide
this complexity and it will be faster than using chars or ints in place of bits.

>>
>>If "bool" was kept as an *abstract* concept, the compiler could
>>then choose a representation that was optimal for the actual task ?
>
>It seems that one design aspect of D is that all primitive types have
>well-defined storage attributes.  byte is 8 bits, int is 32 bits, etc.  C/C++
>make no such claims for any type.  While part of me does wonder if this is going
>to be a problem at some point for D (there are some rare systems where a byte is
>not 8 bits),

1, 4, 9, 10, 12, 16, 18, ... and IBM 9000 series 10-digit words with no bits.
I think we can dismiss these for our purposes - they are either obsolete or
special purpose cpus, or are likely to have too small a memory to remember D, or
C, (tiny C? - possibly).

>             I think the overall benefit is a good one.  And as I said in the
>other thread, if packing of bits in arrays were removed as a feature in D, then
>the token name should change to "bool."  This is the principal difference in my
>mind.
>
>
>Sean
>
>
What is wrong with packing of bits in arrays?  Please explain.  Flags are
essentially boolean units and are quite commonly stored 8 to a byte.  If you are
saying they should have an actual size which includes many unused bits, it seems
to me that this must be because addressing of these is not unique without a bit
number in a byte. Slicing could also require shifts, perhaps of more than a
register size.

To get a set of 16 contiguous flags I should have to have an array of two packed
structs with 8 individual bits and 8 different bit-position names?  At this
point I'd just do away with bit arrays entirely.  (I wouldn't ever do this, I
would just use logic ops to handle the bits individually and forget bit arrays.)
I can't ever see a need to have 1 bit per byte bit arrays rather than byte
arrays using only the low order bit.
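
To make the "logic ops" approach concrete, here is a minimal C++ sketch (not D) of
flags stored 8 to a byte.  The helper names test_flag, set_flag and clear_flag are
made up for illustration, and the bit order (bit 0 is the low-order bit of byte 0)
is an assumption.  Note that a flag is addressed by a byte index plus a bit number,
which is exactly the addressing problem discussed above.

```cpp
#include <cstddef>
#include <cstdint>

// Flags packed 8 to a byte. Flag i lives in byte i / 8 at bit position i % 8
// (assumed layout: bit 0 is the low-order bit).

// Return true if flag i is set.
inline bool test_flag(const std::uint8_t* flags, std::size_t i) {
    return ((flags[i >> 3] >> (i & 7)) & 1u) != 0;
}

// Set flag i to 1 without touching its neighbours.
inline void set_flag(std::uint8_t* flags, std::size_t i) {
    flags[i >> 3] |= static_cast<std::uint8_t>(1u << (i & 7));
}

// Clear flag i to 0 without touching its neighbours.
inline void clear_flag(std::uint8_t* flags, std::size_t i) {
    flags[i >> 3] &= static_cast<std::uint8_t>(~(1u << (i & 7)));
}
```

Sixteen contiguous flags then fit in a plain two-byte array, with no packed structs
and no per-bit names.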

There are machine instructions which test or set individual bits; the only
problem I see is consistency of addressing between bits and all
larger uniquely addressable (using only the address field of instructions) units
such as bytes, chars, longs, etc.  This cannot be done away with the way you
apparently want without losing more than we gain.

All the above is not to say that there aren't problems in the way our compiler
and other C-derived compilers handle bits and bit arrays.  Addressing has always
been a kludge.  Provide a good, self-consistent, understandable and complete
solution and the world will thank you.
October 15, 2004
Re: Bits again. A proposal.
larrycowan wrote:
> In article <ckn0ii$era$1@digitaldaemon.com>, Sean Kelly says...
>
>>            I think the overall benefit is a good one.  And as I said in the
>>other thread, if packing of bits in arrays were removed as a feature in D, then
>>the token name should change to "bool."  This is the principal difference in my
>>mind.
> 
> What is wrong with packing of bits in arrays?  Please explain.  Flags are
> essentially boolean units and are quite commonly stored 8 to a byte.  If you are
> saying they should have an actual size which includes many unused bits, it seems
> to me that this must be because addressing of these is not unique without a bit
> number in a byte. Slicing could also require shifts, perhaps of more than a
> register size.

These are the reasons.

> To get a set of 16 contiguous flags I should have to have an array of two packed
> structs with 8 individual bits and 8 different bit-position names?  At this
> point I'd just do away with bit arrays entirely.  (I wouldn't ever do this, I
> would just use logic ops to handle the bits individually and forget bit arrays.)

This was the alternative suggestion.  Move to "bool" which is always one 
byte and let a library class handle the packing when needed.

> I can't ever see a need to have 1 bit per byte bit arrays rather than byte
> arrays using only the low order bit.

Agreed.

> There are machine instructions which test or set individual bits; the only
> problem I see is consistency of addressing between bits and all
> larger uniquely addressable (using only the address field of instructions) units
> such as bytes, chars, longs, etc.  This cannot be done away with the way you
> apparently want without losing more than we gain.

Thus the quandary.  I personally don't have any strong preference for 
either side of the issue.  Packed bit arrays have the potential to make 
(robust) template code more difficult to write, they prohibit addressing 
elements in bit arrays, and they impose restrictions on slicing.  At the 
same time, they are a convenient feature, it does make logical sense to 
represent a two-state value as a single bit when possible, and it isn't 
particularly difficult for a programmer to work around the problems (I 
already do this quite effortlessly with vector<bool>).  But as I said in 
the other thread: from an idealistic perspective, is the tradeoff 
worthwhile?  Walter certainly thinks it is.  I'm undecided.  Others 
don't like it one bit :)
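
The vector<bool> situation can be sketched in C++ as follows.  This is only an
illustration of the addressing restriction, not a statement about how D's bit[]
is implemented; set_all is a hypothetical helper.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// std::vector<bool> may pack its elements, so operator[] hands back a proxy
// object instead of a real bool&. Other element types yield true references.
using PackedRef = std::vector<bool>::reference;
using PlainRef  = std::vector<char>::reference;

static_assert(!std::is_same<PackedRef, bool&>::value,
              "vector<bool> yields a proxy, not bool&");
static_assert(std::is_same<PlainRef, char&>::value,
              "vector<char> yields a real reference");

// Hypothetical generic helper. It works for packed and unpacked containers
// alike because it only assigns through operator[] and never takes the
// address of an element.
template <typename Container>
void set_all(Container& c) {
    for (std::size_t i = 0; i < c.size(); ++i) {
        c[i] = true;
    }
}

// The line below would not compile for std::vector<bool>: there is no bool
// object whose address could be taken.
//     bool* p = &flags[0];
```

Template code stays robust as long as it sticks to indexing and assignment; it
breaks as soon as it assumes that &c[i] is a pointer to the element type.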

> All the above is not to say that there aren't problems in the way our compiler
> and other C-derived compilers handle bits and bit arrays.  Addressing has always
> been a kludge.  Provide a good, self-consistent, understandable and complete
> solution and the world will thank you.

Definitely.  If there were a straightforward and robust way to address 
these few problems, I would be quite happy.


Sean
October 15, 2004
Re: Bits again. A proposal.
larrycowan wrote:
> One more time...
> 
> [...smart things...]

bit should probably be done away with entirely: the only thing that 
makes it useful at all right now is bit arrays, and they can easily be 
implemented with a struct and a few overloaded operators.

Moreover, the fact that bit arrays are packed creates all sorts of 
special cases and warts (like the behaviour of the .sizeof property), 
all for a feature that is hardly ever actually put to use! (Phobos 
itself only uses them in the sense that it implements certain bit[] 
operations, like bit[].reverse)

I would very much like a stricter boolean, for which no implicit 
conversions exist, but, either way, there isn't a very compelling reason 
to keep bit at all.

 -- andy
October 15, 2004
Re: Bits again. A proposal.
larry cowan wrote:

>>>This is not always a good thing. Sometimes "char" or "int" is
>>>preferable to "bit", for implementation performance reasons.
> 
> What?  I can't think of any logic instructions that use more CPU cycles than the
> equivalent arithmetic.  If you are moving bits that are already grouped in 8-bit
> units, or 2^n multiples of that, the compiler can optimize the same way you would
> by writing the code yourself.  If you are doing it so the code looks simpler, that's
> not performance.  If bits are properly supported in arrays, the compiler will hide
> this complexity and it will be faster than using chars or ints in place of bits.

I just know that in my GCC 3.4, sizeof(bool) equals sizeof(int) ?
In C99/C++, the compiler can choose any representation of a "bool".
(at least they standardized on a common name for it, in <stdbool.h>)

Just like an "int" is allowed to be anywhere between short and long in size,
although a lot of new code just ignores 16-bit computers and then breaks if
sizeof(int) is not the same as sizeof(long). Thus <stdint.h>.
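
A small C++ sketch of the difference between the implementation-defined types and
the fixed-width ones from <cstdint> (the C++ spelling of <stdint.h>).  The helper
storage_bits is hypothetical, and std::int16_t / std::int32_t are assumed to exist
on the target, which the standard makes optional.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>

// Number of bits of storage occupied by one object of type T.
template <typename T>
constexpr std::size_t storage_bits() {
    return sizeof(T) * CHAR_BIT;
}

// The exact-width types have the same size on every platform that provides them.
static_assert(storage_bits<std::int16_t>() == 16, "int16_t is exactly 16 bits");
static_assert(storage_bits<std::int32_t>() == 32, "int32_t is exactly 32 bits");

// For int and bool only lower bounds hold; the actual values are up to the
// compiler, so code must not assume sizeof(int) == sizeof(long) or
// sizeof(bool) == 1.
static_assert(storage_bits<int>() >= 16, "int has at least 16 bits");
static_assert(storage_bits<bool>() >= CHAR_BIT, "bool occupies at least one byte");
```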


That being said, a "bit" looks like a perfect choice for bool, if only
a pointer to it could be taken and it were implemented reasonably sanely.
But if it walks like a bool and quacks like a bool, why not name it bool?

--anders
October 15, 2004
Re: Bits again. A proposal.
Andy Friesen wrote:
> larrycowan wrote:
> 
>> One more time...
>>
>> [...smart things...]
> 
> 
> bit should probably be done away with entirely: the only thing that 
> makes it useful at all right now is bit arrays, and they can easily be 
> implemented with a struct and a few overloaded operators.
> 
> Moreover, the fact that bit arrays are packed creates all sorts of 
> special cases and warts (like the behaviour of the .sizeof property), 
> all for a feature that is hardly ever actually put to use! (Phobos 
> itself only uses them in the sense that it implements certain bit[] 
> operations, like bit[].reverse)
> 
> I would very much like a stricter boolean, for which no implicit 
> conversions exist, but, either way, there isn't a very compelling reason 
> to keep bit at all.
> 
>  -- andy
?????
I can see the desire for stricter booleans, and I can see arguments 
for limiting the ways in which bit arrays can be used (perhaps they 
could be required to be allocated in groups of, say, 32).  But 
packed bit arrays are so useful that doing away with them seems... 
well, just very undesirable.

Limit them if you must.  Make it so that slicing isn't implemented 
on them.  Make them a library class.  But don't eliminate them.

I rarely want to slice a bit array, but I frequently have need for 
one.  (One CAN get around this by masking and shifting, but that's a 
quite error-prone approach.  At least, *I* find it quite error-prone.)

OTOH, I can certainly see making it a special library class, with 
constructors that take, say, the other basic types, and methods that 
return the value as packed into an array of one (or several) of the 
other basic types.
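
For comparison, C++ already ships a library type of roughly this shape,
std::bitset: it is constructed from an integer and can hand the packed value back.
A minimal sketch, with with_flag_set and count_flags as hypothetical helpers and
16 flags chosen arbitrarily:

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>

// Return a copy of `packed` with flag `index` set. std::bitset does the
// masking and shifting; set() throws std::out_of_range if index >= 16.
inline std::uint16_t with_flag_set(std::uint16_t packed, std::size_t index) {
    std::bitset<16> flags(packed);   // construct from a basic integer type
    flags.set(index);
    return static_cast<std::uint16_t>(flags.to_ulong());  // back to packed form
}

// Count how many of the 16 flags are set.
inline std::size_t count_flags(std::uint16_t packed) {
    return std::bitset<16>(packed).count();
}
```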
October 15, 2004
Re: Bits again. A proposal.
Charles Hixson wrote:
> Andy Friesen wrote:
> 
> ?????
> I can see the desire for stricter booleans, and I can see arguments for 
> limiting the ways in which bit arrays can be used (perhaps they could be 
> required to be allocated in groups of, say, 32).  But packed bit arrays 
> are so useful that doing away with them seems... well, just very 
> undesirable.
> 
> Limit them if you must.  Make it so that slicing isn't implemented on 
> them.  Make them a library class.  But don't eliminate them.
> 
> I rarely want to slice a bit array, but I frequently have need for one.  
> (One CAN get around this by masking and shifting, but that's a quite 
> error-prone approach.  At least, *I* find it quite error-prone.)
> 
> OTOH, I can certainly see making it a special library class, with 
> constructors that take, say, the other basic types, and methods that 
> return the value as packed into an array of one (or several) of the 
> other basic types.

I wasn't arguing that bitsets should be eradicated from existence.  I 
was referring merely to the fact that they are currently built into the 
core language itself. :)

It would be very easy to write a little struct that implements the 
indexing and slicing operators and behaves like bit[] in pretty much 
every way.
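
As a rough idea of what such a struct could look like, here is a sketch in C++
rather than D.  PackedBits, BitRef and slice are made-up names, there is no bounds
checking, and the slice copies its bits instead of aliasing the original storage
the way a D slice would.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical library type standing in for a built-in packed bit array.
struct PackedBits {
    std::vector<std::uint8_t> bytes;  // 8 bits per byte, low-order bit first
    std::size_t length;               // number of bits in use

    explicit PackedBits(std::size_t n) : bytes((n + 7) / 8, 0), length(n) {}

    // Proxy returned by the mutable operator[]: a single bit has no address,
    // so a byte reference and a mask stand in for it.
    struct BitRef {
        std::uint8_t& byte;
        std::uint8_t mask;
        operator bool() const { return (byte & mask) != 0; }
        BitRef& operator=(bool value) {
            if (value) byte |= mask;
            else       byte &= static_cast<std::uint8_t>(~mask);
            return *this;
        }
    };

    BitRef operator[](std::size_t i) {
        return BitRef{bytes[i / 8], static_cast<std::uint8_t>(1u << (i % 8))};
    }
    bool operator[](std::size_t i) const {
        return ((bytes[i / 8] >> (i % 8)) & 1u) != 0;
    }

    // Copy bits [first, last) into a new array, one bit at a time.
    PackedBits slice(std::size_t first, std::size_t last) const {
        PackedBits out(last - first);
        for (std::size_t i = first; i < last; ++i) out[i - first] = (*this)[i];
        return out;
    }
};
```

The bit-by-bit slice is the simple version; a slice that does not start on a byte
boundary is where the shifting mentioned earlier in the thread comes in.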

 -- andy
October 15, 2004
Re: Bits again. A proposal.
I earlier wrote:
> That being said, a "bit" looks like a perfect choice for bool, if only
> a pointer to it could be taken and it were implemented reasonably sanely.
> But if it walks like a bool and quacks like a bool, why not name it bool?

More bit / bool inconsistencies:

http://www.digitalmars.com/d/htomodule.html
> A little global search and replace will take care of renaming the C types
> to D types. The following table shows a typical mapping for 32 bit C code:
> C type 	D type
[...]
> bool 	int

http://www.digitalmars.com/d/ctod.html
> C to D types
>
>       bool               =>        bit 

So in one place, bool is an integer. Of 32 bits, nonetheless.
Hoping that the C compiler chose int for bool, and not char...
(the first page should probably read: bool => char or int,
just like wchar_t currently says: wchar_t => wchar or dchar)

In the other, bool is a bit (which now has a "boolean" cast operator).
So the current "bit" type is definitely a bool, being 1 bit in size.
(Ignoring whether or not it's a good thing that it converts to int.)
Why the type name was changed to reflect the storage is still unclear.


A questionable feature is whether a language *needs* any sub-byte
integer types, such as e.g. bits (1 bit) and nybbles (4 bits)... ?

Possibly because they could (potentially) be useful in packed arrays
to avoid having to use any bit-operators (the nybble macros are nasty).
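
To show the kind of bit-operator code a packed nybble array would hide, here is a
minimal C++ sketch.  The names get_nybble and set_nybble are hypothetical, and the
layout (even indices in the low half of the byte, odd indices in the high half) is
an assumption.

```cpp
#include <cstddef>
#include <cstdint>

// Two 4-bit values per byte: element 2k sits in the low half of byte k,
// element 2k + 1 in the high half (assumed layout).

// Read the 4-bit value at index i.
inline std::uint8_t get_nybble(const std::uint8_t* data, std::size_t i) {
    const std::uint8_t byte = data[i / 2];
    return static_cast<std::uint8_t>((i % 2 == 0) ? (byte & 0x0F) : (byte >> 4));
}

// Store the low 4 bits of `value` at index i, leaving the other half intact.
inline void set_nybble(std::uint8_t* data, std::size_t i, std::uint8_t value) {
    std::uint8_t& byte = data[i / 2];
    if (i % 2 == 0) {
        byte = static_cast<std::uint8_t>((byte & 0xF0) | (value & 0x0F));
    } else {
        byte = static_cast<std::uint8_t>((byte & 0x0F) | ((value & 0x0F) << 4));
    }
}
```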


But D's practice of calling the boolean type "bit" is *not good*.
The sooner it can be changed, the better! It could *work* the same.

Here is one idea: rename the D keyword from "bit" back to bool again.
And then change the definition in object.d to read: "alias bool bit;"


Then all we need is some bool type-safety...

Which could be added now, and checked later ?
(i.e. start converting code, hope for D 2.0)
--anders


PS. Does anyone have any good usages of bit arrays they'd like to share?
    (and I am *not* talking about bool[] arrays, like in "sieve.d")
October 16, 2004
Re: Bits again. A proposal.
Anders F Björklund wrote:
> ...
> 
> But D's practice of calling the boolean type "bit" is *not good*.
> The sooner it can be changed, the better! It could *work* the same.
> 
> ...

But currently D's type IS bit.  bool is an alias for convenience 
only.  And currently bit arrays are packed, and thus bit[8] is 
equivalent to a bit addressable byte.  This can be quite useful, but 
since I don't know your boundaries about how you think of bit and 
how you think of bool, I can't claim that there isn't an overlap, 
but to my mind, if you care how it's packed, then it's a bit type, 
otherwise, bool is probably a decent label.