June 05, 2004
Sean Kelly wrote:

>What does the person who just wants a plain old vector of boolean values do?
>  
>
Lots of things. Make an array of enums.

-- 
-Anderson: http://badmama.com.au/~anderson/
June 05, 2004
throwing an exception when taking the address of a bit is unacceptible. this can be detected at compile time since we know the type info, no?

compile time errors folks :-)

personally I'm in favor of abolishing bit in the following manner:
make bit a temporary alias to bool.
make it size 1 (byte) or 4 (int)
most things (except bit packing) will still work before it's thrown out...
then just axe it and have a pakced_vector!(bool) in the D template lib.

I would say also compile time error when trying to bitslice a bit array.



In article <c9s72o$1108$1@digitaldaemon.com>, Arcane Jill says...
>
>In article <slrncc26jr.n9h.jsykari@pulu.hut.fi>, Antti =?iso-8859-1?Q?Syk=E4ri?= says...
>>
>>Boolean types are basic types and should be reasoned about in the same way as any other type. This includes doing things like:
>
>Much as I would love to agree with you(and I would - I really would, because I want a working bool type as much as anyone), your points are not actually points in favor of making bool=int, they are points in favor of FIXING THE BUGS IN "bit". (Obviously I agree with you on both counts though).
>
>If we are going to have a type, bit, which behaves like a primative type, then it must be properly implemented. It must be possible to take the address of a bit. Pointer arithmetic must work. Slicing, both by copy and by reference, must work, even on non-byte boundaries.
>
>There is a way that this can be done easily - but I suspect that a lot of people won't like it. The D compiler needs to take these three steps (and they are not trivial).
>
>(1) The property bit.sizeof() must return a float or a double, not an int, with
>value 0.125.
>
>(2) Taking the address of a bit must return a new type, a "bit pointer". The simplest way to implement this is as double. Doubles can store a superset of the values that uints can store, and in addition they are capable of handling those 0.125 fractional parts. Example, bit 0 at byte-address 1193046 would have bit-address 1193046.0; bit 1 at the same byte-address would have bit-address 1193046.0125, and so on.
>
>(3) Given that currently, slices consist of a struct containing a pointer and a length, so bit-slices must consist of a struct containing a bit-pointer as described above, and a length.
>
>Walter may be able to think of other ways of achieving this, as he's a smart guy. But the only reasonable alternative to unambiguously succeeding (when taking the address of a bit), is throwing an exception. Returning a byte-address can only ever lead to undefined behaviour. Same goes for bitslicing on not-necessarily-byte boundaries.
>
>Arcane Jill
>
>
>


June 06, 2004
On Sat, 5 Jun 2004 19:11:33 +0000 (UTC), hellcatv@hotmail.com wrote:
>personally I'm in favor of abolishing bit in the following manner:

I'd also like to see the bit type completely obliterated. The only really justifiable use for a bit type (that I'm aware of) is when the bits are packed into an array. So, instead of having an actual bit type, I'd like to see a packedbit type.

Of course, just like everyone else here, I'd like to see a real boolean type. I think the combination of boolean and packedbit would solve most people's real-life needs.
June 06, 2004
In article <c9sp85$1s1t$1@digitaldaemon.com>, Sean Kelly says...
>
>In article <c9rsm1$f7f$1@digitaldaemon.com>, Kevin Bealer says...
>>
>>
>>1. Your example fails to compile, and fails with a reasonable message, so there is no language bug there.
>>
>>2. The bit/bit array objects are a *feature*.  All the behaviours that that you and the other anti-bit folks want in a boolean ARE ALREADY AVAILABLE in other types.  You can use typedef if you are afraid of automatic conversion.
>>
>>3. You argue that it is hard to understand.  Well tough.
>
>But what about template code?  Say I write a template function that manipulates arrays.  Why should bool[] be a special case that the function can't operate on? As I said in another message, this is the exact same issue as the vector<bool> specialization in C++.  And because of the contention surrounding vector<bool> I don't expect this to be settled amicably either.

If the template does one of the unsupported actions, the instantiation will fail to compile with an appropriate error.  If you don't do anything forbidden (you avoid passing references to array elements) it will work.

If it fails for bit vectors, you need to decide based on the purpose of the code; either:

1. The template writer specializes an internal template interface for bit array (I don't know if this can be done), and provides another solution.

2. The template writer creates a nested struct like: "struct foo { T x; }" and thus wraps the type in a safe manner.

3. The user uses "int" or "byte" for the "yes/no" value.

If you want a "packed bit vector", then you must understand that it is not byte addressable so certain things fail.

I can sympathize with the template argument, but the value of packed bit arrays to me overrides the "aesthetic" issue that there is more asymmetry.


>>Bit arrays aren't needed everywhere.  Like C, C++, and the professional paint sprayer, they are a power tool.  If you want a simple tool with a different interface, don't use them.  But getting this functionality to run fast IS very important for a lot of applications, so it should be in the language, not a library.
>
>It shouldn't matter whether a feature is in the language or in its standard library, so long as the reason for the choice makes sense.  Adding a feature to the language allows for better error handling and integration, but this is not always necessary nor desirable.  You say that bit arrays aren't needed everywhere, but by integrating them into the language you require that they be used everywhere whether you want them or not.  What does the person who just wants a plain old vector of boolean values do?

Just use int[] or byte[], according to taste.

Yes, I realize that it doesn't self-document as much as if it said "bool".  But why have "bit" if it has no special behavior?


>>Between enums, the dozens of integer types, and structures, the functionality you want is available.  You just have to accept that for "int" semantics, you want the int type.
>
>So why should bool be different?  It's a primitive type that can be assigned to and compared.  It can be stored in arrays, etc.  But taking the address of a bool when it's in an array is illegal?
>
>I should back up and say that I am playing devils' advocate to a degree here. This problem wouldn't affect me either way because I know the language well enough to avoid it.  But because of the vector<bool> issue I know that it confuse other people.  And while I'm typically inclined to say RTFM, I'm not sure that such a special case should be built into the language rather than provided as a library feature.
>
>Sean

Because it can probably be done more efficiently, when it comes to bit slicing for example.  Although, I don't know the performance details, so maybe it actually makes no difference.

Also, int->bit converts with "?:" semantics, not "& 1" semantics, which makes it a special case among the integer types.  Most would agree that this is good, assuming they like the ability to convert at all.

I guess when someone says "this is hard to learn", I'm inclined to make them reach for it.  bit[] and char[] and wchar[] are all special: bit can't take addresses of elements, and char[] and wchar[] have special case conversion to larger unicode char sizes.  dchar[] probably converts down, but I'm not sure.

And why are hash tables the only associative array type?  I'd like an ordered set as well.  Why does ".sort" use a qsort/heapsort technique?  There are other algorithms, it would be neat to be able to choose one.  I could probably go on, if I thought for a minute.  All these choices could be second-guessed, but in each case the existing method is at least pretty good.

And I'd be happy to second guess at least some of them, and probably will do so over time (although not the examples I gave above).  But after 100 messages, I would also be willing to let it die.

Kevin



June 06, 2004
In article <cnr4c0debnt9d1ug1e3gjjvmtls5o8qce8@4ax.com>, Benji Smith says...
>
>On Sat, 5 Jun 2004 19:11:33 +0000 (UTC), hellcatv@hotmail.com wrote:
>>personally I'm in favor of abolishing bit in the following manner:
>
>I'd also like to see the bit type completely obliterated. The only really justifiable use for a bit type (that I'm aware of) is when the bits are packed into an array. So, instead of having an actual bit type, I'd like to see a packedbit type.

I never go into the coffee shop on 23rd street, so lets burn it down.

You don't need it, but how does it hurt you?

Kevin

>Of course, just like everyone else here, I'd like to see a real boolean type. I think the combination of boolean and packedbit would solve most people's real-life needs.


June 06, 2004
In article <slrncc26jr.n9h.jsykari@pulu.hut.fi>, Antti Sykäri wrote:
> void main()
> {
>     bool[16] x;
>     catastrophe1(x[5]);
>     catastrophe2(&x[5]);
> 
>     bool* first_bool = &x[0];
>     bool* second_bool = &x[1];
>     assert(second_bool == first_bool + 1);

>     assert(cast(int) second_bool = cast(int) first_bool + bool.size);

Whopps, that one should be ==

I _do_ test my code before I post it, but this time I only fixed in the real code, not the code in the article :)

-A

> }

June 06, 2004
Arcane Jill:
> Much as I would love to agree with you(and I would - I really would, because I want a working bool type as much as anyone), your points are not actually points in favor of making bool=int, they are points in favor of FIXING THE BUGS IN "bit".

More like points in favor of removing "bit" ;)

Kevin Bealer:
> I find it particularly telling that the post ends in a vague discussion of "lets define a real type", but then mentions a dozen or so conflicting strategies, including multiple bool types of different sizes.  If Occam heard that, he would probably cut himself shaving.

I'm not really making any suggestions whether bool should be int, short or byte, and probably it doesn't really matter.  All that matters from the language designer point of view is:

1) sane semantics (!= returns bool, if () requires bool, etc.), ability
   to overload on a boolean argument, ability to treat it as an lvalue
   like other types (for example by slicing it and taking address of a
   bool value)

and, less importantly,

2) micro-efficiency (issues with the size of the boolean type, and its
   implementation - which must be ultimately resolved by creating
   different implementations of bool and actually TESTING them on a big
   amount of real-world code)

Now, we should be thinking mainly about (1). Use a dummy implementation of bool (e.g. enum bool : uint { false = 0, true = 1 }) and GET THE SEMANTICS RIGHT. Premature optimization is... you know damn right.

About (1), I'm just saying that it's best that the boolean type isn't "bit" as it currently exists in D. In fact, bit isn't very good basic type in the first place, so it might as well be discarded altogether. On the other hand, if we had a "real" (byte-addressable) boolean type, maybe bit could serve as a special-case type for the special-case need of creating bit arrays (in fact, it would be a fine replacement of C's bitfields.)

As for the point (2) -- as a incurable performance-fanatic, I also made some tests which are in no way conclusive.

I wrote a little test program to try out the performance of the following types as booleans:

- bit
- enum { _false, _true }
- int (as in 0/!0)
- short
- ubyte

The results show that 0/1 versions -- bit and enum -- seem to be a
little faster (3%) because of relative slowness of generating a truth
value of the equality of two integers (the expensive !(a-b) operation).
This with P4/2.8GHz machine and optimized (but no -inline).

I didn't look at the generated assembly because I haven't found a disassembler for windows which could print out the labels of the (template) functions as well. (Suggestions will be gladly taken.) And I think DMD doesn't have an option to print out assembly either.

Also, the test was performed completely unscientifically and in no way reflects how things are done in actual code.

In actuality, I suspect, much of the boolean information is held in the processor's registers and flags and the optimizer is pretty much free to do anything with them, like mark some register as having a complement of some truth value if it's faster to compute than the actual value. So in fact the performances of the two types of boolean might be much closer to each other.

-Antti

-- 
I will not be using Plan 9 in the creation of weapons of mass destruction to be used by nations other than the US.
June 07, 2004
In article <c9s72o$1108$1@digitaldaemon.com>, Arcane Jill wrote:
> If we are going to have a type, bit, which behaves like a primative type, then it must be properly implemented. It must be possible to take the address of a bit. Pointer arithmetic must work. Slicing, both by copy and by reference, must work, even on non-byte boundaries.
> 
> There is a way that this can be done easily - but I suspect that a lot of people won't like it. The D compiler needs to take these three steps (and they are not trivial).
> 
> (1) The property bit.sizeof() must return a float or a double, not an int, with
> value 0.125.
> 
> (2) Taking the address of a bit must return a new type, a "bit pointer". The simplest way to implement this is as double. Doubles can store a superset of the values that uints can store, and in addition they are capable of handling those 0.125 fractional parts. Example, bit 0 at byte-address 1193046 would have bit-address 1193046.0; bit 1 at the same byte-address would have bit-address 1193046.0125, and so on.
> 
> (3) Given that currently, slices consist of a struct containing a pointer and a length, so bit-slices must consist of a struct containing a bit-pointer as described above, and a length.

This is insane. (But insane in a good way.) The other idea was to have the pointer to the byte and index of the bit, but I suppose a double solves the problem as well.

Only problems being the different sizes of T* and bit* (with pretty much any T), and different sizes of slices-in-general and slices-to-bits. So you couldn't cast bit* to a void* and back without losing information. (Or then void* would have to be a 64-bit number, which probably won't happen.)

Oh yeah, and 64-bit architectures.

Something reminds me of segment horrors of the 16-bit age.

Maybe Walter already has plans for this?

-Antti


-- 
I will not be using Plan 9 in the creation of weapons of mass destruction to be used by nations other than the US.
June 07, 2004
In article <slrncc7c3b.pd5.jsykari@pulu.hut.fi>, Antti =?iso-8859-1?Q?Syk=E4ri?= says...

>Only problems being the different sizes of T* and bit* (with pretty much any T), and different sizes of slices-in-general and slices-to-bits. So you couldn't cast bit* to a void* and back without losing information.

This has always been the case in C++. In C++, if you do this:

>        // given char * p;
>        int * q = (int *) p;
>        char * r = (char *) q;

there is no guarantee that r will equal p. The compiler is free to ensure that, for example, (int *)s are int-aligned, by zeroing the low order bits. While this doesn't happen on Windows or Linux, there do exist architectures for which this behavior would make perfect sense.

Maybe D doesn't do this right now. But if D is ever ported onto, say, a Starcore 8102 based architecture, it may have no choice.

But none of this matters, for one simple reason. If the code contains an EXPLICIT cast, then the compiler is free to assume that the programmer knows what they are doing, and that they are doing it deliberately. That's what an explicit cast MEANS. In this circumstance, it won't be the compiler's fault if things break, it will be the programmer's.

Jill


June 07, 2004
Well it seems that one insane idea of mine (bit-pointers) cannot work. Such a beast would conflict with the D ABI (nice new document there!) which clearly and permanently lays down the law on the format for dynamic arrays.

So you can't have bit-pointers. That means that nothing involving taking the address of a bit can ever work. Nor can passing a bit to a function as an inout parameter. Nor can bitslicing on non-byte boundaries.

I didn't seriously expect anyone to implement bit-pointers, but the fact is, without them, the bit type is disfunctional. It can never behave as other types. So maybe, as others have suggested, it might be time to let it go. The bit type was a brilliant concept, and if it could have been made to work that would truly have been a feature and an eighth. But now we've been there, tried it, done that, and bought the T-shirt. Maybe now we should try for something different.

No-one disputes that we need a packed bit array - it's just the bit /itself/ which is causing the problem. To be honest, I think that all we really need is something which does this:

class BitArray
>       {
>           this(uint n);                   // assign the first 32 bits
>           this(ulong n);                  // assign the first 64 bits
>           this(void* p, uint byteLen);    // assign any number of bits
>
>           uint toUint();
>           ulong toUlong();
>           void write(void* p, uint byteLen);
>
>           bool opIndex(uint i);           // whatever bool is aliased to
>           int opIndex(uint i, bool value);
>           int opIndexAssign(uint i, bool value);
>
>           int opEquals(Object o);
>
>           BitArray opSlice(uint i, uint j);
>           BitArray opSliceAssign(uint i, uint j, BitArray a);
>           BitArray opCat(BitArray b);
>           BitArray opCatAssign(BitArray b);
>           BitArray opAnd(BitArray b);
>           BitArray opAndAssign(BitArray b);
>           BitArray opOr(BitArray b);
>           BitArray opOrAssign(BitArray b);
>           BitArray opXor(BitArray b);
>           BitArray opXorAssign(BitArray b);
>           BitArray opCom();
>           BitArray opShl(uint n);
>           BitArray opShlAssign(uint n);
>           BitArray opShr(uint n);
>           BitArray opShrAssign(uint n);
>
>           uint length();
>           uint length(uint newLength);
>           BitArray dup();
>       }

(which, incidently, is more functionality that bit[] arrays have at the moment). That's pretty basic, but it'd be pretty easy to write. I could probably do that in a day.

Do we need the bit - at all? I mean, given that pointer semantics are never going to work like they do for other types even IF the bugs are fixed. Plus it'd be good for Walter (sort of) - no bits; no complaints. Templates will work equally well for all types. Problem solved.

Someone had to try to introduce the bit, and Walter had the guts to do it. I'm proud of him for that. But when things don't work out, sometimes you just have to call it a day.

I don't want to start yet ANOTHER round of what a bit/bool should or shouldn't be. Just a simple thought this time ... is it time to ditch the bit now? Is it time for a feature unrequest?

I'd like to know the mood of the D users here generally.

Arcane Jill