Thread overview
BitArray shift left/right confusion.
Dec 27, 2017
Bastiaan Veelo
Dec 27, 2017
Biotronic
Dec 28, 2017
Bastiaan Veelo
Feb 02, 2018
Jakub Łabaj
Feb 03, 2018
Bastiaan Veelo
December 27, 2017
I suppose the following is not a bug, but confusing it is:

```
void main()
{
    import std.stdio;
    import std.bitmanip;
    BitArray ba = [1, 1, 1, 1, 1, 1, 1, 1];
    writeln(ba);	// [1, 1, 1, 1, 1, 1, 1, 1]
    ba >>= 4;		// right shift
    writeln(ba);	// [1, 1, 1, 1, 0, 0, 0, 0] bits shifted left
}```

I suppose this is because the array is printed left-to-right, whereas the bits in a byte are typically ordered right-to-left. I suppose I should interpret the bits in the array to increase in significance with increasing index (little endian) and that right-shift means a shift towards less significance (which is to the right in big endian).

The documentation of <<= and >>= [1] however just talks about left and right, without defining left and right or clarifying that the directions are reversed from how the array is printed.

Is there something I have missed?

[1] https://dlang.org/phobos/std_bitmanip.html#.BitArray.opOpAssign.2
December 27, 2017
On Wednesday, 27 December 2017 at 18:08:19 UTC, Bastiaan Veelo wrote:
> I suppose the following is not a bug, but confusing it is:
>
> ```
> void main()
> {
>     import std.stdio;
>     import std.bitmanip;
>     BitArray ba = [1, 1, 1, 1, 1, 1, 1, 1];
>     writeln(ba);	// [1, 1, 1, 1, 1, 1, 1, 1]
>     ba >>= 4;		// right shift
>     writeln(ba);	// [1, 1, 1, 1, 0, 0, 0, 0] bits shifted left
> }```
>
> I suppose this is because the array is printed left-to-right, whereas the bits in a byte are typically ordered right-to-left. I suppose I should interpret the bits in the array to increase in significance with increasing index (little endian) and that right-shift means a shift towards less significance (which is to the right in big endian).
>
> The documentation of <<= and >>= [1] however just talks about left and right, without defining left and right or clarifying that the directions are reversed from how the array is printed.
>
> Is there something I have missed?
>
> [1] https://dlang.org/phobos/std_bitmanip.html#.BitArray.opOpAssign.2

BitArray is apparently a mess. As you've pointed out it prints the bits in the wrong order. I won't mince words here, since D has binary literals on the form 0b10011110. Put that in a BitArray and print it with the format string "0b%b", and you'll get 0b01111001. While it may have been intentional, it's bug prone and confusing, and so definitely a bug.

It also fucks up royally when it has an exact multiple of 32 bits in its buffer, overwriting the last word with 0s when you try and shift it in any way.

It also doesn't remove set bits outside of its covered area when cast to size_t[]. That is, if I do cast(size_t[])(BitArray([1,1,1,1,1,1,1,1]) << 4), the result will be something like [4080], which corresponds to [0b0000_1111_1111_0000].

Lastly (and this is pointed out explicitly in the documentation, but still smells if you ask me), it will overwrite bits in the words it covers, even if it does not cover those exact bits.

The first two are definitely bugs. The last two are results of the intended use case for BitArray, I believe. The documentation doesn't explicitly point this out, but it seems BitArray is intended to give a bit-by-bit view of an area of memory that is actually some other type. Something like this:

struct S {
   int n;
   float f;
}

void foo(ref S s) {
    import std.bitmanip;
    auto a = BitArray((&s)[0..1], S.sizeof);
    a[7] = true; // Actually sets the corresponding bit in s.
}

--
  Biotronic
December 28, 2017
On Wednesday, 27 December 2017 at 20:45:49 UTC, Biotronic wrote:
> BitArray is apparently a mess.
Thanks for your confirmation, digging and reporting issues
https://issues.dlang.org/show_bug.cgi?id=18133 and
https://issues.dlang.org/show_bug.cgi?id=18134

(Turns out we both live in the same country).

February 02, 2018
I was looking at the issues you reported and was pretty surprised how some of the features work. So I'm going to revive this thread with an explanation.

On Wednesday, 27 December 2017 at 20:45:49 UTC, Biotronic wrote:
> On Wednesday, 27 December 2017 at 18:08:19 UTC, Bastiaan Veelo wrote:
>> I suppose the following is not a bug, but confusing it is:
>>
>> ```
>> void main()
>> {
>>     import std.stdio;
>>     import std.bitmanip;
>>     BitArray ba = [1, 1, 1, 1, 1, 1, 1, 1];
>>     writeln(ba);	// [1, 1, 1, 1, 1, 1, 1, 1]
>>     ba >>= 4;		// right shift
>>     writeln(ba);	// [1, 1, 1, 1, 0, 0, 0, 0] bits shifted left
>> }```
>>
>> I suppose this is because the array is printed left-to-right, whereas the bits in a byte are typically ordered right-to-left.

BitArray stores its data in a `size_t[]` array - on 64-bit it consists of 8-byte units, each unit having its beginning at the least significant bit. So it works as you would probably expect:

bool[128] bits;
auto ba = BitArray(bits); // one bool corresponds to one bit in this constructor
ba[0] = 1;
ba[127] = 1;

Now, the exact representation of the array in this BitArray is as follows:
index:  | 0          | 1          |
bit no: | 64 ... 1 0 | 63 ... 1 0 |
values: | 0  ... 0 1 | 1  ... 0 0 |

You can get this array using cast:
size_t[] exact = cast(size_t[]) ba;
assert(exact[0] == 1);
assert(exact[1] == size_t(1) << 63);

Printing works in the human, i.e. logical, way:
writeln(ba); // [1, 0, ..., 0, ..., 0, 1]

BitArray has also a constructor taking a raw representation of the underlying array:
size_t val = 0b1111_0000;
auto ba2 = BitArray([val], 8);
writeln(ba2); // [0, 0, 0, 0, 1, 1, 1, 1]

It may be surprising, as it's opposite order of the binary literal. But this works as intended because it's supposed to be the inversion of casting:
assert(cast(size_t[]) ba2 == [val]);
It also works like that e.g. in Java's BitSet and C++'s bitset.

On the other hand, shifting operators are equally confusing for me, as they are for you - they really work in the other way around! I thought this is a very weird bug, but I found this pull request: https://github.com/dlang/phobos/pull/2844, which states this is the intended behaviour.

I don't know if there is anybody that would expect this - it's inconsistent with any other bitset implementation I know from other languages, as well as with what logic suggests. I'm curious if the authors changed their minds in this regard and there is any chance for that to be rewritten?

And besides, it's pretty bugged, indeed.


February 03, 2018
On Friday, 2 February 2018 at 00:33:03 UTC, Jakub Łabaj wrote:
> On the other hand, shifting operators are equally confusing for me, as they are for you - they really work in the other way around! I thought this is a very weird bug, but I found this pull request: https://github.com/dlang/phobos/pull/2844, which states this is the intended behaviour.
>
> I don't know if there is anybody that would expect this - it's inconsistent with any other bitset implementation I know from other languages, as well as with what logic suggests. I'm curious if the authors changed their minds in this regard and there is any chance for that to be rewritten?

The same confusion existed amongst the authos, that pull was done in reaction to https://github.com/dlang/phobos/pull/2797#discussion-diff-22456052.

So the shift direction seems to be as intended (by Martin Nowak at least) but the docs should be clear in what order the bits are printed and indexed. I guess.