H. S. Teoh
Posted in reply to Liam McGillivray
| On Sat, Mar 16, 2024 at 09:16:51PM +0000, Liam McGillivray via Digitalmars-d-learn wrote:
> On Friday, 15 March 2024 at 00:21:42 UTC, H. S. Teoh wrote:
[...]
> > When dealing with units of data smaller than a byte, you generally need to do it manually, because memory is not addressable by individual bits, making it difficult to implement things like slicing an array of bool.
[...]
> I'm curious as to what "manual implementation" would mean, since clearly making my own struct with `bool[3]` doesn't count. Does D have features for precise memory manipulation?
Manual implementation as in you would deal with the machine representation in terms of bytes, or more likely, uints (on modern CPUs even though bytes are individually addressible, the hardware actually works in terms of a larger unit, typically an 4-byte 32-bit unit, or an 8-byte 64-bit unit), using bitwise operators to manipulate the bits the way you want to.
> Anyway, I'm surprised that D has a special operator `&=` for doing bit manipulation on integers, especially given that the steps to convert an int into a bool array is more complicated. I would imagine the former would be a rather niche thing.
You should understand that bitwise operators are directly implemented in hardware, and thus operators like &, |, ^, <<, >>, ~, etc., typically map directly to individual CPU instructions. As such, they are very fast, and preferred when you're doing bit-level manipulations. At this level, you typically do not work with individual bits per se, but with machine words (typically 32-bit or 64-bit units). Bitwise operators operate on all 32 or 64 bits at once, so performance-aware code typically manipulates all these bits simultaneously rather than individually. Of course, using suitable bit-masking you *can* address individual bits, but the hardware instructions themselves typically work with all 32/64 bits at once.
Here's a simple example. Suppose you have 3 bits you want to store. Since the machine doesn't have a 3-bit built-in type, you typically just use the next larger available size, either a ubyte (8 bits) if you want compact storage, or if compactness isn't a big issue just a uint (32 bits, you just ignore the other 29 bits that you don't need). So you'd declare the storage something like this:
uint myBits;
Bits are usually indexed from 0, so bit 0 is the first position, bit 1 is the second position, and so on. So to set the first bit to 1, you'd do:
myBits |= 0b001;
Note that at the machine level, this operator works on all 32 bits at the same time. Most of the bits remain unchanged, though, because bitwise OR does not change the original value if the operand is 0. So the overall effect is that the first bit is set.
To set the first bit to 0, there isn't a direct operator that does that, but you can take advantage of the behaviour of bitwise AND, in which any bit which is 0 in the operand will get cleared, everything else remains unchanged. So you'd do this:
myBits &= 0b110;
Now, since we don't really care about the other 29 bits, we could write this as follows instead, to make our intent clearer:
myBits &= ~0b001;
The ~ operator flips all the bits, so this is equivalent to writing:
myBits &= ~0b1111_1111_1111_1111_1111_1111_1111_1110;
Writing it with ~ also has the advantage that should we later decide to add another bit to our "bit array", we don't have to update the code; whereas if we'd used `myBits &= 0b110;` then we'd need to change it to `myBits &= 0b1110;` otherwise our new 4th bit may get unexpectedly cleared when we only wanted to clear the first bit.
Now, what if we wanted to set both the 1st and 3rd bits? In a hypothetical bit array implementation, we'd do the equivalent of:
bool[3] myBits;
myBits[0] = 1;
myBits[2] = 1;
However, in our uint approach, we can cut the number of operations by half, because the CPU is already operating on the entire 32 bits of the uint at once -- so there's no need to have two instructions to set two individual bits when we could just do it all in one:
myBits |= 0b101; // look, ma! both bits set at once!
Similarly, to clear the 1st and 3rd bits simultaneously, we simply write:
myBits &= ~0b101; // clear both bits in 1 instruction!
Of course, when we only have 3 bits to work with, the savings isn't that significant. However, if you have a larger bit array, say you need an array of 32 bits, this can speed your code up by 32x, because you're taking advantage of the fact that the hardware is already operating on all 32 bits at the same time. On 64-bit CPUs, you can speed it up by 64x because the CPU operates on all 64 bits simultaneously, so you can manipulate an entire array of 64 bits in a single instruction, which is 64x faster than if you looped over an array of bool with 64 iterations.
T
--
Without outlines, life would be pointless.
|